As data science organizations grow, collaboration becomes critical for speeding up research and development. Many of the highest-performing data science organizations cite collaboration as part of their “secret sauce” that allows them to rapidly innovate and deliver impact from data science work.
“Collaboration” is a vague term that means different things to different people. Domino enables collaboration in several ways:
-
Collaborative development: Domino makes it easy for multiple people to work on the same project together.
-
Communications and shared context: Domino helps collaborators stay informed about what each other is doing, making teamwork more streamlined and creative.
-
Make work accessible: Domino makes it easy to share work and provides tools to make information more accessible to less technical stakeholders.
-
Knowledge management: Domino makes it easy to find and reuse work.
-
Use Domino search: Domino provides a search feature to easily locate specific files or pieces of text across your entire deployment.
-
Access controls and collaboration: Different entities in Domino can be permissioned separately to control who is allowed to find and use specific entities.
-
Manage organizations: Domino makes it easy to give many users permission to your project at once and to add/remove collaborators from multiple projects.
A Project is Domino’s fundamental unit of collaboration. A Project is a bundle of all the materials and context that collaborators use when working on a project together. This includes the code, files (e.g., results or artifacts), data, software packages, and configuration — as well as the historical discussion and record of activity within the Project.
Projects make collaboration easy through two simple but powerful characteristics:
Single source of truth for materials across collaborators. All the materials you use in data science projects — the code, compute environments, data, and hardware — are centralized and thus identical regardless of who is using them. As a result, Domino solves one of the common problems of collaboration: different team members have different versions of code, software, or data out of sync with each other. Gone are the problems of passing files back and forth between computers or checking versions of packages across different dev environments.
Automatic tracking of work avoids “writing over” someone else’s work. Domino’s Reproducibility Engine ensures that work cannot be lost or permanently overwritten. Moreover, git-backed projects make it easy for collaborators to work on multiple branches concurrently, for easier organization of changes.
These modes of collaboration drive several benefits, which often differentiate the highest-performing data science organizations.
-
Innovation and breakthroughs. Much data science work is like scientific research in that exploratory ideas may be paused and put on the shelf. Being able to find past work about a topic, and being able to “page in” the context of past work, can accelerate the process of sparking new ideas. One data science leader we work with told us that all of the most impactful breakthroughs on his team came from someone picking up old work and looking at it in a new way.
-
Productivity. When data scientists spend less time on mechanics of collaboration — coordinating file changes with each other, and browsing through multiple systems for prior work — they have more time for higher-value work like research and model development.
-
Less duplicate work and analytics “debt”. A common problem in large organizations is the proliferation of duplicate work, e.g., ten different teams might make a model to do the same thing. This duplication of work becomes a drag on continuous improvement, as different versions diverge in subtle ways, and improvements to one of them aren’t automatically propagated to others. By making it easier to reuse past work, Domino helps simplify and minimize your surface area of data science work, increasing maintainability and the speed of continuous improvement.
-
More flexible talent strategies. As a centralized platform (accessible through the browser) that doesn’t require any work on local desktops or laptops, Domino makes remote teams possible, unlocking access to talent in more locations. It also makes work from home much easier. Offering these options can help attract top data scientists, who are in high demand and often have their own choice of employers.
-
More flexible project staffing models. By making it easy to have multiple people working on a Project together, Domino enables teams to decompose work into different components that can be developed in parallel, to complete projects faster. For example, within a Project, some people could be working on feature engineering tasks while others could be working on visualization and reporting outputs, while others could be working on data prep. Domino provides a single interface to share and integrate multiple parts of a project.