We recommend using Git-based projects in Domino if you have at least a beginner’s proficiency with Git, have experience with hosted version control (GitHub, Bitbucket, etc.), and frequently collaborate with other data scientists on projects. To learn more about common workflows using Git and branching, see Git Feature Branch Workflow.
If you’re not familiar with Git or hosted version control, or if you’re primarily looking to reproduce data science research, then a project that uses the Domino File System (DFS) may be a better fit for you.
Git-based projects are a beta feature. If you’d like to provide feedback on Git-based projects, please file a ticket.
Git-based projects provide a full Git experience for your code by using Git and a Git service provider of your choice. Common Git workflows, such as committing and pushing changes, are available natively within workspaces launched in Git-based projects. This makes it easy to engage in version-controlled, code-based collaboration with fellow project team members, all from within Domino. Git-based projects also organize your project’s assets as either Code, Data, or Artifacts, a structure intended to support common data science workflows.
Create a Git-based project
If you plan on using a private Git repository to store your code, add the corresponding Git credentials in your Domino account settings before creating your project. Once the credentials are added, you can create a Git-based project in Domino.
If you plan on using a public Git repository to store your code, then you won’t need to add any Git credentials.
To create a Git-based project:
- Click on Projects in the Domino sidebar menu and then click the New Project button. A project creation modal will appear.
- Enter a name for your project.
- Set your project’s visibility.
- Under “Code Repository”, select “Git (beta)”.
- Under “Git Repository”, enter the URL of the Git repository that you’ll be using for your project.
If the repository you’re using to store your code contains one or more files exceeding 2 GB in size, then Domino will still create your Git-based project, but the repository will not be used for it. Please note that your Git service provider may also impose size limits on individual files.
To check the total size of a Git repository, as well as the size of individual files within it, you can use the git-sizer tool.
- Under “Git Credentials”, select the credentials associated with the Git repository your project will use. If the repository is public, then you won’t need to enter any credentials.
- Under “Git Service Provider”, select your Git service provider.
- Click Create Project.
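Before you enter a repository URL, it can help to check a local clone for files above the 2 GB limit mentioned in the steps above. The following is a minimal sketch, not a Domino tool: the function name `find_large_files` is hypothetical, it assumes 1 GB means 1024³ bytes, and it only inspects files currently checked out in the working tree. It does not examine Git history, which git-sizer does.

```python
import os
import sys

# 2 GB per-file limit from the documentation (assumption: 1 GB = 1024**3 bytes).
SIZE_LIMIT_BYTES = 2 * 1024**3


def find_large_files(repo_path, limit_bytes=SIZE_LIMIT_BYTES):
    """Return sorted (relative_path, size_in_bytes) pairs for files above limit_bytes."""
    oversized = []
    for root, dirs, files in os.walk(repo_path):
        # Skip Git's internal directory; only working-tree files are checked.
        if ".git" in dirs:
            dirs.remove(".git")
        for name in files:
            full_path = os.path.join(root, name)
            if os.path.islink(full_path):
                continue  # ignore symlinks, which may point outside the repository
            size = os.path.getsize(full_path)
            if size > limit_bytes:
                oversized.append((os.path.relpath(full_path, repo_path), size))
    return sorted(oversized)


if __name__ == "__main__":
    # Usage: python check_sizes.py /path/to/local/clone
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, size in find_large_files(target):
        print(f"{size / 1024**3:.2f} GB  {path}")
```

If the script prints nothing, no checked-out file exceeds the limit. Files that exist only in earlier commits are not covered by this check.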
Code, Data, & Artifacts
In Domino, the Domino File System (DFS) is the traditional way of storing a project’s assets. DFS-based projects organize all of your project’s assets as either Data or Files. Git-based projects, however, organize your project’s assets as either Code, Data, or Artifacts.
Code – This section of your Git-based project lists the Git repository used to store your project’s code, as well as any additional imported repositories. Files within any of these repositories can be accessed from within a Domino workspace. Common Git workflows, such as committing, pushing, and pulling, are available when you interact with your code from within a Domino workspace. The default working directory for your code is /mnt/code. For more information, see Using Git in your workspace.
Data – Similar to DFS projects, this section of your Git-based project organizes and lists all data sources used in your project, including Domino datasets, external data volumes, and dataset scratch spaces. For more information on how to use data with your project, please refer to the Domino datasets documentation.
Artifacts – Git-based projects introduce “Artifacts”. Artifacts are typically results or products from your research and analysis, like plots, charts, serialized models, and more. You can organize these outputs in this section, as well as import artifacts from other projects.
If you run a Job in a Git-based project, only artifacts will be automatically synced and saved to the Domino File System (DFS). Code, on the other hand, will not be automatically synced or pushed to the Git repository used by the project. This is intentional and supports the “Code”, “Data”, and “Artifacts” workflow. To learn more, see running jobs.
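Because code changes are not pushed automatically, they need to be committed and pushed with Git. The sketch below shows one way this might be scripted with standard Git commands run from /mnt/code, the default working directory named above. The helper names `build_push_commands` and `commit_and_push`, the remote name `origin`, and the branch name `main` are assumptions for illustration. They are not part of Domino.

```python
import subprocess

# Default working directory for code in a Git-based project, per the text above.
CODE_DIR = "/mnt/code"


def build_push_commands(message, remote="origin", branch="main"):
    """Return the git commands that stage, commit, and push all changes."""
    return [
        ["git", "add", "--all"],
        ["git", "commit", "-m", message],
        ["git", "push", remote, branch],
    ]


def commit_and_push(message, repo_dir=CODE_DIR, remote="origin", branch="main"):
    """Run each command inside repo_dir, stopping at the first failure."""
    for command in build_push_commands(message, remote, branch):
        # check=True raises CalledProcessError if git exits with a non-zero status,
        # which includes the case where there is nothing to commit.
        subprocess.run(command, cwd=repo_dir, check=True)
```

For example, `commit_and_push("Update training script", branch="my-feature-branch")` would stage all changes, commit them, and push to that branch, assuming the workspace has credentials with write access to the repository.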
Working with artifacts in your workspace
All files in Artifacts are saved exclusively to the Domino File System (DFS). If you do not want to save a particular asset to the Domino File System, we recommend that you do not save it as an artifact. To learn more, see Syncing your work to Domino.
Artifacts are results from your research, like plots, charts, serialized models, and more. In Domino, you can save these results in the Artifacts section of your project.
Saving artifacts and pushing changes
- Click on the File Changes option in the sidebar of your workspace.
- Under “Artifacts”, view changes by expanding “File Changes”.
- Enter a commit message.
- Click Sync to Domino. Domino will save your artifacts to the Domino File System (DFS).
To pull the latest artifacts (from the Domino File System) into your workspace:
- Click the File Changes option in the sidebar menu of your workspace.
- Under the “Artifacts” section, click Pull. Domino will pull the latest changes into your workspace.