
Git-based Projects

Note

Domino supports Git-based projects. Git-based projects:

  • Ensure common Git workflows are available natively in your workspace.

  • Provide an easy way to engage in version-controlled, code-based collaboration with fellow project team members within Domino.

  • Organize your project's assets as code, data, or artifacts to support common data science workflows.

Consider Git-based projects if you:

  • Have a beginner’s proficiency with Git.

  • Have experience with hosted version control systems like GitHub and Bitbucket.

  • Collaborate with several data scientists on projects.

To learn more about common workflows using Git and branching, see Git Feature Branch Workflow.

If you are not familiar with Git or hosted version control systems, or you want to reproduce data science research, then consider a Domino File System (DFS) project.

Note

Before you use a private Git repository to store your code, you must add the corresponding Git credentials to your Domino account settings. After you add the credentials you can create a Git-based project in Domino.

If you use a public Git repository to store your code, you do not have to add your Git credentials.

Code, Data, and Artifacts

In Domino, the Domino File System (DFS) is the traditional way to store project assets. DFS-based projects organize all your project’s assets as either data or files. Git-based projects organize your project’s assets as code, data, or artifacts.

Code

This section organizes and lists all the Git-based repositories used to store your project code and additional imported repositories. For more information, see Git repositories in Domino. Use common Git workflows to access files in these repositories.

Common Git workflows, like committing, pushing, and pulling, are available when you interact with code in a Domino workspace. For more information, see Git in your Workspace.

The default working directory for your code is /mnt/code.

You can select any branch and the last 10 commits. You can browse the folders of linked Git repositories natively from the Code section on the Project page.

Data

Similar to DFS-based projects, this section organizes and lists all data sources used in your project, including Domino datasets, external data volumes, and dataset scratch spaces. For more information about how to use data with your project, see Datasets Overview.

Artifacts

Artifacts are results or products from your research and analysis like plots, charts, and serialized models. Organize these artifacts in this section and import artifacts from other projects.
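
As a minimal sketch, a script could write a serialized model into the artifacts directory as shown below. The default path /mnt/artifacts is taken from the directory structure for Git-based projects; the save_artifact helper, its artifacts_dir parameter, and the file name are hypothetical and exist only for illustration.

```python
import pickle
from pathlib import Path


def save_artifact(obj, name, artifacts_dir="/mnt/artifacts"):
    """Pickle `obj` into the artifacts directory and return the file path.

    `artifacts_dir` defaults to the artifacts mount of a Git-based project;
    pass another directory to try the helper outside a workspace.
    """
    target_dir = Path(artifacts_dir)
    target_dir.mkdir(parents=True, exist_ok=True)  # no-op if it already exists
    target = target_dir / name
    with target.open("wb") as fh:
        pickle.dump(obj, fh)
    return target
```

In a workspace, calling save_artifact(model, "model.pkl") would write /mnt/artifacts/model.pkl. The file still has to be synced to Domino before it appears in the Artifacts section of the project.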

Directory structure

Git-based projects use a different directory structure in workspaces than DFS-based projects. The following shows the directory structure.

The default working directory for your code is /mnt/code.

/mnt
│
├── /code   # Git repository and default working directory.
│
├── /data
│   │   # Project Datasets
│   ├── /{dataset-name}   # Latest version of dataset.
│
│    # Project Artifacts
├── /artifacts
│
│    # External mounted volumes
├── /{external-volume-name}
│
└── /imported
    │   # Imported Git Repos
    ├── /code
    │   └── /{imported-repo-name}
    │
    ├── /data
    │   │   # Mounted Shared Datasets
    │   └── /{shared-dataset-name} # Contains contents of latest snapshot unless otherwise specified by yaml.
    │
    │    # Imported Project Artifacts
    └── /artifacts
        └── /{imported-project-name}
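
The layout above can be captured in a few path helpers, which keeps hard-coded strings out of analysis code. This is only a sketch: the directory names come from the tree above, while the function names are hypothetical.

```python
from pathlib import PurePosixPath

# Root of the workspace layout shown in the tree above.
MNT = PurePosixPath("/mnt")


def code_dir():
    """Main Git repository and default working directory."""
    return MNT / "code"


def dataset_dir(dataset_name):
    """Latest version of a project dataset."""
    return MNT / "data" / dataset_name


def artifacts_dir():
    """Project artifacts."""
    return MNT / "artifacts"


def imported_repo_dir(repo_name):
    """An imported Git repository."""
    return MNT / "imported" / "code" / repo_name


def shared_dataset_dir(dataset_name):
    """A mounted shared dataset from another project."""
    return MNT / "imported" / "data" / dataset_name


def imported_artifacts_dir(project_name):
    """Artifacts imported from another project."""
    return MNT / "imported" / "artifacts" / project_name
```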

Create a Git-based project:

  1. In the navigation pane, click Projects and then click New Project.

  2. In the Create New Project window, enter a name for your project.

  3. Set your project’s Visibility.

  4. Click Next.

  5. Under Hosted By, click Git Service Provider. More fields then appear beneath the Hosted By field.

  6. Under Git Service Provider, select the provider currently hosting the repository you want to import. This is the target repository.

  7. Under Git Credentials, select credentials authorized to access the target repository.

  8. Under Repository, select a repository or enter a Git URL. If you are using PAT credentials with GitHub or GitLab, you can create your own repository.

  9. Click Create.

    Important

During the project creation process, you can create a new repository on GitHub or GitLab. (These are the only Git providers for which Domino currently supports creating a repository.)

Create a new repository

  1. Click Create new repository under Repository.

  2. Select the Owner/Organization associated with the repository, its Visibility, and specify the name for the new repository.

Develop models in a workspace

A Domino workspace is an interactive session where you can conduct research, analyze data, train models, and more. Workspaces let you work in the development environment of your choice, like Jupyter notebooks, RStudio, VS Code, and many other customizable environments.

Switch branches in your workspace

Branches allow you to develop features, fix bugs, or safely experiment with new ideas in a contained area of your repository. You can switch branches inside your workspace for both the main code repository and any additional imported repositories. The drop-down lists a maximum of 10 branches: local branches in alphabetical order, followed by remote branches. If your repository has more than 10 branches, type to search for additional branches.
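
The listing rule can be illustrated with a short function. This is a sketch of the described behavior, not Domino's implementation: the function name is hypothetical, and both the alphabetical ordering of remote branches and the substring-based search are assumptions.

```python
def branches_for_dropdown(local, remote, query="", limit=10):
    """Return at most `limit` branch names for display.

    Local branches come first in alphabetical order, followed by remote
    branches (also sorted alphabetically here, which is an assumption).
    `query` mimics type-to-search with a simple substring match.
    """
    def matching(names):
        return sorted(name for name in names if query in name)

    return (matching(local) + matching(remote))[:limit]
```

For example, branches_for_dropdown(["main", "dev"], ["origin/dev"], query="dev") returns ["dev", "origin/dev"].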

Resolve merge conflicts

Merge conflicts occur when competing changes are made to the same line of a file, or when one person edits a file and another person deletes the same file. You can resolve merge conflicts inside your workspace through a guided UI for both the main code repository and any additional imported repositories.

Sync changes

When you sync changes from your workspace to a remote Git repository, Domino first fetches the latest content from the remote branch (git fetch), then commits your local changes on top of the updated branch (git rebase), and finally tries to push the commits to the remote. When files are in conflict, you can choose either Resolve manually or Force my changes. Force my changes overwrites remote files with the changes in your workspace, so the commit history on the remote will match the commit history in your workspace.
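
For readers who want to map the sync onto plain Git, the sequence corresponds roughly to the commands built below. This is a sketch under assumptions, not Domino's implementation: the function names, the remote name origin, and the exact command arguments are illustrative.

```python
import subprocess


def sync_commands(branch, remote="origin"):
    """Return the Git commands that roughly correspond to a sync."""
    return [
        ["git", "fetch", remote, branch],         # get the latest remote content
        ["git", "rebase", f"{remote}/{branch}"],  # replay local commits on top of it
        ["git", "push", remote, branch],          # publish the result
    ]


def run_sync(repo_dir, branch, remote="origin"):
    """Run the commands in order inside `repo_dir`.

    check=True raises CalledProcessError on the first failure, so a rebase
    that stops on a conflict is never followed by a push.
    """
    for command in sync_commands(branch, remote):
        subprocess.run(command, cwd=repo_dir, check=True)
```

Stopping at the first failed step mirrors the point at which the workspace asks you to resolve conflicts before the sync continues.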

Pull changes

When you pull the latest changes from the remote into your workspace, Domino first fetches the latest content from the remote branch (git fetch), then applies your changes on top of the updated branch (git rebase). When files are in conflict, you can choose either Resolve manually or Use remote changes. Use remote changes discards the changes in your workspace and overwrites the files in your workspace with the remote changes.

Resolve manually

When you select Resolve manually, you resolve conflicts file by file. For each file in conflict, you can choose Mark as resolved, Use my changes, or Use origin repo changes:

  • Mark as resolved assumes that you’ve edited the files to resolve conflict markers. The latest change of the file will appear under Uncommitted changes and be pushed to remote when you continue sync.

  • Use my changes will overwrite remote files with changes in your workspace. The latest change of the file will appear under Uncommitted changes and be pushed to remote when you continue sync.

  • Use origin repo changes will discard changes in your workspace and overwrite the file with remote changes. The file won’t appear under Uncommitted changes because there is no change to commit. However, you still need to click Continue sync to complete the conflict resolution.
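
Because Mark as resolved assumes the conflict markers are already gone, it can help to check a file before you mark it. The following sketch scans text for the standard Git markers (<<<<<<<, =======, >>>>>>>); the function name is hypothetical, and the check is a heuristic, since a line consisting only of ======= can also be legitimate content in some file types.

```python
def unresolved_conflict_lines(text):
    """Return the 1-based numbers of lines that look like Git conflict markers."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        is_edge = line.startswith(("<<<<<<<", ">>>>>>>"))  # start or end of a conflict
        is_separator = line.rstrip() == "======="          # divider between the two sides
        if is_edge or is_separator:
            hits.append(number)
    return hits
```

An empty result suggests the file no longer contains markers; any returned line numbers point to places that still need editing.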

Multiple commits

When you push multiple commits, a conflict can occur at any commit, and the UI guides you through resolving conflicts commit by commit. Keep in mind that the latest commit might overwrite changes you made during conflict resolution.

Work with artifacts in your workspace

Important
All files in Artifacts are saved exclusively to the Domino File System (DFS). If you do not want to save a particular asset to the Domino File System, we recommend that you do not save it as an artifact. To learn more, see Syncing your work to Domino.

Artifacts are results from your research, like plots, charts, serialized models, and more. In Domino, you can save these results in the Artifacts section of your project.

Saving artifacts and pushing changes

  1. Click File Changes in the navigation pane of your workspace.

  2. Under Artifacts, view changes by expanding File Changes.

  3. Enter a commit message.

  4. Click Sync to Domino. Domino will save your artifacts to the Domino File System (DFS).

Pull changes

Pull artifacts into your workspace

  1. Click the File Changes option in the sidebar menu of your workspace.

  2. In the Artifacts section, click Pull. Domino pulls the latest changes into your workspace.

Run jobs

Warning
If you run a job in a Git-based project, Domino only synchronizes and saves artifacts to the Domino File System (DFS). In Git-based projects, you must manually sync code or push it to the Git repository. This is intentional and supports the code, data, and artifacts workflow. To learn more, see running jobs.
Copyright © 2022 Domino Data Lab. All rights reserved.