Domino understands that reproducibility is vital to scientific research and data science development. To ensure reproducibility, Domino captures the context of a Job (storage logs, hardware specifications, environment details, and metadata) as well as the inputs and outputs (code, input parameters, data source details, local files, charts, tables, reports, and models).
Domino tracks the context, inputs, and outputs of every Job, so you know everything that went into it and can audit and reproduce the results.
Git is used to help ensure the reproducibility of Job code and associated files.
In a Domino File System project, all Job input and output files, including code files, are automatically versioned as part of a new Git commit that includes the unique run ID in the commit description. Assets can be compared across different Jobs in Domino.
Files are not automatically saved when a Job is run from a Git-based project. In most cases, the code that the Job executes does not change, and adding a commit to Git would clutter the external Git repository. However, there are scenarios where output files or code need to be saved. To save files from a Job in a Git-based project:
Use the artifacts folder. Files written to the artifacts folder (/mnt/artifacts/) are automatically synced and saved to the Domino File System.
Sync or push files to Git programmatically from the code that the Job executes.
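Both options can be sketched in a few lines of Python. This is a minimal illustration, not Domino's own tooling: the helper names (save_artifact, git_push_commands, push_files) are hypothetical, the file name and metric value in the usage line are placeholders, and the Git part assumes the Job's working copy already has credentials and a remote configured for pushing.

```python
import json
import subprocess
from pathlib import Path
from typing import Dict, List, Optional

# Path taken from the text above; override it when running outside a Job.
DEFAULT_ARTIFACTS_DIR = "/mnt/artifacts"


def save_artifact(name: str, payload: Dict, artifacts_dir: Optional[str] = None) -> Path:
    """Write a JSON file into the artifacts folder and return its path."""
    root = Path(artifacts_dir or DEFAULT_ARTIFACTS_DIR)
    target = root / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(payload, indent=2, sort_keys=True))
    return target


def git_push_commands(paths: List[str], message: str) -> List[List[str]]:
    """Build the Git commands that stage, commit, and push the given files."""
    return [
        ["git", "add", "--", *paths],
        ["git", "commit", "-m", message],
        ["git", "push"],
    ]


def push_files(paths: List[str], message: str, repo_dir: str) -> None:
    """Run the Git commands in repo_dir; raises CalledProcessError on failure."""
    for cmd in git_push_commands(paths, message):
        subprocess.run(cmd, cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Placeholder payload; inside a Job this writes to /mnt/artifacts/.
    print(save_artifact("metrics.json", {"accuracy": 0.91}))
```

Writing to the artifacts folder keeps the external repository free of per-Job commits, which is the concern raised above. The Git route fits cases where the output belongs in the repository's history.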
Using MLflow to log experiment metrics and other key metadata is a best practice in Domino.
Domino implements MLflow and hardens the experience by layering role-based access control (RBAC) on top to limit access to sensitive information.
For more information, see Track and monitor experiments.
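As a rough sketch of what logging from Job code might look like, the example below uses the standard MLflow Python API (mlflow.start_run, mlflow.log_params, mlflow.log_metrics). It assumes the mlflow package is installed in the Job's environment and that the tracking server is already configured for the Job. The helper names numeric_metrics and log_job_run are hypothetical.

```python
from typing import Dict, Mapping, Optional


def numeric_metrics(raw: Mapping[str, object]) -> Dict[str, float]:
    """Keep only numeric values, since MLflow metrics must be numbers."""
    return {
        key: float(value)
        for key, value in raw.items()
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }


def log_job_run(
    params: Mapping[str, object],
    raw_metrics: Mapping[str, object],
    run_name: Optional[str] = None,
) -> str:
    """Log parameters and metrics to one MLflow run and return its run ID."""
    # Imported here so the module still loads where MLflow is not installed.
    import mlflow

    with mlflow.start_run(run_name=run_name) as run:
        mlflow.log_params(dict(params))
        mlflow.log_metrics(numeric_metrics(raw_metrics))
        return run.info.run_id
```

A Job script would call log_job_run once, after training or evaluation, with its input parameters and the resulting metrics.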
The Jobs dashboard provides a way to view all Jobs in a given project. It also displays key metrics and has options to customize views. Think of it as your collaboration and reproducibility hub for Domino Jobs.
Domino stores versioned copies of key Job components to ensure reproducibility. Job details let you inspect the actual assets that Domino stored for each Job. To learn more, see View Job details.
Domino also provides a way to capture and visualize Job metrics. While it is a best practice to use Domino’s implementation of MLflow for this task, capturing these metrics in Domino remains a popular option. Users can log metrics in both places if they prefer. To learn more, see Customize Jobs dashboard.
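For the Domino-side option, one possible sketch is shown below. It assumes, and this should be confirmed in Customize Jobs dashboard, that Domino reads Job metrics from a JSON file named dominostats.json written to the project root during the Job. The helper name write_job_stats and the project_root parameter are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict, Union

# Assumed file name; confirm it against the Domino documentation.
STATS_FILENAME = "dominostats.json"


def write_job_stats(
    stats: Dict[str, Union[int, float, str]],
    project_root: str = ".",
) -> Path:
    """Write Job metrics as a flat JSON object and return the file path."""
    target = Path(project_root) / STATS_FILENAME
    target.write_text(json.dumps(stats))
    return target
```

The same dictionary of metrics could be passed to both this helper and an MLflow logging call, which matches the note above that users can log metrics in both places.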