Domino Jobs are designed to execute and orchestrate code logic in a fully reproducible manner. Whereas Domino Workspaces are for interactive development, Jobs are for batch/headless workloads. Jobs are ideal when you need a fully contained, complete reproducibility package.
The result is a highly reproducible execution environment capable of handling a wide range of tasks, including data processing, analytics, and machine learning. Jobs can handle single, simple tasks as well as complex, multi-task workflows with intricate dependencies.
Domino offers numerous ways to run Jobs:
- Launch from the UI: Execute Jobs directly from the Jobs UI for a low-code option.
- Job scheduler: Use the Job scheduler to run Jobs on demand or at regular intervals.
- Domino CLI: Execute Jobs from the Domino CLI. This is typically used for iterative local development, for example when you use an IDE that can't run in Domino or you don't want to use Domino compute resources to launch a Job.
- Domino API: Run Jobs using the Domino API. This is often used when integrating with an external pipeline tool that launches Jobs using triggers (see the sketch after this list).
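As a sketch of the API option, the example below starts a Job with the python-domino client library. Treat the method name (`job_start`), its parameters, the project slug, and the environment variable names as assumptions for illustration; the exact interface depends on your client version and deployment.

```python
# Minimal sketch of starting a Job programmatically. Assumes the
# python-domino client; method names and parameters vary by version,
# so treat this as illustrative rather than definitive.
import os

from domino import Domino

# Hypothetical project slug; the API key and host are read from the
# environment variables Domino typically sets for authenticated users.
domino = Domino(
    "my-org/my-project",
    api_key=os.environ["DOMINO_USER_API_KEY"],
    host=os.environ["DOMINO_API_HOST"],
)

# Launch a headless Job that runs a training script in its own
# container on the Project's default Environment and hardware tier.
response = domino.job_start(command="python train.py --epochs 10")
print(response)  # the response typically includes the new Job's ID
```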
Key information about each Job is automatically captured to ensure the reproducibility and auditability of work. When you start a Job, Domino launches a new Environment for your code on the target hardware tier.
You can start multiple concurrent Jobs, and each Job gets its own container environment, so you can try multiple parameters and techniques in parallel.
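For example, a simple parameter sweep might launch one Job per hyperparameter value, as in the sketch below, which reuses the same assumed client call as the previous example.

```python
from domino import Domino

# Sketch of a simple parameter sweep: one concurrent Job per learning
# rate. The client call and project slug are the same assumptions as in
# the previous example; credentials are assumed to come from the
# standard Domino environment variables.
domino = Domino("my-org/my-project")

learning_rates = [0.001, 0.01, 0.1]

for lr in learning_rates:
    # Each call starts an independent Job in its own container, so the
    # runs execute in parallel without interfering with one another.
    domino.job_start(command=f"python train.py --lr {lr}")
```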
Monitor Jobs to get a complete picture of your work at every level, from individual Job details to the broader view of how metrics evolve. Job monitoring supports reproducibility and auditability by capturing extensive Job context, including storage, logs, hardware specifications, environment details, inputs, outputs, metadata, and custom metrics.
Quickly view your results directly in the Domino UI, or customize how results are delivered, including notifications, reports, and custom emails, to display them elsewhere.
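If you prefer to pull this context programmatically rather than through the UI, a hedged sketch using the same assumed client is shown below; verify the method and field names against the python-domino documentation before relying on them.

```python
from domino import Domino

# Sketch of programmatic monitoring: launch a Job, then check its status.
# `job_start` and `job_status` are assumed client method names, and the
# response fields are illustrative; confirm both against the
# python-domino documentation for your version.
domino = Domino("my-org/my-project")

response = domino.job_start(command="python train.py --epochs 10")
job_id = response.get("id")  # assumed field name for the new Job's ID

status = domino.job_status(job_id)  # current state plus captured metadata
print(status)
```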
Depending on the specific needs and nature of your work, you may want to use Workspaces instead of Jobs.
It’s best to use a Job under the following circumstances:
- Model training & large-scale computations: If you need to train models on a large dataset or run long-running, computationally intensive tasks.
- Batch processing: If you have tasks you want to run in the background and/or in parallel.
- Reproducibility: If strict reproducibility is important, Jobs guarantee that the same execution Environment can be re-created.
- Automation: If the task needs to run on a regular schedule or is part of a retraining or automated workflow.
It’s best to use a Workspace under the following circumstances:
- Exploratory data analysis (EDA): If you are in the initial stages of a data science Project and need to explore and analyze your data interactively.
- Rapid model iterations: If you are iterating quickly on model development, hyperparameter tuning, or feature engineering and need immediate feedback with the ability to make instant adjustments.
- Code development & debugging: If you need an Environment to write, test, and modify code in real time and see results immediately.