Domino Flows enables efficient orchestration and monitoring of complex, interconnected multi-step processes while ensuring full lineage and reliable reproducibility. Each step, implemented as a Domino Job, is a task; the complete structure of connections between tasks is a workflow. Tasks produce outputs that become inputs to other tasks, forming the basis for the connections. A Flow definition therefore constructs a DAG (directed acyclic graph).
Note
To support reproducibility, each task must be side-effect free: it reads versioned inputs and writes defined outputs.
Flows is flexible enough to declaratively model arbitrarily complex processes. Dependency relationships between tasks determine the order in which they run and whether they can be parallelized. Scenarios spanning machine learning, data engineering, and data analytics benefit from this level of control and reproducibility.
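The way a DAG determines execution order and parallelism can be sketched with Python's standard-library `graphlib`. This is a conceptual illustration only, not the Flows API, and the task names are hypothetical:

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks whose
# outputs it consumes (its dependencies).
dag = {
    "load_images": set(),
    "load_labels": set(),
    "preprocess": {"load_images", "load_labels"},
    "train": {"preprocess"},
}

ts = TopologicalSorter(dag)
ts.prepare()

# Tasks returned together by get_ready() have no unmet dependencies,
# so a scheduler could run each batch in parallel.
order = []
while ts.is_active():
    ready = sorted(ts.get_ready())
    order.append(ready)
    for task in ready:
        ts.done(task)

print(order)  # → [['load_images', 'load_labels'], ['preprocess'], ['train']]
```

Here `load_images` and `load_labels` have no dependency relationship, so they appear in the same batch and could run in parallel, while `preprocess` must wait for both.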
For instance, Flows would be an ideal choice for scenarios like:
- Executing a data processing workflow in Dask prior to a training workflow in XGBoost
- Running a clinical study pipeline by loading SDTM datasets to produce ADaM datasets and TFL reports
- Collecting image metadata from S3 with Spark and performing model inference with PyTorch
- Loading financial data from Snowflake and processing it for use in a Ray training job that registers a model in MLflow
- Processing a local protein database to search for a nucleotide sequence and generating a scatterplot
Flows may not be the most appropriate choice for modeling a process that accesses a single dataset and performs many small computations in a homogeneous environment. Tasks that write to mutable shared state (such as read-write datasets) are not compatible with Flows as-is, but can be made compatible by modifying them to write defined outputs instead.
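For example, a step that appends results to shared mutable state can be restructured to return its result as a declared output instead. A minimal, Flows-agnostic sketch of the pattern (all function names are hypothetical):

```python
# Not Flows-compatible: mutates shared state that other tasks may also touch.
shared_results = []

def score_batch_mutating(batch):
    shared_results.append(sum(batch))  # hidden side effect

# Flows-compatible pattern: read declared inputs, return declared outputs.
def score_batch(batch: list[int]) -> int:
    return sum(batch)  # the result becomes a tracked task output

# A downstream task consumes the outputs explicitly instead of reading
# mutable shared state.
def total(scores: list[int]) -> int:
    return sum(scores)

print(total([score_batch([1, 2]), score_batch([3, 4])]))  # → 10
```

Because `score_batch` depends only on its inputs and produces only its return value, rerunning it with the same inputs always yields the same output, which is what makes lineage and reproducibility tractable.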
Flows extends the Domino Job system with key new functionality, including:
- Programmatic, Python-based authoring of versioned, reusable, repeatable, immutable workflows
- Strongly typed definitions of the inputs consumed and outputs produced by each task
- Automatic lineage and versioning of all task and workflow inputs and outputs
- Heterogeneous, isolated environment support for any task
- Stronger reproducibility requirements and guarantees
- Visualization of the workflow execution graph and the ability to inspect and monitor each task, its inputs, and its outputs
- Parallel execution of tasks at scale
- Configurable caching and task result reuse anywhere within the workflow
- Flow Artifacts for discovery, inspection, and reuse of specially annotated outputs within a project
- Automatic recovery from intermittent failures and manual recovery of partial executions
Read more about the differences between Flows-generated tasks that run Domino Jobs and standalone Domino Jobs.
Flows is built on the open-source framework Flyte.
Some key terms to understand before getting started with Flows include:
Term | Definition
---|---
Task | Tasks are the core building blocks within a flow and are isolated within their own containers during execution. A task maps to a single Domino Job.
Flow | A flow is a composition of multiple tasks or other flows (called subflows). Flows can be triggered through a single command and are tracked as a single, fully reproducible entity.
Node | A node represents a unit of execution or work within a flow (nodes show up as individual blocks in the graph views). A node can contain either a single task or a whole flow (called a subflow).
Task inputs | Task inputs are strongly typed parameters that can be defined on individual tasks. Inputs allow tasks to be rerun with different settings through the UI, without modifying the code itself. Inputs can be read and used within executions.
Task outputs | Task outputs are strongly typed parameters that define the results produced by a task. Outputs are tracked and stored in discrete blob storage so that they can be used as inputs to other tasks.
Flow inputs/outputs | Flow inputs/outputs are similar to task inputs/outputs but are defined at the flow level. Inputs defined for a flow can be passed into relevant tasks, and outputs from tasks can be returned as the overall output of the flow.
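The terms above fit together as follows. Real Flows code is authored with Flyte-style decorators; the sketch below uses plain Python with type hints only to illustrate the pattern: strongly typed task inputs and outputs, composed into a flow whose own input is passed into a task and whose output is a task's result (all names are hypothetical, not the Flows API):

```python
# Each "task" declares typed inputs and a typed output.
def normalize(values: list[float], scale: float) -> list[float]:
    return [v / scale for v in values]

def mean(values: list[float]) -> float:
    return sum(values) / len(values)

# The "flow" wires one task's output to the next task's input; the flow
# input `scale` is passed into a task, and the final task's output is
# returned as the flow's overall output.
def stats_flow(raw: list[float], scale: float) -> float:
    normalized = normalize(raw, scale)  # task 1
    return mean(normalized)             # task 2; its result is the flow output

print(stats_flow([2.0, 4.0, 6.0], scale=2.0))  # → 2.0
```

In an actual flow, `normalized` would be tracked as a versioned task output in blob storage, and the dependency between the two tasks is what places them in order in the execution graph.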