Make sure you have a good understanding of the key concepts before you get started with Domino Flows.
This section demonstrates a basic example flow that:
- Takes two integers as inputs.
- Runs a first task that adds the integers together and passes the result as an input to the second task.
- Runs a second task that takes the square root of its input and returns the result as the final output of the flow.
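For orientation, the computation that the two tasks perform can be sketched in plain Python, without Domino or Flyte. The function names below are made up for this illustration; in the actual flow, each step runs as a separate Domino Job rather than as a local function call.

```python
def add(first_value: int, second_value: int) -> int:
    """Stand-in for the first task: add the two integer inputs."""
    return first_value + second_value


def square_root(value: int) -> float:
    """Stand-in for the second task: take the square root of the sum."""
    return value ** 0.5


def simple_math(a: int, b: int) -> float:
    """Chain the two steps in the same order as the flow."""
    return square_root(add(a, b))


result = simple_math(10, 6)
print(result)
```

With the inputs 10 and 6, which are also used when the flow is run later in this section, the sum is 16 and the final result is 4.0.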
This example flow can be visualized as follows:
To create a Domino Flow:
- Create a Workspace using the Domino Standard Environment (DSE), or a custom environment that is built on top of the DSE.
- Create a file named add.py in the root directory. Add the following code to the file to add two integer inputs together:

    from pathlib import Path

    # Read inputs
    a = Path("/workflow/inputs/first_value").read_text()
    b = Path("/workflow/inputs/second_value").read_text()

    # Calculate sum
    sum = int(a) + int(b)
    print(f"The sum of {a} + {b} is {sum}")

    # Write output
    Path("/workflow/outputs/sum").write_text(str(sum))

- Create a file named sqrt.py in the root directory. Add the following code to the file to calculate the square root of the input:

    from pathlib import Path

    # Read input
    value = Path("/workflow/inputs/value").read_text()

    # Calculate square root
    sqrt = int(value) ** 0.5
    print(f"The square root of {value} is {sqrt}")

    # Write output
    Path("/workflow/outputs/sqrt").write_text(str(sqrt))

- Create a file named workflow.py in the root directory. Add the following code to the file to define the flow:

    from flytekit import workflow
    from flytekitplugins.domino.task import DominoJobConfig, DominoJobTask

    @workflow
    def simple_math_workflow(a: int, b: int) -> float:

        # Create first task
        add_task = DominoJobTask(
            name='Add numbers',
            domino_job_config=DominoJobConfig(Command="python add.py"),
            inputs={'first_value': int, 'second_value': int},
            outputs={'sum': int},
            use_latest=True
        )
        sum = add_task(first_value=a, second_value=b)

        # Create second task
        sqrt_task = DominoJobTask(
            name='Square root',
            domino_job_config=DominoJobConfig(Command="python sqrt.py"),
            inputs={'value': int},
            outputs={'sqrt': float},
            use_latest=True
        )
        sqrt = sqrt_task(value=sum)

        return sqrt

- Commit the code and run the following command in the Workspace terminal to register and run the flow:

    pyflyte run --remote workflow.py simple_math_workflow --a 10 --b 6

- After you run the command above, navigate to Flows > Flow name > Run name in the Domino UI to monitor the results and view the outputs that were produced by the execution.
- To visualize the full execution flow, click on the Graph pivot.
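The scripts in the steps above rely on a file-based contract: each task reads every named input from a file under /workflow/inputs and writes every named output to a file under /workflow/outputs. As a rough sketch of that contract, the following code replays both steps against a temporary directory so that it can run outside Domino. The functions `run_add`, `run_sqrt`, and `dry_run`, and the `root` parameter, are hypothetical stand-ins for the scripts. Copying the sum from the first task's outputs to the second task's inputs is done by hand here; in a real run the flow takes care of that wiring.

```python
import tempfile
from pathlib import Path


def run_add(root: Path) -> None:
    """Mimic add.py, with `root` standing in for /workflow."""
    a = (root / "inputs" / "first_value").read_text()
    b = (root / "inputs" / "second_value").read_text()
    (root / "outputs").mkdir(parents=True, exist_ok=True)
    (root / "outputs" / "sum").write_text(str(int(a) + int(b)))


def run_sqrt(root: Path) -> None:
    """Mimic sqrt.py, with `root` standing in for /workflow."""
    value = (root / "inputs" / "value").read_text()
    (root / "outputs").mkdir(parents=True, exist_ok=True)
    (root / "outputs" / "sqrt").write_text(str(int(value) ** 0.5))


def dry_run(a: int, b: int) -> str:
    """Run both stand-in tasks in a temporary directory and return the final output text."""
    with tempfile.TemporaryDirectory() as tmp:
        add_root = Path(tmp) / "add"
        sqrt_root = Path(tmp) / "sqrt"
        (add_root / "inputs").mkdir(parents=True)
        (sqrt_root / "inputs").mkdir(parents=True)

        # Provide the flow inputs to the first task as text files.
        (add_root / "inputs" / "first_value").write_text(str(a))
        (add_root / "inputs" / "second_value").write_text(str(b))
        run_add(add_root)

        # Hand the first task's output to the second task as its input.
        total = (add_root / "outputs" / "sum").read_text()
        (sqrt_root / "inputs" / "value").write_text(total)
        run_sqrt(sqrt_root)

        return (sqrt_root / "outputs" / "sqrt").read_text()


final_output = dry_run(10, 6)
print(final_output)
```

Because inputs and outputs travel as text, each script converts with `int(...)` on read and `str(...)` on write. Note also that `int(value) ** 0.5` evaluates to a complex number in Python when the sum is negative, so a script intended for wider use may want to validate its input first.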
Rather than beginning from scratch, you can start from a pre-built ecosystem template from Domino’s AI Hub. This section uses a template that demonstrates a basic training flow example with the following steps:
- Data is loaded from two different sources, and a snapshot of the data is taken.
- The data is merged into a single dataset.
- Basic preprocessing is done on the dataset.
- A model is trained using the cleaned dataset.
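The steps above imply a dependency graph: the two load tasks are independent of each other, while merging, preprocessing, and training each wait on the previous step. As an illustration only, the sketch below uses the standard-library `graphlib` module (Python 3.9 or later) to derive the order in which the steps could run. The names "Load Data A" and "Load Data B" come from the template's flow.py excerpt shown later in this section; the other task names are assumptions, and the template's own names may differ. Domino, not this code, does the actual scheduling.

```python
from graphlib import TopologicalSorter

# Map each task to the set of tasks it depends on.
# Only the two "Load Data" names appear in the template excerpt;
# the remaining names are placeholders for this sketch.
dependencies = {
    "Load Data A": set(),
    "Load Data B": set(),
    "Merge Data": {"Load Data A", "Load Data B"},
    "Process Data": {"Merge Data"},
    "Train Model": {"Process Data"},
}


def execution_stages(graph):
    """Group tasks into stages; tasks in the same stage have no unmet dependencies."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


stages = execution_stages(dependencies)
for number, stage in enumerate(stages, start=1):
    print(f"Stage {number}: {', '.join(stage)}")
```

The first stage contains both load tasks, which reflects that neither depends on the other; every later stage contains a single task.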
The training flow can be visualized as follows:
To create the training flow:
- Make a fork of the template GitHub repository.
- Create a Workspace using the Domino Standard Environment (DSE), or a custom environment that is built on top of the DSE.
- Inspect the flow.py file for the definition of the flow. Note how a helper method, called run_domino_job_task, is used here instead of the DominoJobConfig and DominoJobTask used in the basic example above.

    task1 = run_domino_job_task(
        flyte_task_name='Load Data A',
        command='python /mnt/code/scripts/load-data-A.py',
        hardware_tier_name='Small',
        inputs=[
            Input(name='data_path', type=str, value=data_path_a)
        ],
        output_specs=[
            Output(name='datasetA', type=FlyteFile[TypeVar('csv')])
        ],
        use_project_defaults_for_omitted=True
    )

    task2 = run_domino_job_task(
        flyte_task_name='Load Data B',
        command='python /mnt/code/scripts/load-data-B.py',
        hardware_tier_name='Small',
        inputs=[
            Input(name='data_path', type=str, value=data_path_b)
        ],
        output_specs=[
            Output(name='datasetB', type=FlyteFile[TypeVar('csv')])
        ],
        use_project_defaults_for_omitted=True
    )

    # Additional tasks

- Run the following command in the Workspace terminal to register and run the flow:

    pyflyte run --remote flow.py model_training --data_path_a /mnt/code/data/datasetA.csv --data_path_b /mnt/code/data/datasetB.csv

- Navigate to Flows > Flow name > Run name to monitor the results and view the outputs that were produced by the execution.
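When the same flow is run repeatedly with different inputs, it can be convenient to assemble the `pyflyte run` command programmatically. The helper below is a hypothetical convenience and is not part of Domino or Flytekit. It only builds the command string in the `--name value` form used by the commands in this section, and leaves running it in the Workspace terminal to you.

```python
import shlex


def build_pyflyte_command(script: str, workflow_name: str, inputs: dict) -> str:
    """Build a `pyflyte run --remote` command line from a mapping of flow inputs."""
    args = ["pyflyte", "run", "--remote", script, workflow_name]
    for name, value in inputs.items():
        # Each flow input becomes a `--name value` pair.
        args += [f"--{name}", str(value)]
    # shlex.join quotes any argument that contains shell-sensitive characters.
    return shlex.join(args)


command = build_pyflyte_command(
    "flow.py",
    "model_training",
    {
        "data_path_a": "/mnt/code/data/datasetA.csv",
        "data_path_b": "/mnt/code/data/datasetB.csv",
    },
)
print(command)
```

For the inputs shown, the resulting string matches the training flow command given in the steps above.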