Python bindings for the Domino API.
Permits interaction with a Domino deployment from Python using the Domino API.
The latest released version is 1.0.4.
At this time, these Domino Python bindings are not available on PyPI.
You can install the latest version of this package from our GitHub master branch with the following:
pip install https://github.com/dominodatalab/python-domino/archive/master.zip
If you are adding install instructions for python-domino to the Dockerfile Instructions field of your Domino Environment, you must prepend RUN:
RUN pip install https://github.com/dominodatalab/python-domino/archive/master.zip
You can also add python-domino to your requirements.txt file with the following syntax:
-f git+git://github.com/dominodatalab/python-domino.git
Note: To install a lower version of the library, for example 0.3.5, use the following command:
pip install https://github.com/dominodatalab/python-domino/archive/0.3.5.zip
class Domino(project, api_key=None, host=None, domino_token_file=None)
The parameters are:
- project: A project identifier (in the form of ownerusername/projectname).
- api_key: (Optional) An API key to authenticate with. If not provided, the library will expect to find one in the DOMINO_USER_API_KEY environment variable.
- host: (Optional) A host URL. If not provided, the library will expect to find one in the DOMINO_API_HOST environment variable.
- domino_token_file: (Optional) Path to a Domino token file containing an auth token. If not provided, the library will expect to find one in the DOMINO_TOKEN_FILE environment variable.
Note:
- If both api_key and domino_token_file are available, preference is given to domino_token_file.
- By default the log level is set to INFO. To set the log level to DEBUG, set the DOMINO_LOG_LEVEL environment variable to DEBUG.
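The authentication precedence above can be sketched as a small helper. This is a sketch of the documented behavior, not the library's actual code; the function name resolve_auth is an assumption for illustration:

```python
import os

def resolve_auth():
    """Mirror the documented precedence: when both DOMINO_TOKEN_FILE and
    DOMINO_USER_API_KEY are set, the token file wins."""
    token_file = os.environ.get("DOMINO_TOKEN_FILE")
    api_key = os.environ.get("DOMINO_USER_API_KEY")
    if token_file:
        return ("domino_token_file", token_file)
    if api_key:
        return ("api_key", api_key)
    raise RuntimeError("set DOMINO_TOKEN_FILE or DOMINO_USER_API_KEY")
```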
project_create(project_name, owner_username=None)
Create a new project with the given project name. The parameters are:
- project_name: The name of the project.
- owner_username: (Optional) The owner username for the project. This parameter is useful when the project needs to be created under an organization.
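The call shape can be shown offline with a stand-in client. FakeDomino below only mimics the documented behavior so the example runs without a deployment; with a live client the calls would be domino.project_create(...) with the same arguments, and the names used here are placeholders:

```python
class FakeDomino:
    """Stand-in for a real Domino client (illustration only)."""
    def project_create(self, project_name, owner_username=None):
        # Without owner_username, the project belongs to the caller's account.
        owner = owner_username or "my_username"
        return f"{owner}/{project_name}"

domino = FakeDomino()
personal = domino.project_create("quick-start")
org_owned = domino.project_create("team-project", owner_username="my-org")
```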
runs_start(command, isDirect, commitId, title, tier, publishApiEndpoint)
Start a new run on the selected project. The parameters are:
- command: The command to run as an array of strings, where members of the array represent arguments of the command. E.g. ["main.py", "hi mom"]
- isDirect: (Optional) Whether or not this command should be passed directly to a shell.
- commitId: (Optional) The commitId to launch from. If not provided, will launch from the latest commit.
- title: (Optional) A title for the run.
- tier: (Optional) The hardware tier to use for the run. This is the human-readable name of the hardware tier, such as "Free", "Small", or "Medium". Will use the project default tier if not provided.
- publishApiEndpoint: (Optional) Whether or not to publish an API endpoint from the resulting output.
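Building the command array from a shell-style string can be done with the standard library's shlex; the run title and tier below are placeholder values:

```python
import shlex

# A run command is an array of strings: the program first, then each
# argument as its own element. shlex.split handles quoting for you.
command = shlex.split("main.py --epochs 10 'hi mom'")
# command == ["main.py", "--epochs", "10", "hi mom"]

# With a Domino client instance, the run would then be started with e.g.:
# domino.runs_start(command, title="Training run", tier="Small")
```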
runs_start_blocking(command, isDirect, commitId, title, tier, publishApiEndpoint, poll_freq=5, max_poll_time=6000)
Same as runs_start, except it makes a blocking request that waits until the job is finished. The parameters are:
- command: The command to run as an array of strings, where members of the array represent arguments of the command. E.g. ["main.py", "hi mom"]
- isDirect: (Optional) Whether or not this command should be passed directly to a shell.
- commitId: (Optional) The commitId to launch from. If not provided, will launch from the latest commit.
- title: (Optional) A title for the run.
- tier: (Optional) The hardware tier to use for the run. Will use the project default tier if not provided.
- publishApiEndpoint: (Optional) Whether or not to publish an API endpoint from the resulting output.
- poll_freq: (Optional) Number of seconds between polls of the Domino server for the status of the running task.
- max_poll_time: (Optional) Maximum number of seconds to wait for a task to complete. If this threshold is exceeded, an exception is raised.
- retry_count: (Optional) Maximum number of retries while polling (in case of transient HTTP errors). If this threshold is exceeded, an exception is raised.
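The poll_freq/max_poll_time behavior can be sketched as a generic polling loop. This mirrors the documented semantics only; it is not the library's implementation, and the status strings are assumptions:

```python
import time

def poll_until_done(check_status, poll_freq=5, max_poll_time=6000):
    """Poll every poll_freq seconds; raise once max_poll_time is exceeded."""
    deadline = time.monotonic() + max_poll_time
    while time.monotonic() < deadline:
        status = check_status()
        if status in ("Succeeded", "Failed", "Error"):
            return status
        time.sleep(poll_freq)
    raise RuntimeError(f"run did not finish within {max_poll_time} seconds")
```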
files_upload(path, file)
Upload a Python file object into the specified path inside the project. See examples/upload_file.py for an example. The parameters, both of which are required, are:
- path: The path to save the file to. For example, /README.md will write to the root directory of the project, while /data/numbers.csv will save the file to a subfolder named data (if the data folder does not yet exist, it will be created).
- file: A Python file object. For example: f = open("authors.txt", "rb")
app_publish(unpublishRunningApps=True, hardwareTierId=None)
Publishes an app in the Domino project, or republishes an existing app. The parameters are:
- unpublishRunningApps: (Defaults to True) Will check for any active app instances in the current project and unpublish them before publishing.
- hardwareTierId: (Optional) Will launch the app on the specified hardware tier. Only applies to Domino 3.4+.
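The default unpublish-then-publish sequence can be sketched with a stand-in client. FakeDomino only mimics the documented behavior so the example runs offline; the app names and hardware tier ID are placeholders:

```python
class FakeDomino:
    """Stand-in illustrating the documented app_publish default."""
    def __init__(self):
        self.running_apps = ["old-app-instance"]  # pretend one app is live

    def app_publish(self, unpublishRunningApps=True, hardwareTierId=None):
        if unpublishRunningApps:
            self.running_apps.clear()  # stop active instances first
        self.running_apps.append("new-app-instance")

domino = FakeDomino()
domino.app_publish(hardwareTierId="small-k8s")  # tier ID is a placeholder
```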
job_start(command, commit_id=None, hardware_tier_name=None, environment_id=None, on_demand_spark_cluster_properties=None)
Starts a new Job (run) in the project. The parameters are:
- command (string): Command to execute in the Job, e.g. domino.job_start(command="main.py arg1 arg2")
- commit_id (string): (Optional) The commitId to launch from. If not provided, will launch from the latest commit.
- hardware_tier_name (string): (Optional) The name of the hardware tier to launch the job in. If not provided, the default hardware tier for the project is used.
- environment_id (string): (Optional) The environment ID to launch the job with. If not provided, the default environment for the project is used.
- on_demand_spark_cluster_properties (dict): (Optional) On-demand Spark cluster properties. The following properties can be provided:
{
    "computeEnvironmentId": "<Environment ID configured with Spark>",
    "executorCount": "<number of executors in cluster>",  # optional, defaults to 1
    "executorHardwareTierId": "<hardware tier ID for Spark executors>",  # optional, defaults to last used historically if available
    "masterHardwareTierId": "<hardware tier ID for Spark master>",  # optional, defaults to last used historically if available
    "executorStorageMB": "<executor storage in MB>",  # optional, defaults to 0; 1 GB is 1000 MB here
}
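A concrete properties dict might look like the sketch below. The environment ID is a placeholder, and keys left out fall back to the defaults documented above:

```python
# Placeholder properties for an on-demand Spark cluster.
spark_props = {
    "computeEnvironmentId": "<environment-id-configured-with-spark>",
    "executorCount": 2,
    "executorStorageMB": 1000,  # 1 GB of executor storage
}

# With a Domino client instance, the job would then be started with:
# domino.job_start("main.py arg1 arg2",
#                  on_demand_spark_cluster_properties=spark_props)
```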
The python-domino client comes bundled with an Operator for use with Airflow as an extra. To install its dependencies when installing the package from GitHub, add the airflow flag to extras with pip:
pip install -e git+https://github.com/dominodatalab/python-domino.git@master#egg=domino[airflow]
DominoOperator
from domino.airflow import DominoOperator
Allows a user to schedule Domino runs via Airflow. Follows the same function signature as domino.runs_start with two extra arguments:
- startup_delay: Optional[int] = 10. Adds a startup delay to your job; useful if you want to delay execution until after other work finishes.
- include_setup_log: Optional[bool] = True. Determines whether or not to publish the setup log of the job as the log prefix before stdout.
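The arguments such a task might take can be sketched as a plain dict. The task_id, project, and command values are placeholders; in a DAG these would be passed as keyword arguments to DominoOperator:

```python
# Placeholder keyword arguments for a DominoOperator task: runs_start's
# parameters plus the two operator-specific extras.
operator_kwargs = {
    "task_id": "nightly_domino_run",       # standard Airflow task id
    "project": "my_username/quick-start",  # placeholder project identifier
    "command": ["main.py", "nightly"],
    "startup_delay": 10,         # seconds to wait before starting the run
    "include_setup_log": True,   # prefix stdout with the setup log
}
```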
This library is made available under the Apache 2.0 License. This is an open-source project of Domino Data Lab.
You can find the complete library, with documentation and example code, in the public repository at https://github.com/dominodatalab/python-domino.