Before you can use on-demand Ray clusters on Domino, make sure that the functionality is enabled and properly configured on your deployment.
Domino on-demand Ray functionality is available starting with Domino 4.5.
Domino Administrators need to:
Enable on-demand Ray functionality
Make sure the ShortLived.RayClustersEnabled feature flag is turned on.
The flag is on by default unless a Domino Administrator chooses to disable it for a deployment.
By default, Domino does not include a Ray-compatible Compute Environment that can be used for the components of the cluster. Until at least one such environment is available, you cannot create a cluster.
Note that when using on-demand Ray in Domino you will have two separate environments: one for the Ray cluster and one for the workspace or job that connects to it.
Select the Custom Image option and specify an image URI that points to a deployable Ray image.
We recommend using the latest release tag for your desired version of Ray from the options published by the Ray project at https://hub.docker.com/u/rayproject.
The available images include a core image as well as an image with all required prerequisites for ML workloads. The images are also available for different Python versions and in CPU and GPU variants.
For example, for Ray 1.3.0 for Python 3.8 the options are:
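As a rough illustration of how such image URIs are typically composed, the sketch below builds candidate URIs for a given Ray and Python version. The repository names (rayproject/ray for the core image, rayproject/ray-ml for the ML image) and the tag layout are assumptions for illustration only; confirm that each tag actually exists on Docker Hub before using it in an environment.

```python
from itertools import product
from typing import List

# Assumed repository names: core image and ML image.
REPOSITORIES = ("rayproject/ray", "rayproject/ray-ml")
# Assumed variant suffixes for CPU and GPU builds.
VARIANTS = ("cpu", "gpu")


def candidate_image_uris(ray_version: str, python_version: str) -> List[str]:
    """Build candidate image URIs for one Ray version and Python version.

    The tag layout "<ray>-py<major><minor>-<variant>" is an assumption;
    verify every tag against the published list before relying on it.
    """
    major, minor = python_version.split(".")[:2]
    py_tag = f"py{major}{minor}"
    return [
        f"{repo}:{ray_version}-{py_tag}-{variant}"
        for repo, variant in product(REPOSITORIES, VARIANTS)
    ]


if __name__ == "__main__":
    for uri in candidate_image_uris("1.3.0", "3.8"):
        print(uri)
```

Generating the candidates in one place makes it easier to keep the cluster image and the workspace environment pinned to the same Ray and Python versions.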
Select the Domino managed Ray option (REQUIRED). This will ensure that the environment will be available for use when creating Ray clusters from workspaces and jobs.
Set the environment's visibility the same way you would for any other Compute Environment.
Leave the Docker Instructions section blank to use the images as provided by the Ray project.
You can modify this section to include any additional packages that your workloads need on the Ray cluster nodes.
To learn more, refer to Managing dependencies.
Pluggable Notebooks / Workspace Sessions
This section should remain blank as the Ray base environments are not intended to also include notebook configuration.
In addition to the base Ray cluster environment, you also need to configure the Ray compute environments for workspaces and/or jobs that will connect to your cluster.
You can either enhance the Docker Instructions section of an existing environment or create a new environment that uses an existing environment as its base.
You can use the following instructions and adapt them as needed.
    ### Change Ray version as needed.
    ENV RAY_VERSION=1.3.0

    ### If you want to install Ray RLlib or "all", which includes it, you must
    ### install "cmake" first.
    RUN sudo apt-get install -y cmake

    ### Change this depending on which Ray extras you want to install.
    ### The options are ray, ray[tune], ray[rllib], ray[serve].
    ### If you want everything, you can just use ray[all].
    ### See the note above: "cmake" is required for ray[rllib] or ray[all].
    RUN pip install "ray[all]==$RAY_VERSION"

    ### Add any additional packages that you may need which are not included
    ### in the base image you are using for the compute environment. The
    ### versions of these packages should match the versions on the base
    ### Ray cluster image.
    ### For example, for Torch you may include
    ### RUN pip install torch==1.8.0 torchvision==0.9.0

    USER ubuntu
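Once an environment has been built with instructions like these, one quick sanity check is to try importing the Ray modules you expect to use from a workspace. The following is a minimal sketch, not part of Domino or Ray; the helper name failed_imports is hypothetical, and the module list assumes ray[all] was installed.

```python
import importlib
from typing import Iterable, List


def failed_imports(module_names: Iterable[str]) -> List[str]:
    """Return the names of modules that could not be imported."""
    failed = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers both a missing module and a missing dependency of it.
            failed.append(name)
    return failed


if __name__ == "__main__":
    # Assumed module list for an environment built with ray[all].
    expected = ["ray", "ray.tune", "ray.rllib", "ray.serve"]
    problems = failed_imports(expected)
    print("failed imports:", problems if problems else "none")
```

A non-empty result usually points to a missing extra or prerequisite in the Docker instructions.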
The Python and Ray versions between your Domino execution compute environment and your Ray cluster compute environment must match.
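The version requirement can be checked with a small script. The sketch below is illustrative only: the helper names are hypothetical, the cluster values are hard-coded placeholders that you would replace with values reported from a cluster node, and comparing Python at major.minor granularity is an assumption.

```python
import sys
from importlib import metadata
from typing import Dict, List


def local_versions() -> Dict[str, str]:
    """Collect the Python (major.minor) and Ray versions of this environment."""
    try:
        ray_version = metadata.version("ray")
    except metadata.PackageNotFoundError:
        ray_version = "not installed"
    python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return {"python": python_version, "ray": ray_version}


def version_mismatches(execution: Dict[str, str], cluster: Dict[str, str]) -> List[str]:
    """Describe every component whose version differs between the two environments."""
    problems = []
    for key in ("python", "ray"):
        if execution.get(key) != cluster.get(key):
            problems.append(
                f"{key}: execution={execution.get(key)} cluster={cluster.get(key)}"
            )
    return problems


if __name__ == "__main__":
    # Placeholder cluster values; replace with what your cluster image provides.
    cluster = {"python": "3.8", "ray": "1.3.0"}
    for line in version_mismatches(local_versions(), cluster) or ["versions match"]:
        print(line)
```

Running a check like this at the start of a workspace or job surfaces a mismatch before any work is submitted to the cluster.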