Before you can start using on-demand Ray clusters on Domino, you must ensure that this functionality is enabled and properly configured on your deployment.
Note: Domino on-demand Ray functionality is available starting with Domino 4.5.
On-demand Ray functionality is controlled by the ShortLived.RayClustersEnabled feature flag, which must be set to true.
The flag is on by default unless a Domino administrator disables it for a deployment. If it has been disabled, a Domino administrator must set it back to true.
By default, Domino does not include a Ray-compatible compute environment that can be used for the components of the cluster. You cannot create a cluster until at least one such environment is available.
When using on-demand Ray in Domino, you work with two separate environments: one for the Ray cluster (the base or worker environment) and one for the workspace or job execution (the compute environment).
To create a new base Ray cluster environment, follow the general Environment management instructions with the following attributes:
- Base Image
  Select Custom Image and enter an image URI that points to a deployable Ray image.
  Domino recommends that you use the latest release tag for your version of Ray from the options published at the cluster-environment-images repository. Domino’s repository contains the latest Ray images curated for Domino. You can also get Ray images from the Rayproject repository.
  The available images include a core image and an image with all required prerequisites for ML workloads. The images are also available for different Python versions and in CPU and GPU variants.
  For example, for Ray 1.3.0 and Python 3.8 the options are:
  - rayproject/ray:1.3.0-py38
  - rayproject/ray:1.3.0-py38-gpu
  - rayproject/ray-ml:1.3.0-py38
  - rayproject/ray-ml:1.3.0-py38-gpu
- Supported Clusters
  Select Domino managed Ray (Required). This ensures that the environment will be available for use when creating Ray clusters from workspaces and jobs.
- Visibility
  Set this attribute the same way you would for any other compute environment, based on your desired visibility.
- Dockerfile Instructions
  Add the following two lines:
      USER root
      RUN usermod -u 12574 ray
  You can modify this section to include additional packages that your workloads need and that must be available on the Ray cluster nodes.
  See Manage dependencies to learn more.
- Pluggable Notebooks / Workspace Sessions
  Leave this section blank because the Ray base environments are not intended to include notebook configuration.
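The four image names listed under Base Image appear to follow a common pattern: repository (ray or ray-ml), Ray version, Python version without the dot, and an optional -gpu suffix. The sketch below composes an image reference from those parts. The function name ray_image_tag is hypothetical, and the pattern is inferred only from the Ray 1.3.0 examples above, so confirm that a tag actually exists in the repository before using it.

```python
def ray_image_tag(ray_version: str, python_version: str,
                  ml: bool = False, gpu: bool = False) -> str:
    """Compose an image reference in the style of the examples above.

    Assumption: other Ray releases use the same naming pattern as the
    1.3.0 examples. This is not guaranteed; check the repository.
    """
    # "3.8" or "3.8.10" both become "38"
    major, minor = python_version.split(".")[:2]
    repo = "rayproject/ray-ml" if ml else "rayproject/ray"
    suffix = "-gpu" if gpu else ""
    return f"{repo}:{ray_version}-py{major}{minor}{suffix}"


print(ray_image_tag("1.3.0", "3.8"))                    # rayproject/ray:1.3.0-py38
print(ray_image_tag("1.3.0", "3.8", ml=True, gpu=True)) # rayproject/ray-ml:1.3.0-py38-gpu
```

The resulting string is what you would enter as the Custom Image URI for the base Ray cluster environment.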
In addition to the base Ray cluster environment, you must configure the Ray compute environments for workspaces and/or jobs that will connect to your cluster.
Domino recommends that you use the Ray base image to create a compatible workspace. See Domino Ray environment for more information about this base image.
Customize this workspace compute environment in one of the following ways:
- Use the image mentioned previously and add Dockerfile Instructions.
- Use your own image and customizations. Then, use the following Dockerfile Instructions to add the Ray packages:
    USER root

    ### Change the Ray version as needed.
    ENV RAY_VERSION=<ENTER_RAY_VERSION>

    ### If you want to install Ray RLlib, or "all" (which includes it),
    ### you must install "cmake" first.
    RUN sudo apt-get install -y cmake

    ### Change this depending on which Ray extras you want to install.
    ### The options are ray, ray[tune], ray[rllib], and ray[serve].
    ### If you want everything, use ray[all].
    ### As noted above, "cmake" is required for ray[rllib] or ray[all].
    RUN pip install ray[all]==$RAY_VERSION

    ### Add any additional packages that you may need which are not included
    ### in the base image you are using for the compute environment. The
    ### versions of these packages should match the versions installed on
    ### the base Ray cluster image.
    ### For example, for Torch you may include:
    ### RUN pip install torch==1.8.0 torchvision==0.9.0

    # Set this USER line to match your requirements:
    USER ubuntu
Note: The Python and Ray versions in your Domino execution compute environment and your Ray cluster compute environment must match.
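One way to catch a mismatch early is to compare the versions reported by both environments before running a workload. The following is a minimal sketch, not part of Domino or Ray: RuntimeVersions and version_mismatches are hypothetical names, and comparing Python only at the major.minor level is an assumption that you may want to tighten.

```python
from typing import List, NamedTuple


class RuntimeVersions(NamedTuple):
    python: str  # for example "3.8.10"
    ray: str     # for example "1.3.0"


def version_mismatches(execution: RuntimeVersions,
                       cluster: RuntimeVersions) -> List[str]:
    """Return a description of each version difference (empty if none)."""
    problems = []
    # Assumption: Python is considered matching when major.minor agree.
    if execution.python.split(".")[:2] != cluster.python.split(".")[:2]:
        problems.append(
            f"Python differs: execution {execution.python}, cluster {cluster.python}"
        )
    # Ray versions are compared exactly.
    if execution.ray != cluster.ray:
        problems.append(
            f"Ray differs: execution {execution.ray}, cluster {cluster.ray}"
        )
    return problems


workspace = RuntimeVersions(python="3.8.10", ray="1.3.0")
cluster = RuntimeVersions(python="3.8.5", ray="1.3.0")
print(version_mismatches(workspace, cluster))  # prints []
```

In a workspace, the execution-side values could come from platform.python_version() and ray.__version__. The cluster-side values would have to be gathered on the cluster itself, for example by a remote task that returns the same two values.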