Configuring prerequisites

Before you can start using on-demand Ray clusters on Domino, ensure that this functionality is enabled and properly configured on your deployment.

Note

Domino on-demand Ray functionality is available starting with Domino 4.5.

Enabling Ray on your deployment

Domino Administrators need to:

  • Enable on-demand Ray functionality

    Set the ShortLived.RayClustersEnabled feature flag to true.

    The flag is on by default unless a Domino Administrator chooses to disable it for a deployment.

Creating a base Ray cluster environment

By default, Domino does not include a Ray-compatible Compute Environment that can be used for the components of the cluster. Until at least one such environment is available, you cannot create a cluster.

Note that when using on-demand Ray in Domino you will have two separate environments - one for the Ray cluster and one for the workspace/job.

To create a new base Ray cluster environment, follow the general Environment Management process and set the following Environment attributes.

[Figure: new Ray base environment modal]

  • Base image

    Select the Custom Image option and specify an image URI that points to a deployable Ray image.

    It is recommended that you use the latest release tag for your desired version of Ray from the available options published by the Ray project at https://hub.docker.com/u/rayproject.

    The available images include a core image as well as an image with all required prerequisites for ML workloads. Each image is also published for different Python versions and in CPU and GPU variants.

    For example, for Ray 1.3.0 on Python 3.8, the options are:

    • rayproject/ray:1.3.0-py38

    • rayproject/ray:1.3.0-py38-gpu

    • rayproject/ray-ml:1.3.0-py38

    • rayproject/ray-ml:1.3.0-py38-gpu

  • Supported clusters

    Select the Domino managed Ray option (REQUIRED). This will ensure that the environment will be available for use when creating Ray clusters from workspaces and jobs.

  • Visibility

    You can set this attribute the same way you would for any other Compute Environment based on your desired visibility.

  • Dockerfile Instructions

    Leave blank to use the images as provided by the Ray project.

    You can modify this section to include any additional packages that your workloads need to have available on the Ray cluster nodes.

    To learn more, refer to Managing dependencies.

  • Pluggable Notebooks / Workspace Sessions

    This section should remain blank as the Ray base environments are not intended to also include notebook configuration.
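
The Dockerfile Instructions attribute described above can be sketched as follows. This is a minimal illustration, not a required configuration: it assumes a base image such as rayproject/ray:1.3.0-py38 and that pip on that image installs into the Python interpreter used by Ray. The Torch packages and versions are illustrative only. Replace them with whatever your workloads need, pinned to the same versions you install in the workspace/job environment.

```dockerfile
### Illustrative only: extra packages for the Ray cluster nodes.
### Pin the same versions that you install in the workspace/job environment.
RUN pip install --no-cache-dir torch==1.8.0 torchvision==0.9.0
```

If you use one of the rayproject/ray-ml images, check which packages it already provides before adding your own, so that you do not install a conflicting version.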

Preparing your Ray compute environment

In addition to the base Ray cluster environment, you also need to configure the Ray compute environments for workspaces and/or jobs that will connect to your cluster.

You can either extend the Dockerfile Instructions section of an existing environment or create a new environment that uses an existing environment as its base.

You can use the following instructions and adapt them as needed.

### Change Ray version as needed
ENV RAY_VERSION=1.3.0

### If you want to install Ray RLlib or "all", which includes it, you must
### install "cmake" first.
RUN sudo apt-get update && sudo apt-get install -y cmake

### Change this depending on which Ray extras you want to install.
### The options are ray, ray[tune], ray[rllib], ray[serve].
### If you want everything, you can just use ray[all].
### See the note above: "cmake" is required for ray[rllib] or ray[all].
RUN pip install "ray[all]==$RAY_VERSION"

### Add any additional packages that you may need which are not included
### in the base image you are using for the compute environment. The
### versions of these packages should match the versions installed on
### the base Ray cluster image.
### For example, for Torch you may include
### RUN pip install torch==1.8.0 torchvision==0.9.0

USER ubuntu

Note

The Python and Ray versions in your Domino execution compute environment must match those in your Ray cluster compute environment.
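
One way to catch a version mismatch early is a build-time check in the Dockerfile Instructions of the workspace/job environment. The lines below are a hypothetical sketch, not part of the standard Domino instructions. They assume that RAY_VERSION was set with ENV earlier in the same instructions, that they are placed after the pip install step, and that python on the PATH is the interpreter pip installed Ray into.

```dockerfile
### Hypothetical check: fail the build if the installed Ray version
### differs from the pinned RAY_VERSION.
RUN python -c "import ray; assert ray.__version__ == '${RAY_VERSION}', ray.__version__"

### Print the Python version in the build log so it can be compared
### with the Python version of the Ray cluster image tag (for example py38).
RUN python --version
```

This only confirms what is installed in the environment being built. You still need to compare the result against the image tag chosen for the base Ray cluster environment.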