domino logo
5.0
  • Overview
  • Domino Cloud
  • Code Assist
  • Get started
  • Work with data
  • Develop models
  • Scale out distributed computing
  • Deploy models
  • Monitor models
  • Publish Apps
  • Projects
  • Collaborate
  • Workspaces
  • Jobs
  • Environments
  • Executions
  • Launchers
  • Environment variables
  • Secure credential store
  • Organizations
  • Domino API
  • Domino CLI
  • Troubleshooting
  • Get help
  • Additional resources
domino logo
About Domino
Domino Data LabKnowledge BaseData Science BlogTraining
>
User guide
>
Scale out distributed computing
>
On-demand Ray

On-demand Ray

Ray.io is a distributed execution framework that makes it easy to scale your single machine applications, with little or no changes, and to leverage state-of-the-art machine learning libraries.

Ray provides a set of core low-level primitives as well as a family of pre-packaged libraries that take advantage of these primitives to enable solving powerful machine learning problems.

The following libraries come packaged with Ray:

  • Tune: Scalable Hyperparameter Tuning

  • RaySGD: Distributed Training Wrappers

  • RLlib: Industry-Grade Reinforcement Learning

  • Ray Serve: Scalable and Programmable Serving

Additionally, Ray has been adopted as a foundational framework by a large number of open source ML frameworks which now have community-maintained Ray integrations.

Orchestrate Ray on Domino

Domino can dynamically provision and orchestrate a Ray cluster directly on the infrastructure backing the Domino instance. This allows Domino users to get quick access to Ray without having to rely on their IT team.

When you start a Domino workspace for interactive work or a Domino job for batch processing, Domino will create, manage, and make available a containerized Ray cluster to your execution.

Suitable use cases

Domino on-demand Ray clusters are suitable for the following workloads:

Distributed multi-node training

RaySGD provides a lightweight mechanism for taking existing PyTorch and Tensorflow models and scaling them across multiple machines to dramatically reduce training times. Ray is suitable for both distributed CPU and GPU training.

Hyperparameter optimization

Unresolved directive in <stdin> - include::../../../../../static/content-reuse/clusters/ray-hyperparameter-optimization.adoc[]

Reinforcement learning

Ray, in combination with the RLlib library, allows you to take advantage of a number of built-in reinforcement learning algorithms, but also provides a general framework for developing your own.

Note
Domino Data Lab
Knowledge Base
Data Science Blog
Training
Copyright © 2023 Domino Data Lab. All rights reserved.