Latest (5.8)

Distributed GPUs with Open MPI

The Message Passing Interface (MPI) is a communication protocol for distributed parallel computing. Domino validates the use of Open MPI, a popular open-source MPI distribution that is widely used in high-performance computing.

Open MPI has these features:

  • Leading open-source MPI distribution: Open MPI provides low latency, high bandwidth, gradual parallelism, and flexibility.

  • Support for machine learning in high-performance environments: MPI is the underlying communication mechanism for higher-level machine learning training libraries. For example, Horovod often uses MPI to train models in high-performance environments.
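
The collective operation that training libraries such as Horovod lean on most is an allreduce: every worker contributes its local gradients, and every worker receives the same averaged result. The following is a minimal single-process sketch of that idea in plain Python. It does not call MPI; the function name `allreduce_mean` and the sample gradients are illustrative assumptions. In a real job, each gradient list would live in a separate MPI process and the reduction would be carried out by the MPI library.

```python
from typing import List


def allreduce_mean(per_worker_grads: List[List[float]]) -> List[List[float]]:
    """Simulate an averaging allreduce across workers.

    Each inner list holds one worker's local gradient vector. Every worker
    ends up with the same element-wise mean, which is what a data-parallel
    training step needs before it applies an update.
    """
    if not per_worker_grads:
        raise ValueError("need at least one worker")
    length = len(per_worker_grads[0])
    if any(len(g) != length for g in per_worker_grads):
        raise ValueError("all workers must hold gradients of the same shape")

    num_workers = len(per_worker_grads)
    # Reduce: sum each gradient component over all workers, then average.
    mean = [sum(column) / num_workers for column in zip(*per_worker_grads)]
    # Broadcast: every worker receives its own copy of the result.
    return [list(mean) for _ in range(num_workers)]


if __name__ == "__main__":
    # Hypothetical gradients from three workers.
    local = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(allreduce_mean(local))
```

Because every worker applies the same averaged gradient, the model replicas stay in sync without a central parameter server.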

Orchestrate Open MPI on Domino

Domino can dynamically provision and orchestrate an MPI cluster directly on the infrastructure backing the Domino deployment. You get quick access without needing an IT team.

When you start a Domino workspace for interactive work or a Domino job for batch processing, Domino creates and manages a containerized MPI cluster and makes it available to your execution.
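
Once a cluster is attached to an execution, MPI programs are usually started with Open MPI's `mpirun` launcher. The sketch below only assembles such a command line; it is not Domino's documented procedure. This page does not describe how the cluster's hosts are exposed to your execution, so the hostnames, slot counts, script name, and the helper `build_mpirun_command` are all hypothetical. A managed cluster may already supply host information, in which case some of these flags would be unnecessary.

```python
import shlex
from typing import Dict, List


def build_mpirun_command(hosts: Dict[str, int], script: str) -> List[str]:
    """Build an Open MPI launch command that starts one process per slot.

    `hosts` maps a hostname to the number of processes (slots) to run there.
    """
    if not hosts:
        raise ValueError("need at least one host")
    total = sum(hosts.values())
    # Open MPI accepts a comma-separated host list in host:slots form.
    host_spec = ",".join(f"{name}:{slots}" for name, slots in hosts.items())
    return ["mpirun", "-np", str(total), "--host", host_spec, "python", script]


if __name__ == "__main__":
    # Hypothetical two-worker cluster with two slots (for example, GPUs) each.
    cmd = build_mpirun_command({"worker-0": 2, "worker-1": 2}, "train.py")
    print(shlex.join(cmd))
```

Returning the command as a list of arguments keeps it safe to pass to `subprocess.run` without shell quoting issues.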

Use cases

Domino on-demand MPI clusters are suitable for the following workloads:

Distributed multi-GPU training

Open MPI is ideal for distributed multi-GPU and multi-CPU training of TensorFlow, PyTorch, Keras, or MXNet models.
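
In data-parallel training, each MPI process typically works on its own slice of the data and drives one GPU. As a rough sketch, a process launched by Open MPI can read its rank, the total process count, and its node-local rank from the `OMPI_COMM_WORLD_*` environment variables that the launcher sets. The helper names `rank_info` and `shard` and the toy dataset are assumptions for illustration; a real training script would usually obtain the same values through its framework or an MPI binding.

```python
import os
from typing import List, Mapping, Sequence, Tuple


def rank_info(env: Mapping[str, str] = os.environ) -> Tuple[int, int, int]:
    """Return (rank, world size, local rank) for this process.

    Falls back to a single-process layout when the variables are absent,
    for example when the script is run without mpirun.
    """
    rank = int(env.get("OMPI_COMM_WORLD_RANK", "0"))
    size = int(env.get("OMPI_COMM_WORLD_SIZE", "1"))
    local_rank = int(env.get("OMPI_COMM_WORLD_LOCAL_RANK", "0"))
    return rank, size, local_rank


def shard(samples: Sequence[int], rank: int, size: int) -> List[int]:
    """Give each rank every size-th sample, starting at its own rank."""
    return list(samples[rank::size])


if __name__ == "__main__":
    rank, size, local_rank = rank_info()
    data = list(range(10))  # stand-in for a real dataset
    # The local rank is commonly used as the index of the GPU on this node.
    print(f"rank {rank}/{size} gpu {local_rank} samples {shard(data, rank, size)}")
```

Strided sharding keeps the slices disjoint and nearly equal in size, so no sample is processed twice within an epoch.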

High performance computing

MPI clusters have lower overhead than other distributed computing systems and are highly customizable.

Copyright © 2023 Domino Data Lab. All rights reserved.