Message Passing Interface (MPI) is a communication protocol for distributed parallel computing. Domino validates the use of Open MPI, an open-source MPI distribution that is widely used in high-performance computing.
Open MPI has these features:
-
Leading Open Source MPI Distribution: Open MPI provides low latency, high bandwidth, gradual parallelism, and flexibility.
-
Support for Machine Learning in High Performance Environments: MPI is the underlying communication mechanism for higher-level machine learning training libraries. For example, Horovod can use MPI to coordinate model training in high performance environments.
Domino can dynamically provision and orchestrate an MPI cluster directly on the infrastructure backing the Domino deployment. You get quick access without needing an IT team.
When you start a Domino workspace for interactive work or a Domino job for batch processing, Domino creates and manages a containerized MPI cluster and makes it available to your execution.
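To make the idea of running work across an MPI cluster concrete, the following is a minimal sketch of an MPI program, not Domino-specific code. It assumes the `mpi4py` package is installed in the environment (an assumption, not something this page states). The file name `partial_sum.py` and the helper `rank_slice` are hypothetical names chosen for illustration. Each process (rank) sums its own slice of a range, and the partial results are combined on rank 0.

```python
# partial_sum.py (hypothetical file name)
# Sketch only: assumes the mpi4py package is available in the environment.


def rank_slice(n_items, rank, size):
    """Return the half-open [start, stop) index range owned by `rank`.

    Items are split as evenly as possible; the first `n_items % size`
    ranks each receive one extra item.
    """
    base, extra = divmod(n_items, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def main(MPI):
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()   # index of this process
    size = comm.Get_size()   # total number of processes

    n_items = 1000
    start, stop = rank_slice(n_items, rank, size)
    partial = sum(range(start, stop))

    # Combine the partial sums; only rank 0 receives the result.
    total = comm.reduce(partial, op=MPI.SUM, root=0)
    if rank == 0:
        print(f"sum of 0..{n_items - 1} across {size} rank(s): {total}")


if __name__ == "__main__":
    try:
        from mpi4py import MPI
    except ImportError:
        print("mpi4py is not installed; skipping the MPI run")
    else:
        main(MPI)
```

With Open MPI, a script like this is typically launched with a command such as `mpirun -np 4 python partial_sum.py`. How the launch is wired to the cluster hosts depends on the deployment and is not covered by this sketch.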
Domino on-demand MPI clusters are suitable for the following workloads:
-
Distributed multi-GPU training: Open MPI is well suited to distributed multi-GPU training of TensorFlow, PyTorch, Keras, or MXNet models.
-
High Performance Computing: MPI clusters are highly customizable and can be faster than other distributed computing systems.
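The distributed training workload above usually relies on one communication pattern: every worker computes gradients on its own data, and an all-reduce averages them so that all workers apply the same update. The sketch below illustrates that pattern with plain Python floats. It assumes `mpi4py` is installed, and `average_gradients` is a hypothetical helper, not part of Horovod or Domino. Libraries such as Horovod implement this step far more efficiently on tensors.

```python
# Sketch only: illustrates gradient averaging with an all-reduce.
# Assumes the mpi4py package is available in the environment.


def average_gradients(local_grads, comm, sum_op):
    """Average each gradient value across all ranks of `comm`.

    `comm` needs `Get_size()` and `allreduce(value, op=...)`, which
    mpi4py communicators provide. Reducing one float at a time keeps the
    example simple; real code would reduce whole arrays in a single call.
    """
    world_size = comm.Get_size()
    return [comm.allreduce(g, op=sum_op) / world_size for g in local_grads]


def main(MPI):
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    # Stand-in for gradients computed from this rank's share of the data.
    local_grads = [float(rank), 2.0 * float(rank)]

    averaged = average_gradients(local_grads, comm, MPI.SUM)

    # Every rank now holds the same averaged values.
    if rank == 0:
        print(f"averaged gradients: {averaged}")


if __name__ == "__main__":
    try:
        from mpi4py import MPI
    except ImportError:
        print("mpi4py is not installed; skipping the MPI run")
    else:
        main(MPI)
```

Because the all-reduce delivers the result to every rank, no separate broadcast step is needed before the next training iteration.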