Ray (ray.io) is a distributed execution framework that makes it easy to scale single-machine applications, with little or no code changes, and to take advantage of state-of-the-art machine learning libraries.
Ray provides a set of low-level core primitives, along with a family of pre-packaged libraries that build on these primitives to solve demanding machine learning problems.
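As a rough illustration of those low-level primitives, the sketch below wraps an ordinary Python function as a Ray task with `ray.remote` and collects the results with `ray.get`. The names `square` and `run_squares`, and the plain-Python fallback for machines where Ray is not installed, are assumptions made for this example; they are not part of Ray or Domino.

```python
# Minimal sketch of Ray's task primitive. `square` and `run_squares` are
# hypothetical names chosen for illustration only.
try:
    import ray
except ImportError:  # Ray is not installed: fall back to plain Python below
    ray = None


def square(x):
    """Ordinary single-machine function that we would like to scale out."""
    return x * x


def run_squares(values):
    """Apply `square` to each value, as Ray tasks when Ray is available."""
    if ray is None:
        return [square(v) for v in values]

    if not ray.is_initialized():
        # Assumption: no cluster address is configured, so this starts Ray locally.
        ray.init()
    remote_square = ray.remote(square)                    # wrap the function as a task
    futures = [remote_square.remote(v) for v in values]   # schedule tasks in parallel
    return ray.get(futures)                               # block until all results arrive


if __name__ == "__main__":
    print(run_squares([1, 2, 3, 4]))
```

The function body itself is unchanged; only the call site differs, which is what "little or no changes" looks like in practice for simple workloads.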
The following libraries come packaged with Ray:
Additionally, many open-source ML frameworks have adopted Ray as a foundation and now offer community-maintained Ray integrations.
Domino can dynamically provision and orchestrate a Ray cluster directly on the infrastructure backing the Domino instance. This allows Domino users to get quick access to Ray without having to rely on their IT team to create and manage a cluster for them.
See Domino’s quick-start-ray project.
Domino on-demand Ray clusters are suitable for the following workloads:
- Distributed multi-node training
RaySGD provides a lightweight mechanism for taking existing PyTorch and TensorFlow models and scaling them across multiple machines to dramatically reduce training times. Ray is suitable for both distributed CPU and GPU training.
- Hyperparameter optimization
Launch a distributed hyperparameter sweep with just a few lines of code, without adapting your existing training harness. Take advantage of a large number of advanced parameter search algorithms.
- Reinforcement learning
Ray, in combination with the RLlib library, gives you a number of built-in reinforcement learning algorithms and also provides a general framework for developing your own.
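To make the hyperparameter optimization workload above more concrete, here is a minimal sketch of a sweep fanned out as plain Ray tasks. It is not the API of any Ray search library and it does not use an advanced search algorithm: it is a simple grid search, swapped in to show the idea of running independent trials in parallel. The function `train_and_score`, the helper `grid_sweep`, and the `lr` and `depth` parameters are all hypothetical, and the fallback path exists only so the sketch also runs where Ray is not installed.

```python
# Sketch: a grid sweep where each trial runs as a Ray task. All names here
# (`train_and_score`, `grid_sweep`, `lr`, `depth`) are hypothetical.
import itertools

try:
    import ray
except ImportError:  # Ray is not installed: run trials sequentially instead
    ray = None


def train_and_score(config):
    """Stand-in for an existing training harness; returns a loss to minimize."""
    return (config["lr"] - 0.1) ** 2 + (config["depth"] - 3) ** 2


def grid_sweep(param_grid):
    """Evaluate every combination in `param_grid`; return (best_config, best_loss)."""
    keys = sorted(param_grid)
    configs = [
        dict(zip(keys, combo))
        for combo in itertools.product(*(param_grid[k] for k in keys))
    ]
    if ray is None:
        losses = [train_and_score(c) for c in configs]
    else:
        if not ray.is_initialized():
            # Assumption: no cluster address is configured, so this starts Ray locally.
            ray.init()
        trial = ray.remote(train_and_score)                    # one task per configuration
        losses = ray.get([trial.remote(c) for c in configs])   # results keep input order
    best = min(range(len(configs)), key=losses.__getitem__)
    return configs[best], losses[best]


if __name__ == "__main__":
    print(grid_sweep({"lr": [0.01, 0.1, 1.0], "depth": [2, 3, 4]}))
```

Because each trial is independent, the training function stays untouched and only the loop that launches trials changes. A dedicated search library would replace the exhaustive grid with smarter sampling and early stopping.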