Cache Environment Images in EKS

When a user launches a Domino Run, part of the start-up process is loading the user’s environment onto the node that will host the Run. For large images, transferring the image to a new node can take several minutes. After an image has been loaded onto a node, it is cached there, and future Runs on that node that use the same environment start faster.

When running Domino on EKS, you can pre-cache popular environments and base images on the Amazon Machine Image (AMI) used for new nodes. This can speed up the start time of Runs on new nodes significantly. This page describes the process of creating a new AMI with cached environments and configuring EKS to use it for new nodes.

AMI requirements

In addition to any dependencies required by Kubernetes itself, your AMI must contain the following:

  • Docker

  • Cache of Domino’s compute environments

  • NVIDIA Docker 2 (GPU nodes only)

  • NVIDIA GPU driver 410+ (GPU nodes only)

  • NVIDIA set as the default Docker runtime (GPU nodes only)

For simplicity, Domino recommends that you use the official EKS default AMIs, which come pre-configured with Docker and the GPU tools.

  • Read about the official EKS AMI Domino recommends for default compute nodes.

  • Read about the official EKS AMI Domino recommends for GPU nodes.

Alternatively, you can use Amazon’s build scripts to create your own AMI for use with EKS.

AMI operations

The following sections describe the operations you perform on an EC2 instance to set it up as the template for a new AMI suitable for Domino.

Install Docker

Read the official instructions about how to install Docker.

Pull environment images

To pre-cache environment images, run docker pull on the AMI template instance. You can pull either the base images that your environments are built on or the built environment images from the Domino internal registry.

To pull a Domino Standard Environment base image, run the following command, substituting the version string of the image you want to cache:

docker pull quay.io/domino/base:<desired version>

To pull a built image from the Domino internal registry, find its URI on the Revisions tab of the environment details page.

[Screenshot: environment image URI on the Revisions tab]

For example, to cache revision #9 of the environment shown in the previous screenshot, you would run:

docker pull 100.97.56.113:5000/domino-5d7abf2715f3690007f23081:9
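If you want to cache more than one image, the two pull commands above can be wrapped in a small loop. The following is a minimal sketch, not part of Domino's tooling: the function name pull_images, the list file, and the DRY_RUN switch are all illustrative assumptions. It reads one image reference per line and pulls each in turn.

```shell
# Sketch: pull every image named in a list file so it is cached on the
# AMI template instance.
# Assumptions (not from Domino): the list file holds one image reference
# per line; empty lines and lines starting with '#' are skipped.
# Set DRY_RUN=1 to print the docker commands instead of running them.
pull_images() {
  pi_list_file=$1
  while IFS= read -r pi_image || [ -n "$pi_image" ]; do
    # Skip empty lines and comment lines.
    case "$pi_image" in
      ''|'#'*) continue ;;
    esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
      printf 'docker pull %s\n' "$pi_image"
    else
      # Stop at the first failed pull so a partial cache is noticed.
      docker pull "$pi_image" </dev/null || return 1
    fi
  done < "$pi_list_file"
}
```

For example, with a hypothetical images.txt that lists the base image and the internal registry URI shown above, running `DRY_RUN=1 pull_images images.txt` prints the commands for review, and `pull_images images.txt` performs the pulls.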

Install NVIDIA Docker 2.0 (GPU AMIs only)

Read the official instructions for installing the nvidia-docker 2.0 runtime.

Install GPU drivers (GPU AMIs only)

To use the GPU on a GPU node, you must install the appropriate driver on the machine image. Domino does not require a specific driver version. However, if you want to use a Domino Standard Environment, the driver must be compatible with the version of CUDA included in that environment.

View a compatibility matrix.

If you’d like to install the GPU drivers manually, you can follow these instructions.

To validate that your GPU machine is configured properly, reboot the machine and run the following:

docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi

If the driver and runtime are installed correctly, the output shows the driver version and the GPU devices.
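The AMI requirements list driver 410 or later for GPU nodes. As a rough sketch, you could compare the installed driver's major version against that minimum with a helper like the one below. The function name driver_meets_minimum is hypothetical, and checking only the major version is a simplifying assumption.

```shell
# Sketch: succeed (exit 0) when the major part of a driver version string
# is at least the given minimum. The default of 410 comes from the AMI
# requirements on this page; the helper itself is illustrative.
driver_meets_minimum() {
  dmm_major=${1%%.*}      # text before the first dot, e.g. 418 from 418.87.00
  dmm_min=${2:-410}
  case "$dmm_major" in
    ''|*[!0-9]*) return 2 ;;   # empty or not a plain number
  esac
  [ "$dmm_major" -ge "$dmm_min" ]
}

# Usage on a GPU node (not executed here):
#   version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   driver_meets_minimum "$version" && echo "driver OK: $version"
```

This check only covers the minimum in the requirements list. Compatibility with a particular CUDA version still needs to be confirmed against the compatibility matrix.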

Change the default Docker runtime (GPU AMIs only)

Read the official instructions from NVIDIA about using the container runtime.

Restart Docker for the change to take effect.
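For orientation, the change usually amounts to setting "default-runtime" in the Docker daemon configuration file. The sketch below writes such a file. The JSON follows the form commonly shown in NVIDIA's documentation, but treat the exact contents as an assumption and confirm them against the official instructions. The function name and its path argument are illustrative.

```shell
# Sketch: write a Docker daemon configuration that makes "nvidia" the
# default runtime. Caution: this overwrites the target file, so merge by
# hand if the instance already has settings in /etc/docker/daemon.json.
write_nvidia_daemon_json() {
  wndj_target=${1:-/etc/docker/daemon.json}
  cat > "$wndj_target" <<'EOF'
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
EOF
}

# Usage on the GPU instance (not executed here):
#   write_nvidia_daemon_json /tmp/daemon.json
#   sudo cp /tmp/daemon.json /etc/docker/daemon.json
#   sudo systemctl restart docker
#   docker info | grep -i 'default runtime'
```

Writing to a scratch path first lets you inspect the file before copying it into place and restarting Docker.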

Complete AMI caching procedure

  1. Determine which AMI you want to use as the base for the new AMI. If you’re performing this operation on an operational Domino node pool, you must use the AMI that’s currently used in the active launch configuration.

    [Screenshot: launch configuration name]

    After you’ve identified the name of the active launch configuration, view its details to see the AMI ID it uses.

    [Screenshot: AMI ID in the launch configuration details]

  2. Launch a new EC2 instance from the base AMI.

  3. Connect to the instance through SSH and perform any of the operations listed previously that you want to apply to your new AMI, including pulling any environment images you want to cache.

  4. Create a new AMI from the EC2 instance.

  5. Create a copy of the launch configuration currently used by any Auto Scaling groups (ASGs) you want to switch to the new AMI.

  6. In the copied launch configuration, change the AMI ID to the ID of the new AMI you created.

  7. For any ASGs that you want to start using the new AMI, switch them over to the new launch configuration.

After you complete the final step, the ASGs you switched to the new launch configuration use the new AMI whenever they create nodes. Those nodes start with the environment images you pulled onto the AMI template already cached, so new Domino Runs on them start faster.
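If you prefer the AWS CLI to the console, the lookup in step 1, the AMI creation in step 4, and the switch in step 7 can be sketched as follows. This assumes the AWS CLI is installed and configured with suitable permissions. The function names, the DRY_RUN switch, and every name or ID you pass in are illustrative, not values from Domino. Steps 5 and 6 are left to the console here because creating a launch configuration from the CLI requires restating the original configuration's settings.

```shell
# Sketch of steps 1, 4, and 7 with the AWS CLI.
# With DRY_RUN=1 each function prints the command instead of calling AWS.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: print the launch configuration name attached to an ASG.
asg_launch_config() {
  run aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$1" \
    --query 'AutoScalingGroups[0].LaunchConfigurationName' --output text
}

# Step 1 (continued): print the AMI ID used by a launch configuration.
launch_config_ami() {
  run aws autoscaling describe-launch-configurations \
    --launch-configuration-names "$1" \
    --query 'LaunchConfigurations[0].ImageId' --output text
}

# Step 4: create an AMI from the prepared instance and print its ID.
# Note: by default create-image reboots the instance before imaging it.
create_cached_ami() {
  run aws ec2 create-image --instance-id "$1" --name "$2" \
    --query 'ImageId' --output text
}

# Step 7: point an ASG at the new launch configuration.
switch_asg() {
  run aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name "$1" --launch-configuration-name "$2"
}
```

Running with `DRY_RUN=1` first, for example `DRY_RUN=1 switch_asg my-asg my-launch-config-v2` with placeholder names, lets you review each command before it changes anything in your account.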

Copyright © 2022 Domino Data Lab. All rights reserved.