Domino on AWS EKS
Domino 4 can run on a Kubernetes cluster provided by AWS Elastic Kubernetes Service. When running on EKS, the Domino 4 architecture uses AWS resources to fulfill the Domino cluster requirements as follows:
- Kubernetes control moves to the EKS control plane with managed Kubernetes masters
- Domino uses a dedicated Auto Scaling Group (ASG) of EKS workers to host the Domino platform
- ASGs of EKS workers host elastic compute for Domino executions
- AWS S3 is used to store user data, backups, and logs
- AWS EFS is used to store Domino Datasets
- The kubernetes.io/aws-ebs provisioner is used to create persistent volumes for Domino executions
All nodes in such a deployment have private IPs, and internode traffic is routed by an internal load balancer. Nodes in the cluster can optionally have egress to the Internet through a NAT gateway.
Setting up an EKS cluster for Domino
This section describes how to configure an Amazon EKS cluster for use with Domino.
No namespace configuration is necessary prior to install. Domino will create three namespaces in the cluster during installation, according to the following specifications:
||Durable Domino application, metadata, platform services required for platform operation|
||Ephemeral Domino execution pods launched by user actions in the application|
||Domino installation metadata and secrets|
The EKS cluster must have at least two ASGs that produce worker nodes with the following specifications and distinct node labels, and it may include an optional GPU pool:
The platform ASG can run in 1 availability zone or across 3 availability zones. If you want Domino to run with
some components deployed as highly available ReplicaSets, you must use 3 availability zones.
Using 2 zones is not supported, as it results in an even number of nodes in a single
To run the
default-gpu pools across multiple availability zones, you will need duplicate ASGs in
each zone with the same configuration, including the same labels, to ensure pods are delivered to the zone where the
required ephemeral volumes are available.
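The duplicate-ASG-per-zone layout described above can be sketched as one eksctl node group per availability zone. This is a minimal illustration rather than a tested configuration: the node group names, instance type, sizes, and zones are assumptions, and only the requirement that every zone's ASG carries the same dominodatalab.com/node-pool label comes from the text above.

```yaml
# Fragment of an eksctl ClusterConfig: one node group (ASG) per availability zone.
# Node group names, sizes, and zones are illustrative assumptions.
nodeGroups:
  - name: domino-default-2a
    instanceType: m5.2xlarge
    minSize: 0
    maxSize: 10
    volumeSize: 400
    availabilityZones: ["us-west-2a"]   # pin this ASG to a single zone
    labels:
      "dominodatalab.com/node-pool": "default"   # identical label in every zone
    tags:
      "k8s.io/cluster-autoscaler/node-template/label/dominodatalab.com/node-pool": "default"
  - name: domino-default-2b
    instanceType: m5.2xlarge
    minSize: 0
    maxSize: 10
    volumeSize: 400
    availabilityZones: ["us-west-2b"]
    labels:
      "dominodatalab.com/node-pool": "default"
    tags:
      "k8s.io/cluster-autoscaler/node-template/label/dominodatalab.com/node-pool": "default"
  # Repeat with the same configuration for each additional zone.
```

Keeping each ASG in a single zone lets the autoscaler grow the group in the zone where a pending pod's volume lives, which is the behavior the paragraph above asks for.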
Additional ASGs can be added with distinct
dominodatalab.com/node-pool labels to make other instance types available
for Domino executions. Read Managing the Domino compute grid
to learn how these different node types are referenced by label from the Domino application.
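As an illustration of adding another pool, the fragment below sketches an extra eksctl node group with its own dominodatalab.com/node-pool label. The pool name high-memory, the node group name, the instance type, and the sizes are all hypothetical choices made for this example.

```yaml
# Fragment of an eksctl ClusterConfig adding a pool for memory-heavy executions.
# "high-memory", "domino-high-memory", and r5.4xlarge are assumptions for illustration.
nodeGroups:
  - name: domino-high-memory
    instanceType: r5.4xlarge
    minSize: 0
    maxSize: 4
    volumeSize: 400
    labels:
      "dominodatalab.com/node-pool": "high-memory"   # distinct from platform, default, default-gpu
    tags:
      # Lets the cluster autoscaler know which label nodes from this group will carry
      # when the group is scaled to zero.
      "k8s.io/cluster-autoscaler/node-template/label/dominodatalab.com/node-pool": "high-memory"
```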
By default, AWS AMIs do not have bridge networking enabled for Docker containers. Domino requires this for environment
builds. Add --enable-docker-bridge true to the user data of the launch configuration used by all Domino ASG nodes.
- Create a copy of the launch configuration used by each Domino ASG.
- Open the User data field and add --enable-docker-bridge true to the copied launch configuration.
- Switch the Domino ASGs to use the new launch configuration.
- Drain any existing nodes in the ASG.
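For clusters whose launch configurations are managed through CloudFormation, the user data change in the steps above might look like the following sketch. It assumes the nodes use an EKS-optimized AMI, where /etc/eks/bootstrap.sh accepts the --enable-docker-bridge flag; the resource name and the NodeImageId, NodeInstanceType, and ClusterName parameters are hypothetical.

```yaml
# CloudFormation fragment (sketch). Resource and parameter names are assumptions.
Resources:
  DominoNodeLaunchConfig:
    Type: AWS::AutoScaling::LaunchConfiguration
    Properties:
      ImageId: !Ref NodeImageId            # hypothetical parameter: EKS-optimized AMI ID
      InstanceType: !Ref NodeInstanceType  # hypothetical parameter
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          set -o xtrace
          # Restore the default Docker bridge network when the node joins the cluster.
          /etc/eks/bootstrap.sh ${ClusterName} --enable-docker-bridge true
```

Existing nodes keep their old user data, which is why the last step drains them so that replacements launch from the new configuration.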
Dynamic block storage
The EKS cluster must be equipped with an EBS-backed storage class that Domino will use to provision ephemeral volumes for user execution. Consult the following storage class specification as an example.
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: domino-compute-storage
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  fsType: ext4
```
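To check that the storage class provisions volumes before installing Domino, a throwaway claim along the following lines can be applied. This is only a sketch: the claim name and size are hypothetical, and Domino creates its own claims for executions, so the object should be deleted after the check.

```yaml
# Hypothetical smoke-test claim; not part of the Domino installation.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: storage-class-smoke-test   # assumed name
spec:
  accessModes:
    - ReadWriteOnce                # an EBS volume attaches to a single node
  storageClassName: domino-compute-storage
  resources:
    requests:
      storage: 1Gi
```

If dynamic provisioning is working, the claim is expected to reach the Bound status with a newly created EBS-backed persistent volume.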
An EFS file system must be provisioned and configured to allow access from the EKS cluster’s
ClusterSharedNodeSecurityGroup via TCP on port 2049. Configure a mount point for each availability zone that
Domino compute nodes will run in, and record these mount addresses for use when installing Domino.
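The access rule and per-zone mount points described above could be declared as in the CloudFormation sketch below. Every referenced name (EfsSecurityGroup, ClusterSharedNodeSecurityGroup as a parameter, DominoFileSystem, PrivateSubnetZoneA) is a hypothetical parameter or resource, and only one zone is shown.

```yaml
# CloudFormation fragment (sketch). All !Ref targets are hypothetical.
Resources:
  EfsIngressFromNodes:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref EfsSecurityGroup                       # group attached to the EFS mount targets
      SourceSecurityGroupId: !Ref ClusterSharedNodeSecurityGroup
      IpProtocol: tcp
      FromPort: 2049                                       # NFS port used by EFS
      ToPort: 2049
  EfsMountTargetZoneA:
    Type: AWS::EFS::MountTarget
    Properties:
      FileSystemId: !Ref DominoFileSystem
      SubnetId: !Ref PrivateSubnetZoneA                    # one mount target per zone with compute nodes
      SecurityGroups:
        - !Ref EfsSecurityGroup
```

A second and third AWS::EFS::MountTarget resource, each pointing at a subnet in another zone, would cover the remaining zones.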
When running in EKS, Domino can use Amazon S3 for durable object storage.
Create the following three S3 buckets:
- 1 bucket for user data
- 1 bucket for logs
- 1 bucket for backups
Configure each bucket to permit read and write access from the EKS cluster. This involves applying an IAM policy to the nodes in the cluster that grants the following actions on the target buckets:
Record the names of these buckets for use when installing Domino.
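As one possible way to create the three buckets, the CloudFormation sketch below declares them as separate resources. The resource names and bucket names are placeholders invented for this example; bucket names must be globally unique, so real names will differ. The IAM policy for the nodes is not shown.

```yaml
# CloudFormation fragment (sketch). Bucket and resource names are placeholders.
Resources:
  DominoUserDataBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: example-domino-user-data
  DominoLogsBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: example-domino-logs
  DominoBackupsBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: example-domino-backups
```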
Domino will need to be configured to serve from a specific FQDN. To serve Domino securely over HTTPS, you will also need an SSL certificate that covers the chosen name. Record the FQDN for use when installing Domino.
Sample cluster configuration
See below for a sample YAML configuration file you can use with eksctl, the official EKS command line tool, to create a Domino-compatible cluster.
Note that after creating a cluster with this configuration, you must still create the EFS and S3 storage systems and configure them for access from the cluster as described above.
```yaml
# $LOCAL_DIR/cluster.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: domino-test-cluster
  region: us-west-2

nodeGroups:
  - name: domino-platform
    instanceType: m5.4xlarge
    minSize: 3
    maxSize: 3
    desiredCapacity: 3
    volumeSize: 128
    availabilityZones: ["us-west-2a", "us-west-2b", "us-west-2c"]
    labels:
      "dominodatalab.com/node-pool": "platform"
  - name: domino-default
    instanceType: m5.2xlarge
    minSize: 0
    maxSize: 10
    desiredCapacity: 1
    volumeSize: 400
    availabilityZones: ["us-west-2a", "us-west-2b", "us-west-2c"]
    labels:
      "dominodatalab.com/node-pool": "default"
      "domino/build-node": "true"
    tags:
      "k8s.io/cluster-autoscaler/node-template/label/dominodatalab.com/node-pool": "default"
      "k8s.io/cluster-autoscaler/node-template/label/domino/build-node": "true"
    preBootstrapCommands:
      - "cp /etc/docker/daemon.json /etc/docker/daemon_backup.json"
      - "echo -e '.bridge=\"docker0\" | .\"live-restore\"=false' > /etc/docker/jq_script"
      - "jq -f /etc/docker/jq_script /etc/docker/daemon_backup.json | tee /etc/docker/daemon.json"
      - "systemctl restart docker"
  - name: domino-gpu
    instanceType: p2.8xlarge
    minSize: 0
    maxSize: 5
    volumeSize: 400
    availabilityZones: ["us-west-2a", "us-west-2b", "us-west-2c"]
    ami: ami-0ad9a8dc09680cfc2
    labels:
      "dominodatalab.com/node-pool": "default-gpu"
      "nvidia.com/gpu": "true"
    tags:
      "k8s.io/cluster-autoscaler/node-template/label/dominodatalab.com/node-pool": "default-gpu"

availabilityZones: ["us-west-2a", "us-west-2b", "us-west-2c"]
```