Starting with Domino 4.3.1, the Domino platform can run on the OpenShift Container Platform (OCP) and OpenShift Kubernetes Engine (OKE). Domino supports OCP/OKE versions 4.4 and later.
This section describes how to configure an OpenShift Kubernetes Engine cluster for use with Domino.
Namespaces
No namespace configuration is necessary prior to install. Domino will create three namespaces in the cluster during installation, according to the following specifications:
| Namespace | Contains |
|---|---|
| Platform | Durable Domino application, metadata, and platform services required for platform operation |
| Compute | Ephemeral Domino execution pods launched by user actions in the application |
| System | Domino installation metadata and secrets |
Node pools
The OpenShift cluster must have worker nodes with the following specifications and distinct node labels, and it can include an optional GPU pool:

| Pool | Min-Max | vCPU | Memory | Disk | Labels |
|---|---|---|---|---|---|
| platform | Min 3 | 8 | 32G | 128G | dominodatalab.com/node-pool: platform |
| default | 1-20 | 8 | 32G | 400G | dominodatalab.com/node-pool: default, domino/build-node: "true" |
| default-gpu (optional) | 0-5 | 8 | 32G | 400G | dominodatalab.com/node-pool: default-gpu |
More generally, the platform worker nodes need an aggregate minimum of 24 CPUs and 96G of memory. Spreading these resources across multiple nodes with proper failure isolation (for example, across availability zones) is recommended.
Managing nodes and node pools in OpenShift is done through Machine Management and the Machine API. For each node pool above, you must create a MachineSet. Provide the Domino-required labels in the Machine spec (the spec.template.spec.metadata.labels stanza), and update the provider spec (the spec.template.spec.providerSpec stanza) to match your infrastructure provider and sizing. For example, in AWS, updates might include, but are not limited to, the AMI ID, block device storage sizing, and availability zone placement.
The following is an example MachineSet for the platform node pool:
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: firestorm-dxcpd
  name: firestorm-dxcpd-platform-us-west-1a
  namespace: openshift-machine-api
spec:
  replicas: 3
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: firestorm-dxcpd
      machine.openshift.io/cluster-api-machineset: firestorm-dxcpd-platform-us-west-1a
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: firestorm-dxcpd
        machine.openshift.io/cluster-api-machine-role: platform
        machine.openshift.io/cluster-api-machine-type: platform
        machine.openshift.io/cluster-api-machineset: firestorm-dxcpd-platform-us-west-1a
    spec:
      metadata:
        labels:
          node-role.kubernetes.io/default: ""
          dominodatalab.com/node-pool: platform
      providerSpec:
        value:
          ami:
            id: ami-02b6556210798d665
          apiVersion: awsproviderconfig.openshift.io/v1beta1
          blockDevices:
            - ebs:
                iops: 0
                volumeSize: 120
                volumeType: gp2
          credentialsSecret:
            name: aws-cloud-credentials
          deviceIndex: 0
          iamInstanceProfile:
            id: firestorm-dxcpd-worker-profile
          instanceType: m5.2xlarge
          kind: AWSMachineProviderConfig
          metadata:
            creationTimestamp: null
          placement:
            availabilityZone: us-west-1a
            region: us-west-1
          publicIp: null
          securityGroups:
            - filters:
                - name: tag:Name
                  values:
                    - firestorm-dxcpd-worker-sg
          subnet:
            filters:
              - name: tag:Name
                values:
                  - firestorm-dxcpd-private-us-west-1a
          tags:
            - name: kubernetes.io/cluster/firestorm-dxcpd
              value: owned
          userDataSecret:
            name: worker-user-data
The following is an example MachineSet for the default (compute) node pool:
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: firestorm-dxcpd
  name: firestorm-dxcpd-default-us-west-1a
  namespace: openshift-machine-api
spec:
  replicas: 3
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: firestorm-dxcpd
      machine.openshift.io/cluster-api-machineset: firestorm-dxcpd-default-us-west-1a
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: firestorm-dxcpd
        machine.openshift.io/cluster-api-machine-role: default
        machine.openshift.io/cluster-api-machine-type: default
        machine.openshift.io/cluster-api-machineset: firestorm-dxcpd-default-us-west-1a
    spec:
      metadata:
        labels:
          node-role.kubernetes.io/default: ""
          dominodatalab.com/node-pool: default
          domino/build-node: "true"
      providerSpec:
        value:
          ami:
            id: ami-02b6556210798d665
          apiVersion: awsproviderconfig.openshift.io/v1beta1
          blockDevices:
            - ebs:
                iops: 0
                volumeSize: 400
                volumeType: gp2
          credentialsSecret:
            name: aws-cloud-credentials
          deviceIndex: 0
          iamInstanceProfile:
            id: firestorm-dxcpd-worker-profile
          instanceType: m5.2xlarge
          kind: AWSMachineProviderConfig
          metadata:
            creationTimestamp: null
          placement:
            availabilityZone: us-west-1a
            region: us-west-1
          publicIp: null
          securityGroups:
            - filters:
                - name: tag:Name
                  values:
                    - firestorm-dxcpd-worker-sg
          subnet:
            filters:
              - name: tag:Name
                values:
                  - firestorm-dxcpd-private-us-west-1a
          tags:
            - name: kubernetes.io/cluster/firestorm-dxcpd
              value: owned
          userDataSecret:
            name: worker-user-data
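If you include the optional GPU pool, create a third MachineSet in the same way. The following sketch shows only the fields that differ from the default pool example above; the MachineSet name, the default-gpu pool label, and the p3.2xlarge instance type are illustrative assumptions rather than values prescribed by this guide, and the omitted providerSpec fields mirror the default pool example.
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: firestorm-dxcpd
  name: firestorm-dxcpd-default-gpu-us-west-1a    # hypothetical name for the GPU pool
  namespace: openshift-machine-api
spec:
  replicas: 0                                     # the GPU pool can start empty (0-5 in the table above)
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: firestorm-dxcpd
      machine.openshift.io/cluster-api-machineset: firestorm-dxcpd-default-gpu-us-west-1a
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: firestorm-dxcpd
        machine.openshift.io/cluster-api-machine-role: default-gpu
        machine.openshift.io/cluster-api-machine-type: default-gpu
        machine.openshift.io/cluster-api-machineset: firestorm-dxcpd-default-gpu-us-west-1a
    spec:
      metadata:
        labels:
          dominodatalab.com/node-pool: default-gpu   # assumed label for the GPU pool
      providerSpec:
        value:
          # The remaining providerSpec fields mirror the default pool example above,
          # except for a GPU instance type and, if needed, a GPU-capable AMI.
          instanceType: p3.2xlarge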
Node Autoscaling
For clusters running on an elastic cloud provider, node autoscaling (machine autoscaling) is achieved by creating ClusterAutoscaler and MachineAutoscaler resources.
The following is an example ClusterAutoscaler:
apiVersion: "autoscaling.openshift.io/v1"
kind: "ClusterAutoscaler"
metadata:
name: "default"
spec:
podPriorityThreshold: -10
resourceLimits:
maxNodesTotal: 20
cores:
min: 8
max: 256
memory:
min: 4
max: 256
gpus:
- type: nvidia.com/gpu
min: 0
max: 16
- type: amd.com/gpu
min: 0
max: 4
scaleDown:
enabled: true
delayAfterAdd: 10m
delayAfterDelete: 5m
delayAfterFailure: 30s
unneededTime: 10m
The following is an example MachineAutoscaler for the MachineSet created for the default node pool:
apiVersion: "autoscaling.openshift.io/v1beta1"
kind: "MachineAutoscaler"
metadata:
name: "firestorm-dxcpd-default-us-west-1a"
namespace: "openshift-machine-api"
spec:
minReplicas: 1
maxReplicas: 5
scaleTargetRef:
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
name: firestorm-dxcpd-default-us-west-1a
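If you created a MachineSet for the optional GPU pool, a corresponding MachineAutoscaler can keep it within the 0-5 range shown earlier. The following is a sketch that reuses the hypothetical GPU MachineSet name from above; if your OpenShift version does not support scaling a MachineSet from zero, set minReplicas to 1.
apiVersion: "autoscaling.openshift.io/v1beta1"
kind: "MachineAutoscaler"
metadata:
  name: "firestorm-dxcpd-default-gpu-us-west-1a"   # hypothetical GPU MachineSet name
  namespace: "openshift-machine-api"
spec:
  minReplicas: 0     # matches the 0-5 range for the optional GPU pool, if scale-from-zero is supported
  maxReplicas: 5
  scaleTargetRef:
    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    name: firestorm-dxcpd-default-gpu-us-west-1a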
Domain
Domino must be configured to serve from a specific FQDN. To serve Domino securely over HTTPS, you will also need an SSL certificate that covers the chosen name.
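How the certificate is supplied is determined later by your Domino installer configuration, but as a generic illustration, a certificate and key covering the chosen FQDN can be stored as a standard Kubernetes TLS secret. The secret name and namespace below are assumptions, not values prescribed by this guide.
apiVersion: v1
kind: Secret
metadata:
  name: domino-tls            # hypothetical secret name
  namespace: domino-platform  # hypothetical namespace
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded certificate covering the Domino FQDN>
  tls.key: <base64-encoded private key>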
Network Plugin
Domino relies on Kubernetes network policies to manage secure communication between pods in the cluster. By default, OpenShift uses the Cluster Network Operator to deploy the OpenShift SDN default CNI network provider plugin, which supports network policies, so no additional network configuration is typically required.
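To confirm that the cluster's CNI actually enforces network policies, you can apply a simple policy in a scratch namespace and verify that traffic is blocked as expected. The following is a minimal sketch; the namespace name is an assumption.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress
  namespace: netpol-test        # hypothetical test namespace
spec:
  podSelector: {}               # applies to every pod in the namespace
  policyTypes:
    - Ingress                   # no ingress rules are listed, so all inbound traffic is denied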
Ingress
Domino uses the NGINX ingress controller maintained by the Kubernetes project rather than the HAProxy-based ingress controller that ships with OpenShift (the OpenShift router is left in place, not replaced), and deploys the NGINX ingress controller as a NodePort service.
By default, the ingress listens on node ports 443 (HTTPS) and 80 (HTTP).
Load Balancer
A load balancer must be set up for your DNS name to point to. For example, in AWS, you must set up DNS so that a CNAME record points at an Elastic Load Balancer.
After you complete the installation process, you must configure the load balancer to balance across the platform nodes at the ports specified by your ingress.
Container registry
Domino deploys its own container image registry instead of using the OpenShift built-in container image registry. During installation, the OpenShift cluster image configuration is modified to trust the Domino certificate authority (CA) so that OpenShift can run pods using Domino's custom-built images. In the images.config.openshift.io/cluster resource, you can find a reference to a ConfigMap that contains the Domino CA:
spec:
  additionalTrustedCA:
    name: domino-deployment-registry-config
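The referenced ConfigMap follows the standard OpenShift additionalTrustedCA format: it lives in the openshift-config namespace, and each key is a registry hostname (with ".." in place of ":" before a port) mapping to a PEM CA bundle. The registry hostname and port below are assumptions shown only to illustrate the shape of the data.
apiVersion: v1
kind: ConfigMap
metadata:
  name: domino-deployment-registry-config
  namespace: openshift-config   # additionalTrustedCA ConfigMaps are read from this namespace
data:
  # Hypothetical registry hostname and port; OpenShift expects ".." in place of ":".
  registry.example.com..5000: |
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----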