Domino can run on a Kubernetes cluster provided by the Azure Kubernetes Service (AKS). When running on AKS, the Domino architecture uses Azure resources to fulfill the Domino cluster requirements as follows:

- Kubernetes control is handled by the AKS control plane with managed Kubernetes masters.
- The AKS cluster's default node pool is configured to host the Domino platform.
- Additional AKS node pools provide compute nodes for user workloads.
- When Domino is deployed in AKS, it is compatible with the containerd runtime, which is the AKS default runtime for Kubernetes 1.19 and above.
- When using the containerd runtime, Domino images are stored in Azure Container Registry.
- An Azure storage account stores Domino blob data and datasets.
- The kubernetes.io/azure-disk provisioner is used to create persistent volumes for Domino executions.
- The Advanced Azure CNI is used for cluster networking, with network policy enforcement handled by Calico.
- Ingress to the Domino application is handled by an SSL-terminating Application Gateway that points to a Kubernetes load balancer.

For a complete Terraform module for Domino-compatible AKS provisioning, see terraform-azure-aks on GitHub.

Domino recommends provisioning with Terraform for extended control and customizability of all resources. When setting up your Azure Terraform provider, add a partner_id with a value of 31912fbf-f6dd-5176-bffb-0a01e8ac71f2 to enable usage attribution.
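As a sketch, the attribution ID can be set directly on the azurerm provider block; everything here other than the partner_id value is illustrative:

```hcl
# Illustrative provider configuration; the partner_id value is the
# Domino usage-attribution ID given above.
provider "azurerm" {
  features {}

  partner_id = "31912fbf-f6dd-5176-bffb-0a01e8ac71f2"
}
```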
This section describes how to configure an AKS cluster for use with Domino.
Resource groups
You can provision the cluster, storage, and application gateway in an existing resource group. When creating the cluster, Azure will create a separate resource group that will contain the cluster components themselves.
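For example, the resource group that will hold these resources can itself be managed in Terraform; the name and location below are illustrative placeholders:

```hcl
# Illustrative: a resource group to hold the cluster, storage account,
# and application gateway. An existing group can be referenced instead
# with a data source.
resource "azurerm_resource_group" "domino" {
  name     = "example_resource_group"
  location = "East US"
}
```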
Namespaces
No namespace configuration is necessary prior to install. Domino will create three namespaces in the cluster during installation, according to the following specifications:
| Namespace | Contains |
|---|---|
| domino-platform | Durable Domino application, metadata, and platform services required for platform operation |
| domino-compute | Ephemeral Domino execution pods launched by user actions in the application |
| domino-system | Domino installation metadata and secrets |
Node pools
The AKS cluster must have at least two node pools that produce worker nodes with the following specifications and distinct node labels, and it might include an optional GPU pool:
| Pool | Min-Max | VM | Disk | Labels |
|---|---|---|---|---|
| platform | 1-4 | Standard_DS5_v2 | 128G | dominodatalab.com/node-pool: platform |
| default | 1-20 | Standard_DS4_v2 | 128G | dominodatalab.com/node-pool: default, dominodatalab.com/build-node: true |
| default-gpu (optional) | 0-5 | Standard_NC6 | 128G | dominodatalab.com/node-pool: default-gpu |
The recommended architecture configures the cluster’s initial default node pool with the correct label and size to serve as the platform node pool. See the below cluster Terraform resource for a complete example.
resource "azurerm_kubernetes_cluster" "aks" {
  name                       = "example_cluster"
  enable_pod_security_policy = false
  location                   = "East US"
  resource_group_name        = "example_resource_group"
  dns_prefix                 = "example_cluster"
  private_cluster_enabled    = false

  # Cluster credentials; a service_principal block can be used instead.
  identity {
    type = "SystemAssigned"
  }

  default_node_pool {
    enable_node_public_ip = false
    name                  = "platform"
    node_count            = 4
    node_labels           = { "dominodatalab.com/node-pool" : "platform" }
    vm_size               = "Standard_DS5_v2"
    availability_zones    = ["1", "2", "3"]
    max_pods              = 250
    os_disk_size_gb       = 128
    node_taints           = []
    enable_auto_scaling   = true
    min_count             = 1
    max_count             = 4
  }

  network_profile {
    load_balancer_sku  = "Standard"
    network_plugin     = "azure"
    network_policy     = "calico"
    dns_service_ip     = "100.97.0.10"
    docker_bridge_cidr = "172.17.0.1/16"
    service_cidr       = "100.97.0.0/16"
  }
}
A separate node pool for Domino default compute must be added after the cluster is created. This is not the cluster's initial default node pool, but a separate node pool named default that is added to serve default Domino compute. See the following node pool Terraform resource for a complete example.
resource "azurerm_kubernetes_cluster_node_pool" "aks" {
  enable_node_public_ip = false
  kubernetes_cluster_id = "example_cluster_id"
  name                  = "default"
  node_count            = 1
  vm_size               = "Standard_DS4_v2"
  availability_zones    = ["1", "2", "3"]
  max_pods              = 250
  os_disk_size_gb       = 128
  os_type               = "Linux"

  node_labels = {
    "domino/build-node"            = "true"
    "dominodatalab.com/build-node" = "true"
    "dominodatalab.com/node-pool"  = "default"
  }

  node_taints         = []
  enable_auto_scaling = true
  min_count           = 1
  max_count           = 20
}
Additional node pools can be added with distinct dominodatalab.com/node-pool labels to make other instance types available for Domino executions. Read Managing the Domino compute grid to learn how these different node types are referenced by label from the Domino application. When adding GPU node pools, keep in mind the Azure guidance and best practices on using GPU nodes in AKS.
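A GPU pool can be declared the same way as the default compute pool above. This sketch assumes the default-gpu label value and an NVIDIA taint, both of which should be adjusted to match your deployment; the cluster ID is a placeholder:

```hcl
# Illustrative GPU node pool for Domino executions. The node-pool label
# value and the taint below are assumptions, not installer requirements.
resource "azurerm_kubernetes_cluster_node_pool" "gpu" {
  enable_node_public_ip = false
  kubernetes_cluster_id = "example_cluster_id"
  name                  = "defaultgpu"
  vm_size               = "Standard_NC6"
  availability_zones    = ["1", "2", "3"]
  max_pods              = 250
  os_disk_size_gb       = 128
  os_type               = "Linux"
  node_labels           = { "dominodatalab.com/node-pool" = "default-gpu" }
  node_taints           = ["nvidia.com/gpu=true:NoSchedule"]
  enable_auto_scaling   = true
  min_count             = 0
  max_count             = 5
}
```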
Network plugin
The Domino-hosting cluster must use the Advanced Azure CNI with
network policy enforcement by Calico. See the below network_profile
configuration example.
network_profile {
  load_balancer_sku  = "Standard"
  network_plugin     = "azure"
  network_policy     = "calico"
  dns_service_ip     = "100.97.0.10"
  docker_bridge_cidr = "172.17.0.1/16"
  service_cidr       = "100.97.0.0/16"
}
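With Calico as the policy engine, standard Kubernetes NetworkPolicy resources are enforced on the cluster. The following default-deny ingress policy is an illustration only; Domino creates the policies it needs during installation when enable_network_policies is set:

```yaml
# Illustrative only: a default-deny ingress policy that Calico would enforce.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: example-namespace
spec:
  podSelector: {}
  policyTypes:
    - Ingress
```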
Dynamic block storage
AKS clusters come equipped with several kubernetes.io/azure-disk backed storage classes by default. Domino requires the use of premium disks for adequate input and output performance. The managed-premium class that is created by default can be used. Consult the following storage class specification as an example.
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  labels:
    kubernetes.io/cluster-service: "true"
  name: managed-premium
parameters:
  cachingmode: ReadOnly
  kind: Managed
  storageaccounttype: Premium_LRS
provisioner: kubernetes.io/azure-disk
reclaimPolicy: Delete
volumeBindingMode: Immediate
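To check that the class provisions volumes as expected, a PersistentVolumeClaim can reference it by name. This claim is illustrative and not part of the Domino installation:

```yaml
# Illustrative claim against the managed-premium class; the claim name
# and size are placeholders.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-premium-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: managed-premium
  resources:
    requests:
      storage: 10Gi
```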
Persistent blob and data storage
Domino uses one Azure storage account for both blob data and files. See the below configuration for the two resources required, the storage account itself and a blob container inside the account.
resource "azurerm_storage_account" "domino" {
  # Note: Azure storage account names must be 3-24 lowercase letters
  # and numbers; replace this placeholder with a valid name.
  name                     = "example_storage_account"
  resource_group_name      = "example_resource_group"
  location                 = "East US"
  account_kind             = "StorageV2"
  account_tier             = "Standard"
  account_replication_type = "LRS"
  access_tier              = "Hot"
}

resource "azurerm_storage_container" "domino_registry" {
  name                  = "docker"
  storage_account_name  = "example_storage_account"
  container_access_type = "private"
}
Record the names of these resources for use when installing Domino.
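One convenient way to capture these values, assuming the azurerm_storage_account.domino resource above, is a pair of Terraform outputs; the installer's Azure blob settings also need an account access key:

```hcl
# Illustrative outputs for the installer configuration.
output "storage_account_name" {
  value = azurerm_storage_account.domino.name
}

output "storage_account_key" {
  value     = azurerm_storage_account.domino.primary_access_key
  sensitive = true
}
```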
See below for an example configuration file for the Domino installer based on the previous provisioning examples.
schema: '1.0'
name: domino-deployment
version: 4.1.9
hostname: domino.example.org
pod_cidr: '100.97.0.0/16'
ssl_enabled: true
ssl_redirect: true
request_resources: true
enable_network_policies: true
enable_pod_security_policies: true
create_restricted_pod_security_policy: true
namespaces:
  platform:
    name: domino-platform
    annotations: {}
    labels:
      domino-platform: 'true'
  compute:
    name: domino-compute
    annotations: {}
    labels: {}
  system:
    name: domino-system
    annotations: {}
    labels: {}
ingress_controller:
  create: true
  gke_cluster_uuid: ''
storage_classes:
  block:
    create: false
    name: managed-premium
    type: azure-disk
    access_modes:
      - ReadWriteOnce
    base_path: ''
    default: false
  shared:
    create: true
    name: dominoshared
    type: azure-file
    access_modes:
      - ReadWriteMany
    efs:
      region: ''
      filesystem_id: ''
    nfs:
      server: ''
      mount_path: ''
      mount_options: []
    azure_file:
      storage_account: ''
blob_storage:
  projects:
    type: shared
    s3:
      region: ''
      bucket: ''
      sse_kms_key_id: ''
    azure:
      account_name: ''
      account_key: ''
      container: ''
    gcs:
      bucket: ''
      service_account_name: ''
      project_name: ''
  logs:
    type: shared
    s3:
      region: ''
      bucket: ''
      sse_kms_key_id: ''
    azure:
      account_name: ''
      account_key: ''
      container: ''
    gcs:
      bucket: ''
      service_account_name: ''
      project_name: ''
  backups:
    type: shared
    s3:
      region: ''
      bucket: ''
      sse_kms_key_id: ''
    azure:
      account_name: ''
      account_key: ''
      container: ''
    gcs:
      bucket: ''
      service_account_name: ''
      project_name: ''
  default:
    type: shared
    s3:
      region: ''
      bucket: ''
      sse_kms_key_id: ''
    azure:
      account_name: ''
      account_key: ''
      container: ''
    gcs:
      bucket: ''
      service_account_name: ''
      project_name: ''
    enabled: true
autoscaler:
  enabled: false
  cloud_provider: azure
  groups:
    - name: ''
      min_size: 0
      max_size: 0
  aws:
    region: ''
  azure:
    resource_group: ''
    subscription_id: ''
spotinst_controller:
  enabled: false
  token: ''
  account: ''
external_dns:
  enabled: false
  provider: aws
  domain_filters: []
  zone_id_filters: []
git:
  storage_class: managed-premium
email_notifications:
  enabled: false
  server: smtp.customer.org
  port: 465
  encryption: ssl
  from_address: domino@customer.org
  authentication:
    username: ''
    password: ''
monitoring:
  prometheus_metrics: true
  newrelic:
    apm: false
    infrastructure: false
    license_key: ''
helm:
  tiller_image: gcr.io/kubernetes-helm/tiller
  appr_registry: quay.io
  appr_insecure: false
  appr_username: '$QUAY_USERNAME'
  appr_password: '$QUAY_PASSWORD'
private_docker_registry:
  server: quay.io
  username: '$QUAY_USERNAME'
  password: '$QUAY_PASSWORD'
internal_docker_registry:
  s3_override:
    region: ''
    bucket: ''
    sse_kms_key_id: ''
  gcs_override:
    bucket: ''
    service_account_name: ''
    project_name: ''
  azure_blobs_override:
    account_name: 'example_storage_account'
    account_key: 'example_storage_account_key'
    container: 'docker'
telemetry:
  intercom:
    enabled: false
  mixpanel:
    enabled: false
    token: ''
gpu:
  enabled: false
fleetcommand:
  enabled: false
  api_token: ''
teleport_kube_agent:
  enabled: false
  proxyAddr: teleport-domino.example.org:443
  authToken: TOKEN