Hardware Tier best practices
Domino Hardware Tiers define Kubernetes requests and limits and link them to specific node pools. We recommend the following best practices.
- Accounting for overhead
- Isolating workloads and users using node pools
- Setting resource requests and limits to the same values
When designing hardware tiers, you need to take into account what resources will be available on a given node when Domino submits your workload for execution. Not all physical memory and CPU cores of your node will be available due to system overhead.
You should consider the following overhead components:
- Kubernetes management overhead
- Domino daemon-set overhead
- Domino execution sidecar overhead
Kubernetes typically reserves a portion of each node’s capacity for the daemons and pods that Kubernetes itself requires. The amount of reserved resources usually scales with the size of the node, and also depends on the Kubernetes provider or distribution.
Click the links below to view information on reserved resources for cloud-provider managed Kubernetes offerings:
The best way to understand the available resources for your instance is to check one of your compute nodes with the kubectl describe nodes command and then look for the Allocatable section of the output. It will show the memory and CPU available for Domino.
Domino runs a set of management pods that reside on each of the compute nodes. These are used for things like log aggregation, monitoring, and environment image caching.
The overhead of these daemon-sets is roughly 0.5 CPU cores and 0.5 GiB RAM. This overhead is taken from the allocatable resources on the node.
Lastly, for each Domino execution, there is a set of supporting containers in the execution pod that manage authentication, handle request routing, load files, and install dependencies. These supporting containers make CPU and memory requests that Kubernetes takes into account when scheduling execution pods.
The supporting container overhead is currently roughly 1 CPU core and 1.5 GiB RAM. This is configurable and may vary for your specific deployment.
Overhead is relevant if you want to define a hardware tier dedicated to one execution at a time per node, such as for a node with a single physical GPU. It is also relevant if you absolutely need to maximize node density.
As an example, consider an m5.2xlarge EC2 node with a raw capacity of 8 CPU cores and 32 GiB of RAM.
When used as part of an EKS cluster, the node reports an allocatable capacity of ~27 GiB of RAM and 7910m CPU:
Capacity:
  attachable-volumes-aws-ebs: 25
  cpu: 8
  ephemeral-storage: 104845292Ki
  hugepages-1Gi: 0
  hugepages-2Mi: 0
  memory: 32120476Ki
  pods: 58
Allocatable:
  attachable-volumes-aws-ebs: 25
  cpu: 7910m
  ephemeral-storage: 95551679124
  hugepages-1Gi: 0
  hugepages-2Mi: 0
  memory: 28372636Ki
  pods: 58
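Kubernetes reports these quantities with suffixes such as m (millicores) and Ki (kibibytes), which makes them awkward to compare with tier sizes expressed in cores and GiB. The following is a minimal sketch of a converter, assuming you have copied the values out of the kubectl output by hand. The function names are illustrative and not part of Domino or kubectl, and only the most common suffixes are handled.

```python
from decimal import Decimal

# Common Kubernetes quantity suffixes. Two-letter binary suffixes are listed
# first so that "Mi" is never mistaken for "M".
_MEMORY_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '8' or '7910m' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)


def parse_memory_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '28372636Ki' to bytes."""
    for suffix, factor in _MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(quantity)  # no suffix: a plain byte count


GIB = 2**30
print(parse_cpu_millicores("7910m"))                      # 7910
print(round(parse_memory_bytes("28372636Ki") / GIB, 2))   # 27.06
```

Applied to the Allocatable values in the example, this gives 7910 millicores and about 27.06 GiB of memory.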
From that allocatable capacity, conservatively subtract 500m CPU and 0.5 GiB of RAM for the Domino and EKS daemons.
Lastly, for a single execution, subtract another 1000m CPU and 1.5 GiB RAM for sidecars. You are left with roughly 6400m CPU (7910m - 500m - 1000m) and 25 GiB RAM that you can use for a single large hardware tier.
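The subtraction for this example can be sketched as follows. The overhead figures are the approximate values quoted in this section and may differ in your deployment; the Resources class and the function name are illustrative, not a Domino API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpu_millicores: int
    memory_gib: float

    def minus(self, other: "Resources") -> "Resources":
        """Return what remains after subtracting another resource amount."""
        return Resources(
            self.cpu_millicores - other.cpu_millicores,
            self.memory_gib - other.memory_gib,
        )


# Allocatable values from the example node (28372636Ki converted to GiB).
ALLOCATABLE = Resources(7910, 28372636 / 1024**2)
# Approximate overheads quoted in this section; treat them as assumptions.
DAEMON_OVERHEAD = Resources(500, 0.5)
SIDECAR_OVERHEAD = Resources(1000, 1.5)


def single_execution_budget(
    allocatable: Resources, daemons: Resources, sidecars: Resources
) -> Resources:
    """Resources left for one execution that has the node to itself."""
    return allocatable.minus(daemons).minus(sidecars)


budget = single_execution_budget(ALLOCATABLE, DAEMON_OVERHEAD, SIDECAR_OVERHEAD)
print(budget.cpu_millicores, round(budget.memory_gib, 1))  # 6410 25.1
```

With the stated figures the remainder is 6410m CPU and about 25.1 GiB of RAM, so a hardware tier sized slightly below that leaves a small safety margin.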
If you want to partition the node into smaller hardware tiers, you will need to account for the sidecar overhead for every execution that you want to colocate.
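To illustrate how the per-execution sidecar overhead limits colocation, here is a rough sketch that estimates how many executions of a given tier fit on one node. It assumes the daemon overhead is paid once per node and the sidecar overhead once per execution, using the approximate defaults from this section. The function name and the tier sizes in the usage lines are hypothetical.

```python
def executions_per_node(
    alloc_cpu_m: int,
    alloc_mem_gib: float,
    tier_cpu_m: int,
    tier_mem_gib: float,
    daemon_cpu_m: int = 500,
    daemon_mem_gib: float = 0.5,
    sidecar_cpu_m: int = 1000,
    sidecar_mem_gib: float = 1.5,
) -> int:
    """Estimate how many executions of one tier fit on a single node."""
    # Daemon-set overhead is charged once per node.
    usable_cpu = alloc_cpu_m - daemon_cpu_m
    usable_mem = alloc_mem_gib - daemon_mem_gib
    # Every execution carries its own sidecar overhead on top of the tier.
    per_exec_cpu = tier_cpu_m + sidecar_cpu_m
    per_exec_mem = tier_mem_gib + sidecar_mem_gib
    # The scarcer of CPU and memory decides how many executions fit.
    fits = min(usable_cpu // per_exec_cpu, usable_mem // per_exec_mem)
    return max(0, int(fits))


# Hypothetical tier of 1000m CPU and 4 GiB RAM on the example node.
print(executions_per_node(7910, 27.0, 1000, 4.0))  # 3
```

In this hypothetical case CPU is the limiting resource: each execution needs 2000m including sidecars, so only three fit even though memory would allow four.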
As a general rule, larger nodes allow for more flexibility as Kubernetes will take care of efficiently packing your executions onto the available capacity.
You can see which pods are running on a specific node by visiting the Infrastructure admin page and clicking on the name of the node. In the image below, there is a box around the execution pods. The other pods handle logging, caching, and other services.
Node pools are defined by labels added to nodes in a specific format: dominodatalab.com/node-pool=your-node-pool. In the hardware tier form, you just need to include your-node-pool. You can name a node pool anything you like, but we recommend a name that reflects its intended use.
Domino typically comes pre-configured with default-gpu node pools, with the assumption that most user executions will run on nodes in one of those pools. As your compute needs become more sophisticated, you may want to keep certain users separate from one another or provide specialized hardware to certain groups of users.
For example, if there’s a data science team in New York City that needs a specific GPU machine that other teams don’t need, you could use the following label for the appropriate nodes: dominodatalab.com/node-pool=nyc-ds-gpu. In the hardware tier form, you would specify nyc-ds-gpu. To ensure that only that team has access to those machines, create an organization, add the correct users to the organization, and give that organization access to the new hardware tier that uses the nyc-ds-gpu node pool label.
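As a small illustration of the label convention, the sketch below builds the full label for a pool name and picks out the nodes that carry it. The label key is taken from the example above; the node names and the helper functions are hypothetical and only model the matching, they do not talk to a cluster.

```python
NODE_POOL_LABEL_KEY = "dominodatalab.com/node-pool"


def node_pool_label(pool_name: str) -> str:
    """Build the full node label for the pool name used in the tier form."""
    return f"{NODE_POOL_LABEL_KEY}={pool_name}"


def nodes_in_pool(node_labels: dict[str, dict[str, str]], pool_name: str) -> list[str]:
    """Return the names of the nodes whose node-pool label matches pool_name."""
    return sorted(
        name
        for name, labels in node_labels.items()
        if labels.get(NODE_POOL_LABEL_KEY) == pool_name
    )


# Hypothetical nodes and labels, for illustration only.
cluster = {
    "node-a": {NODE_POOL_LABEL_KEY: "default"},
    "node-b": {NODE_POOL_LABEL_KEY: "nyc-ds-gpu"},
    "node-c": {NODE_POOL_LABEL_KEY: "nyc-ds-gpu"},
}
print(node_pool_label("nyc-ds-gpu"))
print(nodes_in_pool(cluster, "nyc-ds-gpu"))
```

Only the pool name (nyc-ds-gpu) goes into the hardware tier form; the full key=value string is what the nodes themselves are labeled with.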
With Kubernetes, resource limits must be >= resource requests. So if your memory request is 1000 GiB, your limit must be >= 1000 GiB. Setting a limit higher than the request can be useful, because it allows bursts of CPU or memory, but it is also dangerous: Kubernetes may evict a pod that uses more resources than it initially requested. For Domino workspaces or jobs, this would cause the execution to be terminated.
It is for this reason that we recommend setting memory and CPU requests equal to limits. In this case, Python and R cannot allocate more memory than the limit, and execution pods will not be evicted.
On the other hand, if the limit is higher than the request, it is possible for a user to use resources that another user’s execution pod should be able to access. This is the “noisy neighbor” problem that you may have experienced in other multi-user environments. But instead of allowing the noisy neighbor to degrade performance for other pods on the node, Kubernetes will evict the offending pod when necessary to free up resources.
User data on disk will not be lost, because Domino stores user data on a persistent volume that can be reused. But anything in memory will be lost and the execution will have to be restarted.
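The recommendation can be turned into a simple review check for hardware tier definitions. The sketch below is an assumption-laden illustration: the function name, the resource keys, and the finding strings are invented here and are not Domino or Kubernetes terminology.

```python
def review_tier(requests: dict[str, float], limits: dict[str, float]) -> dict[str, str]:
    """Compare each requested resource with its limit and describe the result."""
    findings = {}
    for resource, requested in requests.items():
        limit = limits.get(resource)
        if limit is None:
            findings[resource] = "no limit set"
        elif limit < requested:
            # Kubernetes requires limit >= request.
            findings[resource] = "invalid: limit below request"
        elif limit > requested:
            # Bursting is allowed, at the risk of eviction under node pressure.
            findings[resource] = "limit above request: eviction risk"
        else:
            findings[resource] = "ok: request equals limit"
    return findings


# Hypothetical tier with equal CPU values but a higher memory limit.
print(review_tier({"cpu": 4.0, "memory_gib": 16.0}, {"cpu": 4.0, "memory_gib": 24.0}))
```

A tier that follows the recommendation reports "ok: request equals limit" for both CPU and memory.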