Domino Hardware Tiers define Kubernetes requests and limits and link them to specific node pools. Domino recommends the following best practices.
- Account for overhead
- Isolate workloads and users using node pools
- Set resource requests and limits to the same values
When defining hardware tiers, you might have to leave room for the overhead it takes to manage a node and its executions. Typically, node overhead is 1.5 cores and 2 GiB of RAM. Each execution adds about 1 core and 1 GiB of overhead.
Overhead is relevant if you want to define a hardware tier dedicated to one execution at a time per node, such as for a node with a single physical GPU. It is also relevant if you have to maximize node density. More commonly, for smaller hardware tiers, overhead isn’t much of a concern.
Where does overhead go?
- Host OS, Docker, and so on (1 core and 1.5 GiB of RAM)
- Domino-specific management pods for logging, caching, and so on (0.5 cores and 0.5 GiB of RAM)
- Execution sidecars (1 core and 1 GiB of RAM per execution)
For each node, there are Domino-specific pods for logging and cache management. For each workspace, for example, there are sidecar containers that manage authentication and request routing, ensure files are in the right place, and make sure dependencies get installed. Domino services and execution sidecars make CPU and memory requests that Kubernetes takes into account when scheduling execution pods.
If Domino is running on your own Kubernetes cluster, you might have additional overhead.
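The overhead figures above can be combined into a quick capacity estimate. The following is a minimal sketch in Python, not Domino tooling: the constant and function names are hypothetical, the numbers are the typical values quoted in this section, and the 16-core, 64-GiB node is only an illustration. Actual overhead varies by cluster, so treat the result as a planning estimate.

```python
# Typical overhead figures quoted in this section, as (cores, GiB of RAM).
# These are rough planning values, not guarantees; measure your own cluster.
HOST_OS = (1.0, 1.5)             # host OS, Docker, and so on
DOMINO_NODE_PODS = (0.5, 0.5)    # per-node logging and caching pods
EXECUTION_SIDECARS = (1.0, 1.0)  # added once per execution


def node_overhead():
    """Return the fixed per-node overhead as (cores, GiB)."""
    return (HOST_OS[0] + DOMINO_NODE_PODS[0], HOST_OS[1] + DOMINO_NODE_PODS[1])


def capacity_left(node_cores, node_gib, executions=1):
    """Return (cores, GiB) left for hardware tier requests.

    Subtracts the per-node overhead once and the sidecar overhead once
    per execution. The result is the total shared by all executions.
    """
    base_cores, base_gib = node_overhead()
    cores = node_cores - base_cores - executions * EXECUTION_SIDECARS[0]
    gib = node_gib - base_gib - executions * EXECUTION_SIDECARS[1]
    return (cores, gib)


print(node_overhead())           # (1.5, 2.0)
print(capacity_left(16, 64, 1))  # (13.5, 61.0)
```

Keeping the three overhead sources as separate constants makes it easy to adjust one of them, for example when your own Kubernetes cluster adds extra per-node services.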
Examples
An 8-core, 32-GiB node can accept:
- A single execution using a hardware tier requesting 6.5 cores and 29 GiB of RAM, or
- Three executions using a hardware tier requesting 2 cores and 8 GiB of RAM each.
You can optimize the second hardware tier further to squeeze four simultaneous executions onto a node, but simpler might be better. If your users also use smaller hardware tiers, Kubernetes will do the "Tetris" required to soak up excess capacity on a node. For example, a single node might run three executions on a hardware tier requesting 2 cores and 8 GiB of RAM each, plus two executions on a hardware tier requesting 0.5 cores and 1 GiB of RAM each.
To see the pods running on a node:
- In the Admin application, click Infrastructure.
- Click the name of a node. In the following image, there is a box around the execution pods. The other pods handle logging, caching, and other services.
To assign a hardware tier to a node pool:
- Go to Advanced > Hardware Tiers.
- Create or edit a hardware tier.
- In the Node Pool field, enter your-node-pool, which must match the node pool label on the nodes, such as dominodatalab.com/node-pool=<your-node-pool>. You can name a node pool anything you like, but Domino recommends a name that reflects its intended use.
Domino typically comes pre-configured with default and default-gpu node pools, with the assumption that most user executions will run on nodes in one of those pools.
As your compute requirements become more sophisticated, you might want to keep certain users separate from one another or provide specialized hardware to certain groups of users.
For example, if a data science team in New York City needs a specific GPU machine that other teams don’t need, you can apply the following label to the appropriate nodes: dominodatalab.com/node-pool=nyc-ds-gpu. In the hardware tier form, you would specify nyc-ds-gpu.
To ensure that only that team has access to those machines, create a NYC organization, add the correct users to the organization, and give that organization access to the new hardware tier that uses the nyc-ds-gpu node pool label.
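The label matching described above can be sketched as follows. This is an illustrative Python sketch, not Domino's scheduler: the node names and the nodes_in_pool helper are hypothetical, and only the label key dominodatalab.com/node-pool and the pool names default, default-gpu, and nyc-ds-gpu come from this section.

```python
NODE_POOL_LABEL = "dominodatalab.com/node-pool"

# Hypothetical nodes mapped to their Kubernetes labels.
NODES = {
    "node-a": {NODE_POOL_LABEL: "default"},
    "node-b": {NODE_POOL_LABEL: "default-gpu"},
    "node-c": {NODE_POOL_LABEL: "nyc-ds-gpu"},
    "node-d": {},  # unlabeled node: matches no pool in this sketch
}


def nodes_in_pool(nodes, pool_name):
    """Return the names of nodes whose node-pool label equals pool_name exactly."""
    return sorted(
        name for name, labels in nodes.items()
        if labels.get(NODE_POOL_LABEL) == pool_name
    )


print(nodes_in_pool(NODES, "nyc-ds-gpu"))  # ['node-c']
print(nodes_in_pool(NODES, "nyc-gpu"))     # []
```

The second call shows why the value in the Node Pool field must match the label exactly: a misspelled pool name selects no nodes.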
With Kubernetes, resource limits must be greater than or equal to resource requests. So, if your memory request is 16 GiB, your limit must be at least 16 GiB. Setting a limit higher than the request can be useful, for example to allow bursts of CPU or memory, but it is also dangerous: Kubernetes might evict a pod that uses more resources than it initially requested. For Domino workspaces or jobs, eviction terminates the execution.
For this reason, Domino recommends setting memory and CPU requests equal to limits. In this case, Python and R cannot allocate more memory than the limit, and execution pods will not be evicted.
On the other hand, if the limit is higher than the request, one user's execution can consume resources that another user’s execution pod needs. This is the noisy neighbor problem that you might have experienced in other multi-user environments. But, instead of allowing the noisy neighbor to degrade performance for other pods on the node, Kubernetes will evict offending pods when necessary to free up resources.
User data on disk will not be lost, because Domino stores user data on a persistent volume that can be reused. But, anything in memory will be lost and the execution will have to be restarted.
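As an illustration of this recommendation, the sketch below builds a Kubernetes-style resources block with requests equal to limits and classifies it the way Kubernetes assigns quality-of-service classes, simplified to a single container. The helper names are hypothetical and this is not how Domino generates pod specs. Kubernetes assigns the Guaranteed class only when CPU and memory requests equal their limits, and Guaranteed pods are the last candidates for eviction when a node runs short of resources.

```python
def tier_resources(cores, gib, limit_cores=None, limit_gib=None):
    """Build a Kubernetes-style resources block; limits default to the requests."""
    limit_cores = cores if limit_cores is None else limit_cores
    limit_gib = gib if limit_gib is None else limit_gib
    if limit_cores < cores or limit_gib < gib:
        raise ValueError("limits must be greater than or equal to requests")
    return {
        "requests": {"cpu": str(cores), "memory": f"{gib}Gi"},
        "limits": {"cpu": str(limit_cores), "memory": f"{limit_gib}Gi"},
    }


def qos_class(resources):
    """Simplified single-container version of Kubernetes QoS classification.

    Assumes requests and limits are both spelled out and use the same units,
    because quantities are compared as strings here.
    """
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    if not requests and not limits:
        return "BestEffort"
    has_both = all(key in requests and key in limits for key in ("cpu", "memory"))
    if has_both and requests == limits:
        return "Guaranteed"
    return "Burstable"


print(qos_class(tier_resources(4, 16)))                # Guaranteed
print(qos_class(tier_resources(4, 16, limit_gib=32)))  # Burstable
```

Raising only the memory limit is enough to move the pod from Guaranteed to Burstable, which is the situation where an execution that bursts past its request can be evicted.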