The following advanced hardware tier settings provide you with more control.
You can dynamically orchestrate on-demand compute clusters on top of Domino-managed infrastructure. These workloads often have unique resource requirements, so you might want to create dedicated hardware tiers that are available only for compute cluster workloads. In this case, use the Restrict to compute cluster setting to specify the compute cluster types, such as Spark or Ray, that can exclusively use a given hardware tier.
When configuring resources for a cluster, the options will be limited to hardware tiers where the Restrict to compute cluster setting is selected for the cluster type. Hardware tiers configured this way will not be available when selecting resources outside of cluster configuration.
Note: For a given cluster type, hardware tier choices are filtered only after at least one restricted hardware tier is configured.
If you want to limit the deployment-wide capacity for a hardware tier but do not want to create a dedicated node pool for it, you can specify the Maximum Simultaneous Executions.
This ensures that no more than the specified number of executions can use the selected hardware tier at the same time. Additional executions beyond the limit will be queued.
On cloud deployments enabled for autoscaling, new nodes are provisioned when a capacity request is made. This option minimizes cost, but provisioning a new node can take several minutes, causing users to wait. The situation can be particularly time-consuming in the mornings, when many users first log on to a system that has scaled down overnight.
You can address this problem by overprovisioning several warm slots for popular hardware tiers. Domino will automatically pre-provision nodes that might be necessary to accommodate the specified number of overprovisioned executions using this hardware tier. This minimizes the chance that a user must wait for a new node to spin up.
To keep costs under control, you can apply overprovisioning on a scheduled basis for periods when many new users are expected.
To do this:
- Set the number of Overprovisioning Pods.
- Select the Enable Overprovisioning Pods On a Schedule checkbox.
- Specify the schedule as needed.
By default, Domino requests GPU resources of type nvidia.com/gpu. This works well for most NVIDIA GPU-enabled devices, but when your deployment is backed by different GPU devices (for example, NVIDIA MIG GPUs, AMD GPUs, AWS vGPUs, or Xilinx FPGAs), you must use a different name for the GPU resources.
Select Use custom GPU resource name and enter the GPU resource name that corresponds to the name of the GPU devices discovered and reported by Kubernetes. For example, with an NVIDIA A100 GPU configured in MIG Mixed Mode, you can use resources such as nvidia.com/mig-1g.5gb, nvidia.com/mig-2g.10gb, or nvidia.com/mig-3g.20gb.
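When a custom GPU resource name is used, the execution's container requests that resource instead of nvidia.com/gpu. The following is only an illustrative sketch of the resulting Kubernetes resource stanza (the surrounding container spec is managed by Domino), assuming one nvidia.com/mig-1g.5gb slice is requested:
resources:
  requests:
    nvidia.com/mig-1g.5gb: 1   # the custom resource name replaces nvidia.com/gpu
  limits:
    nvidia.com/mig-1g.5gb: 1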
You can allow hardware tiers to exceed the default shared memory limit of 64 MB. This is especially beneficial for applications that make use of shared memory.
To enable this, select the Allow executions to exceed the default shared memory limit checkbox.
This overrides the /dev/shm (shared memory) limit, and any shared memory consumption will count toward the overall memory limit of the hardware tier. Incorporate the size of /dev/shm in any memory usage calculations for a hardware tier with this option enabled.
If this option is enabled, you must enter a shared memory limit. For example, if the hardware tier has an overall memory limit of 4 GB and you want to limit shared memory to 2 GB, enter "2.0" for the shared memory limit.
Setting a shared memory limit greater than the overall memory limit will only allow usage of shared memory up to the overall limit. For example, if you set a shared memory limit of 6 GB when the hardware tier has an overall memory limit of 4 GB, an execution on the hardware tier can still only use up to 4 GB of shared memory.
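For context, the standard Kubernetes pattern behind a larger /dev/shm is a memory-backed emptyDir volume mounted at /dev/shm. The following is only a sketch of that general pattern, not Domino's internal implementation; the volume name and exact structure are assumptions, shown for a 2 GB shared memory limit:
containers:
  volumeMounts:
    - mountPath: /dev/shm     # replaces the default 64 MB /dev/shm mount
      name: dshm              # example volume name
volumes:
  - name: dshm
    emptyDir:
      medium: Memory          # RAM-backed tmpfs; usage counts toward the memory limit
      sizeLimit: 2Gi          # corresponds to a "2.0" shared memory limit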
When creating or editing a hardware tier in the Admin UI, there is a small Advanced link near the bottom of the form. Clicking this link will reveal text inputs for the following attributes:
- Pod resource requests
- Pod resource limits
- Pod annotations
- Pod labels
- Hugepages
- Capabilities
These attributes apply to all Execution pods, including Workspaces, Jobs, and Apps, as well as Compute Cluster pods. However, they do not apply to Model API pods.
Note: These attributes cannot override existing Domino-managed fields; any values that conflict with them are ignored.
Pod resource requests / Pod resource limits
Resource requests and limits are applied to the run container of Execution pods and the master/worker containers of Compute Cluster pods. The inputs should consist of newline-delimited key-value pairs in the form key: value.
For example:
ephemeral-storage: 10Gi
Note: Administrators should not specify CPU or memory requests/limits here, as these values will be overridden by the existing form inputs.
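As an illustrative sketch, assuming ephemeral-storage: 10Gi is entered in both the requests and limits inputs, the run container's resources would translate roughly as follows (CPU and memory from the standard form fields are omitted):
containers:
  resources:
    requests:
      ephemeral-storage: 10Gi
    limits:
      ephemeral-storage: 10Gi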
Pod annotations
Annotations are applied to all Execution pods and Compute Cluster pods.
The input should consist of newline-delimited key-value pairs in the form key: value.
For example:
prometheus.io/scrape: "true"
prometheus.io/path: "/metrics"
prometheus.io/port: "9090"
This translates to the following pod specification:
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "/metrics"
    prometheus.io/port: "9090"
Pod labels
Labels are applied to all Execution pods and Compute Cluster pods.
The input should consist of newline-delimited key-value pairs in the form key: value.
For example:
example.com/application: my-app
example.com/environment: production
This translates to the following pod specification:
metadata:
  labels:
    example.com/application: my-app
    example.com/environment: production
Hugepages
Hugepages are applied to the run container of Execution pods and the master/worker containers of Compute Cluster pods. The input should consist of a key-value pair in the form page_size: total_size.
For example:
2Mi: 100Mi
This translates to the following pod specification:
containers:
  resources:
    limits:
      hugepages-2Mi: 100Mi
    requests:
      hugepages-2Mi: 100Mi
  volumeMounts:
    - mountPath: /hugepages-2Mi
      name: hugepages-2mi
volumes:
  - name: hugepages-2mi
    emptyDir:
      medium: HugePages-2Mi
Capabilities
Capabilities are applied to the run container of Execution pods and the master/worker containers of Compute Cluster pods. The input should consist of a newline-delimited list of capabilities.
For example:
IPC_LOCK
NET_BIND_SERVICE
This translates to the following pod specification:
securityContext:
  capabilities:
    add:
      - IPC_LOCK
      - NET_BIND_SERVICE
Note: In order for the capabilities to be added, the corresponding Central Config setting must allow them.