The following advanced Hardware Tier settings give you more control.
- Restrict to compute cluster
You can dynamically orchestrate on-demand compute clusters on top of the Domino-managed infrastructure. These workloads often have unique resource requirements, and you might want to create dedicated Hardware Tiers that are available only for compute cluster workloads. In this case, use the Restrict to compute cluster setting to specify the compute cluster types, such as Spark or Ray, that must exclusively use a given Hardware Tier.
When configuring resources for a cluster, the options will be limited to Hardware Tiers where the Restrict to compute cluster setting is selected for the cluster type. Hardware Tiers configured this way will not be available when selecting resources outside of cluster configuration.
Note: For a given cluster type, Hardware Tier choices will be filtered only after at least one restricted Hardware Tier is configured.
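The filtering rule can be sketched in a few lines of Python. This is only an illustration of the behavior described here, not Domino's implementation: `HardwareTier`, `tiers_for_cluster`, and `tiers_for_other_workloads` are hypothetical names, and the sketch assumes that a tier restricted to one cluster type is never offered to a different cluster type.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class HardwareTier:
    """Hypothetical stand-in for a Hardware Tier; not a Domino API object."""
    name: str
    # Cluster types (for example "Spark" or "Ray") this tier is restricted to.
    # An empty set means the tier is unrestricted.
    restricted_to: frozenset = field(default_factory=frozenset)


def tiers_for_cluster(tiers, cluster_type):
    """Tiers offered when configuring a cluster of the given type."""
    restricted = [t for t in tiers if cluster_type in t.restricted_to]
    # Filtering applies only once at least one tier is restricted to this type.
    if restricted:
        return restricted
    # Assumption: tiers restricted to other cluster types stay hidden.
    return [t for t in tiers if not t.restricted_to]


def tiers_for_other_workloads(tiers):
    """Tiers offered outside of cluster configuration."""
    return [t for t in tiers if not t.restricted_to]
```

With one unrestricted tier and one tier restricted to Spark, a Spark cluster would see only the restricted tier, while a Ray cluster and all non-cluster workloads would see only the unrestricted one.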
- Maximum Simultaneous Executions
If you want to limit the deployment-wide capacity for a Hardware Tier, but you do not want to create a dedicated node pool for the Hardware Tier, you can specify the Maximum Simultaneous Executions. This ensures that no more than the specified number of executions can use the selected Hardware Tier at the same time. Additional executions beyond the limit will be queued.
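To make the queuing behavior concrete, here is a minimal Python sketch of a per-tier execution cap. `TierAdmission` is a hypothetical class written for this illustration, and first-in, first-out promotion of queued executions is an assumption; the source only states that executions beyond the limit are queued.

```python
from collections import deque


class TierAdmission:
    """Hypothetical model of a per-tier execution cap; not Domino's scheduler."""

    def __init__(self, max_simultaneous):
        self.max_simultaneous = max_simultaneous
        self.running = set()
        self.queued = deque()

    def submit(self, execution_id):
        """Start the execution if a slot is free, otherwise queue it."""
        if len(self.running) < self.max_simultaneous:
            self.running.add(execution_id)
            return "running"
        self.queued.append(execution_id)
        return "queued"

    def finish(self, execution_id):
        """Release a slot and promote the oldest queued execution, if any."""
        self.running.discard(execution_id)
        # Assumption: queued executions are promoted in FIFO order.
        if self.queued and len(self.running) < self.max_simultaneous:
            self.running.add(self.queued.popleft())
```

With a limit of 2, a third submission is queued and starts only after one of the first two executions finishes.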
- Overprovisioning Pods
On cloud deployments enabled for autoscaling, new nodes are provisioned when a capacity request is made. This approach minimizes cost, but provisioning a new node can take several minutes, causing users to wait. The wait can be particularly noticeable in the mornings, when many users first log on to a system that has scaled down overnight.
You can address this problem by overprovisioning several warm slots for popular Hardware Tiers. Domino will automatically pre-provision the nodes that might be necessary to accommodate the specified number of overprovisioned executions using this Hardware Tier. This minimizes the chance that a user must wait for a new node to spin up. To keep costs under control, you can apply overprovisioning on a scheduled basis for periods when many new users are expected.
To do this, set the number of Overprovisioning Pods and select the Enable Overprovisioning Pods On a Schedule checkbox. Then, specify the schedule as needed.
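The effect of a schedule can be pictured with a small Python sketch. Everything here is hypothetical: `warm_slots` and its parameters are invented for illustration and do not correspond to Domino settings or APIs. The sketch assumes a schedule made of a daily time window plus a set of weekdays.

```python
from datetime import datetime, time


def warm_slots(now, pods, start, end, weekdays):
    """Return how many overprovisioned pods to keep warm at `now`.

    `weekdays` uses datetime.weekday() numbering (Monday is 0).
    """
    in_window = now.weekday() in weekdays and start <= now.time() < end
    # Outside the scheduled window, no warm capacity is held, which saves cost.
    return pods if in_window else 0
```

For example, with 5 pods scheduled for weekdays between 07:00 and 10:00, a call for a Monday at 08:30 returns 5, and a call for the same day at 23:00 returns 0.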
- Use custom GPU resource name
By default, Domino requests GPU resources of type `nvidia.com/gpu`. This works well for most NVIDIA GPU-enabled devices, but when your deployment is backed by different GPU devices (for example, NVIDIA MIG GPUs, AMD GPUs, AWS vGPUs, or Xilinx FPGAs), you must use a different name for the GPU resources. Select Use custom GPU resource name and enter the GPU resource name that corresponds to the name of the GPU devices discovered and reported by Kubernetes.
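As a rough illustration of what the resource name controls, the Python sketch below builds the GPU part of a Kubernetes container `limits` mapping. The helper `gpu_limits` is hypothetical, and the sketch assumes that the GPU count of a Hardware Tier ends up as a Kubernetes extended-resource limit keyed by the configured name.

```python
DEFAULT_GPU_RESOURCE = "nvidia.com/gpu"


def gpu_limits(gpu_count, custom_resource_name=None):
    """Build the GPU entry of a Kubernetes container `limits` mapping.

    Hypothetical helper for illustration; not part of Domino.
    """
    if gpu_count <= 0:
        return {}
    # Fall back to the default resource type when no custom name is set.
    name = custom_resource_name or DEFAULT_GPU_RESOURCE
    # Kubernetes extended resources are requested in whole units.
    return {name: gpu_count}
```

Calling `gpu_limits(1)` yields a limit keyed by `nvidia.com/gpu`, while `gpu_limits(2, "nvidia.com/mig-1g.5gb")` keys the limit by the MIG resource name instead. A pod that requests a name the cluster does not advertise cannot be scheduled, which is why the name must match what Kubernetes reports.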
For example, with an NVIDIA A100 GPU configured in MIG Mixed Mode, you can use resources like `nvidia.com/mig-1g.5gb`, `nvidia.com/mig-2g.10gb`, or `nvidia.com/mig-3g.20gb`.
- Allow executions to exceed the default shared memory limit
You can allow Hardware Tiers to exceed the default limit of 64 MB for shared memory. This is especially beneficial for applications that can use shared memory. To enable this, select the Allow executions to exceed the default shared memory limit checkbox. This overrides the `/dev/shm` (shared memory) limit, and any shared memory consumption counts toward the overall memory limit of the Hardware Tier. Include the size of `/dev/shm` in any memory usage calculations for a Hardware Tier with this option enabled.
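The memory accounting can be expressed as simple arithmetic. The Python sketch below is illustrative only: `memory_left_for_processes` is a hypothetical helper, and the GiB figures in the usage note are made-up examples.

```python
def memory_left_for_processes(tier_memory_gib, shm_usage_gib):
    """Memory left for regular process use once /dev/shm usage is counted.

    Hypothetical helper: with the option enabled, shared memory consumption
    counts toward the tier's overall memory limit.
    """
    if shm_usage_gib > tier_memory_gib:
        raise ValueError("shared memory usage exceeds the tier's memory limit")
    return tier_memory_gib - shm_usage_gib
```

For a tier with a 16 GiB memory limit whose workload writes 4 GiB to `/dev/shm`, 12 GiB remain for everything else, so size the tier for the sum of both.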