Making a new node group available to Domino is as simple as adding new
Kubernetes worker nodes with a distinct dominodatalab.com/node-pool
label. You can then reference the value of that label when creating new
hardware tiers to configure Domino to assign
executions to those nodes.
See below for an example of creating a scalable node pool in EKS.
Create a new-nodegroup.yaml file like the one below, and configure it with the properties you want the new group to have. All values shown with a
$ are variables that you must modify.
    apiVersion: eksctl.io/v1alpha5
    kind: ClusterConfig

    metadata:
      name: $CLUSTER_NAME
      region: $CLUSTER_REGION

    nodeGroups:
      - name: $GROUP_NAME # this can be any name you choose, it will be part of the ASG and template name
        instanceType: $AWS_INSTANCE_TYPE
        minSize: $DMINIMUM_GROUP_SIZE
        maxSize: $DESIRED_MAXIMUM_GROUP_SIZE
        volumeSize: 400 # important to allow for image caching on Domino workers
        availabilityZones: ["$YOUR_CHOICE"] # this should be the same AZ (or the same multiple AZ's) as your other node pools
        ami: $AMI_ID
        labels:
          "dominodatalab.com/node-pool": "$NODE_POOL_NAME" # this is the name you'll reference from Domino
          # "nvidia.com/gpu": "true" # uncomment this line if this pool uses a GPU instance type
        tags:
          "k8s.io/cluster-autoscaler/node-template/label/dominodatalab.com/node-pool": "$NODE_POOL_NAME"
          # "k8s.io/cluster-autoscaler/node-template/label/nvidia.com/gpu": "true" # uncomment this line if this pool uses a GPU instance type
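One way to make sure no $ variable is left unmodified is to fill the template programmatically. The following is a minimal sketch, not part of Domino or eksctl, using Python's standard string.Template, which happens to use the same $NAME syntax. The template is abbreviated, and every value in it (cluster name, region, group name, pool name) is a hypothetical example.

```python
from string import Template

# Abbreviated copy of the node group template; only a few fields are shown.
TEMPLATE = Template("""\
metadata:
  name: $CLUSTER_NAME
  region: $CLUSTER_REGION
nodeGroups:
  - name: $GROUP_NAME
    labels:
      "dominodatalab.com/node-pool": "$NODE_POOL_NAME"
""")

# Hypothetical example values; replace them with your own.
values = {
    "CLUSTER_NAME": "my-domino-cluster",
    "CLUSTER_REGION": "us-west-2",
    "GROUP_NAME": "domino-highmem",
    "NODE_POOL_NAME": "highmem",
}

# substitute() raises KeyError if any $VARIABLE has no value,
# so a placeholder cannot be silently left in the output.
rendered = TEMPLATE.substitute(values)
print(rendered)
```

Using substitute() rather than safe_substitute() is deliberate here: a missing value fails immediately instead of producing a file that still contains a literal $ placeholder.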
The AWS tag with key
k8s.io/cluster-autoscaler/node-template/label/dominodatalab.com/node-pool is important for exposing the group to your cluster autoscaler.
Note also that you cannot have compute node pools in separate, isolated AZs, because this causes volume affinity errors.
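In the example configuration, each autoscaler tag key is the corresponding node label key prefixed with k8s.io/cluster-autoscaler/node-template/label/. The sketch below illustrates that naming pattern only; the helper name autoscaler_tags and the pool name are hypothetical.

```python
# Prefix used by the tag keys in the example node group configuration.
AUTOSCALER_LABEL_PREFIX = "k8s.io/cluster-autoscaler/node-template/label/"


def autoscaler_tags(labels):
    """Return tags that mirror each node label under the autoscaler prefix."""
    return {AUTOSCALER_LABEL_PREFIX + key: value for key, value in labels.items()}


# Hypothetical pool name; use the value of your own node-pool label.
labels = {"dominodatalab.com/node-pool": "highmem"}
tags = autoscaler_tags(labels)

for key, value in tags.items():
    print(f"{key}: {value}")
```

Deriving the tags from the labels keeps the two in step, so a renamed pool cannot end up with a label and a tag that disagree.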
Once your configuration file describes the group you want to create, run
eksctl create nodegroup --config-file=new-nodegroup.yaml.
Take the name of the resulting ASG and add it to the
autoscaling.groups section of your Domino installer configuration.
Run the Domino installer to update the autoscaler.
Create a new hardware tier in Domino that references the new label.
When finished, you can start Domino executions that use the new hardware tier. Those executions are assigned to nodes in the new group, which is scaled as configured by the cluster autoscaler.