Cluster requirements

You can deploy Domino into a Kubernetes cluster that meets the following requirements.

Cluster permissions

Domino needs permission to install and configure pods in the cluster through Helm. The Domino installer is delivered as a containerized Python utility that operates Helm through a kubeconfig that provides service account access to the cluster.
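
As a rough illustration, a kubeconfig that authenticates with a service account token generally has the shape shown below. This is a sketch, not the file the Domino installer expects verbatim: the cluster name, server URL, user name, and placeholder values are all hypothetical, and the permissions bound to the service account are not shown.

```yaml
# Hypothetical kubeconfig that uses a service account bearer token.
# Every name, the server URL, and the placeholders are illustrative.
apiVersion: v1
kind: Config
clusters:
  - name: example-cluster
    cluster:
      server: https://kubernetes.example.com:6443
      certificate-authority-data: "<base64-encoded CA certificate>"
users:
  - name: domino-installer
    user:
      token: "<service account token>"
contexts:
  - name: domino-installer@example-cluster
    context:
      cluster: example-cluster
      user: domino-installer
current-context: domino-installer@example-cluster
```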

Namespaces

Domino creates one dedicated namespace for Platform workloads, one for Compute workloads, and one for installer metadata and secrets.

Nodes

The recommended node requirements for Domino are one platform node and one compute node.

If you are upgrading to 5.0 or higher, and the platform cluster does not support autoscaling, you must increase the number of nodes to support the increased node requirements.

Storage requirements

Storage classes

Domino requires at least two storage classes.

Dynamic block storage

Domino requires high-performance block storage for the following types of data:

- Ephemeral volumes attached to user executions
- High-performance databases for Domino application object data

This storage must be backed by a storage class with the following properties:

- Supports dynamic provisioning
- Can be mounted on any node in the cluster
- Is SSD-backed (recommended for fast I/O)
- Can provision volumes of at least 100 GB
- Has an underlying storage provider that supports ReadWriteOnce semantics
- Is backed by true, fully POSIX-compliant block storage (that is, not NFS)

Note: If this storage does not meet these requirements, or if you override critical services that rely on block storage (MongoDB, Postgres, Git) to use a different storage class, you may see performance degradation, catastrophic failures, and unexpected data loss.

By default, this storage class is named dominodisk.

In AWS, EBS is used to back this storage class. The following is an example configuration for a compatible EBS storage class:

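# EBS-backed storage class for Domino block storage
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: dominodisk   # default name Domino uses for this class
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2          # general-purpose SSD volumes
  fsType: ext4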

In GCP, Compute Engine persistent disks are used to back this storage class. The following is an example configuration for a compatible GCE PD storage class:

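apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: dominodisk
parameters:
  replication-type: none   # zonal (non-replicated) persistent disk
  type: pd-standard        # HDD-backed; pd-ssd is the SSD-backed type
provisioner: kubernetes.io/gce-pd
reclaimPolicy: Delete      # delete the disk when its claim is deleted
volumeBindingMode: WaitForFirstConsumer   # provision once a pod is scheduled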
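
One way to check that a candidate class provisions volumes as required is to request a volume from it. The claim below is a sketch under assumptions: the claim name and namespace are hypothetical, and it presumes the class uses the default name dominodisk.

```yaml
# Hypothetical test claim against the dominodisk storage class.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dominodisk-smoke-test   # hypothetical name
  namespace: default            # hypothetical namespace
spec:
  storageClassName: dominodisk
  accessModes:
    - ReadWriteOnce             # block storage mounted by a single node
  resources:
    requests:
      storage: 100Gi            # matches the minimum volume size above
```

If the class uses a volumeBindingMode of WaitForFirstConsumer, the claim stays Pending until a pod that mounts it is scheduled.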

Long-term shared storage

Domino needs a separate storage class for long-term storage of:

- Project data uploaded or created by users
- Domino Datasets
- Docker images
- Domino backups

This storage must be backed by a storage class with the following properties:

- Dynamically provisions Kubernetes PersistentVolumes
- Can be accessed in ReadWriteMany mode from all nodes in the cluster
- Uses a volumeBindingMode of Immediate

In AWS, for example, these storage requirements are handled by a class that is backed by EFS for Domino Datasets, and a class that is backed by S3 for project data, backups, and Docker images.

By default, this storage class is named dominoshared.
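
For a file store that supports ReadWriteMany, a class definition might look like the sketch below. It assumes the AWS EFS CSI driver is installed in the cluster, which this document does not state. The file system ID is a placeholder, the claim name and size are hypothetical, and the parameter values are illustrative rather than Domino's recommended settings.

```yaml
# Sketch of an EFS-backed shared storage class (assumes the AWS EFS CSI driver).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: dominoshared
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap                  # one EFS access point per volume
  fileSystemId: "<your EFS file system ID>" # placeholder
  directoryPerms: "700"
volumeBindingMode: Immediate
---
# Hypothetical claim showing ReadWriteMany access against that class.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dominoshared-smoke-test
spec:
  storageClassName: dominoshared
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
```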

Native

Domino requires cloud-provider native object storage for a few resources and services:

- Blob storage: For AWS, the blob storage must be backed by S3 (see Blob storage). For other infrastructure, the dominoshared storage class can be used.
- Logs: For AWS, the log storage must be backed by S3 (see Blob storage). For other infrastructure, the dominoshared storage class can be used.
- Costs: For AWS only, the cost storage must be backed by S3 (see Blob storage).
- Backups: For all supported cloud providers, storage for backups is backed by the provider's native blob store, as listed here. For on-prem, backups are backed by the dominoshared storage class.
  - AWS: S3
  - Azure: Azure Blob Storage
  - GCP: Google Cloud Storage
- Datasets: For AWS, Datasets storage must be backed by EFS (see Datasets storage). For other infrastructure, the dominoshared storage class can be used.

On-Prem

On-prem environments use a wide variety of block and file-based storage. Domino expects dominodisk to be backed by block storage (not NFS), ideally matching the requirements listed under Dynamic block storage above. In some cases, host volumes can be used for backing services like Git, Postgres, and MongoDB. Note that Postgres and MongoDB provide state replication. Host volumes can also be used for Runs, but network-attached block storage is preferred because files cached in block storage are portable between nodes. If host volumes are used for Runs, file caching must be disabled, and you can expect slow execution start-up for large projects.

Node pool requirements

Domino requires a minimum of two node pools, one to host the Domino Platform and one to host Compute workloads. Additional optional pools can be added to provide specialized execution hardware for some Compute workloads.

Platform pool requirements

- Boot Disk: 128GB
- Min Nodes: 3
- Max Nodes: 3
- Spec: 8 CPU / 32GB
- Labels: dominodatalab.com/node-pool: platform
- Tags:
  - kubernetes.io/cluster/{{ cluster_name }}: owned
  - k8s.io/cluster-autoscaler/enabled: true #Optional for autodiscovery
  - k8s.io/cluster-autoscaler/{{ cluster_name }}: owned #Optional for autodiscovery

Compute pool requirements

- Boot Disk: 400GB
- Recommended Min Nodes: 1
- Max Nodes: Set as necessary to meet demand and resourcing needs
- Recommended min spec: 8 CPU / 32GB
- Enable Autoscaling: Yes
- Labels: domino/build-node: true, dominodatalab.com/node-pool: default
- Tags:
  - k8s.io/cluster-autoscaler/node-template/label/dominodatalab.com/node-pool: default
  - kubernetes.io/cluster/{{ cluster_name }}: owned
  - k8s.io/cluster-autoscaler/node-template/label/domino/build-node: true
  - k8s.io/cluster-autoscaler/enabled: true #Optional for autodiscovery
  - k8s.io/cluster-autoscaler/{{ cluster_name }}: owned #Optional for autodiscovery

Optional GPU compute pool

- Boot Disk: 400GB
- Recommended Min Nodes: 0
- Max Nodes: Set as necessary to meet demand and resourcing needs
- Recommended min spec: 8 CPU / 16GB / one or more NVIDIA GPU devices
- Nodes must be pre-configured with the appropriate NVIDIA driver and nvidia-docker2, with the default Docker runtime set to nvidia. The EKS GPU-optimized AMI is one example.
- Labels: dominodatalab.com/node-pool: default-gpu, nvidia.com/gpu: true
- Tags:
  - k8s.io/cluster-autoscaler/node-template/label/dominodatalab.com/node-pool: default-gpu
  - kubernetes.io/cluster/{{ cluster_name }}: owned
  - k8s.io/cluster-autoscaler/enabled: true #Optional for autodiscovery
  - k8s.io/cluster-autoscaler/{{ cluster_name }}: owned #Optional for autodiscovery
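
On EKS, the platform and compute pools above could be declared roughly as follows. This is a sketch that assumes the cluster is created with eksctl, which this document does not require. The cluster name, region, instance type, and the compute pool's maximum size are hypothetical choices; m5.2xlarge is used because it provides 8 vCPUs and 32 GiB of memory. The kubernetes.io/cluster/{{ cluster_name }} tag is left out on the assumption that your tooling applies it; add it if yours does not.

```yaml
# Sketch of an eksctl ClusterConfig; names and sizes are illustrative.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: example-domino              # hypothetical cluster name
  region: us-west-2                 # hypothetical region
nodeGroups:
  - name: platform
    instanceType: m5.2xlarge        # 8 vCPU / 32 GiB
    minSize: 3
    maxSize: 3
    desiredCapacity: 3
    volumeSize: 128                 # boot disk, in GiB
    labels:
      dominodatalab.com/node-pool: platform
    tags:
      k8s.io/cluster-autoscaler/enabled: "true"
      k8s.io/cluster-autoscaler/example-domino: owned
  - name: compute
    instanceType: m5.2xlarge
    minSize: 1
    maxSize: 10                     # hypothetical; size to your demand
    volumeSize: 400
    labels:
      dominodatalab.com/node-pool: default
      domino/build-node: "true"     # label values must be strings
    tags:
      k8s.io/cluster-autoscaler/node-template/label/dominodatalab.com/node-pool: default
      k8s.io/cluster-autoscaler/node-template/label/domino/build-node: "true"
      k8s.io/cluster-autoscaler/enabled: "true"
      k8s.io/cluster-autoscaler/example-domino: owned
```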

Cluster networking

To manage secure communication between pods, Domino relies on Kubernetes network policies. You must use a networking solution that supports the Kubernetes NetworkPolicy resource. One such solution is Calico.
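
To illustrate the kind of resource the networking solution must enforce, the sketch below denies all ingress to pods in a namespace and then allows traffic between pods in that same namespace. These are not Domino's actual policies; the namespace and policy names are hypothetical.

```yaml
# Deny all ingress to every pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: example-namespace      # hypothetical namespace
spec:
  podSelector: {}                   # empty selector matches all pods
  policyTypes:
    - Ingress
---
# Allow ingress only from pods in the same namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: example-namespace
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}
```

On a cluster whose network plugin does not implement NetworkPolicy, the API server accepts these objects but they have no effect, which is why a supporting plugin is required.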

Ingress and SSL

Domino must be configured to serve from a specific FQDN, and DNS for that name must resolve to the address of an SSL-terminating load balancer with a valid certificate.

Important

A Domino install can’t be hosted on a subdomain of another Domino install. For example, if you have Domino deployed at data-science.example.com, you can’t deploy another instance of Domino at acme.data-science.example.com.

The load balancer must forward incoming connections on ports 80 and 443 to port 80 on all nodes in the Platform pool. This load balancer must support WebSocket connections.

In order for Domino to correctly detect the protocol of incoming requests, the SSL-terminating load balancer must properly set the X-Forwarded-Proto header. Domino does not currently support the alternative Forwarded header.

Health checks for this load balancer must use HTTP on port 80 and check for 200 responses from a path of /healthz.
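
As one possible realization of the health check described above, the CloudFormation fragment below defines a target group for an AWS Application Load Balancer. It is a sketch under assumptions: this document does not prescribe AWS or CloudFormation, the resource name is hypothetical, and the listeners, certificate, and registration of Platform nodes as targets are omitted.

```yaml
# Sketch: target group that sends traffic to port 80 and checks /healthz.
AWSTemplateFormatVersion: "2010-09-09"
Parameters:
  VpcId:
    Type: AWS::EC2::VPC::Id
Resources:
  DominoPlatformTargetGroup:        # hypothetical logical name
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      VpcId: !Ref VpcId
      TargetType: instance          # targets are the Platform pool nodes
      Protocol: HTTP
      Port: 80
      HealthCheckProtocol: HTTP
      HealthCheckPort: "80"
      HealthCheckPath: /healthz
      Matcher:
        HttpCode: "200"             # only a 200 response counts as healthy
```

An Application Load Balancer adds the X-Forwarded-Proto header to forwarded requests and supports WebSocket connections, so it can satisfy the two requirements stated earlier in this section.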

Note: Domino continues to support Environments with subdomains. However, for improved security, Domino recommends that you do not use them. If you are using subdomains for your Domino deployment and need best-practice information, contact your Account Manager.

NTP

To support SSO protocols, TLS connections to external services, intra-cluster TLS when using Istio, and to avoid general interoperability issues, the nodes in your Kubernetes cluster must have a valid Network Time Protocol (NTP) configuration. This will allow for successful TLS validation and operation of other time-sensitive protocols.