Set up Domino on GKE

Domino can run on a Kubernetes cluster provided by Google Kubernetes Engine (GKE).

Figure: A map of the Google Cloud architecture needed for a Domino deployment.

When running on GKE, the Domino architecture uses Google Cloud Platform (GCP) resources to fulfill the Domino cluster requirements as follows:

  • The GKE cluster provides the managed Kubernetes control plane.

  • Domino uses a dedicated node pool of n1-standard-8 worker nodes to host the Domino platform.

  • Additional node pools provide elastic compute capacity for Domino executions, optionally with GPU accelerators.

  • Cloud Filestore stores user data, backups, logs, and Domino Datasets.

  • A Cloud Storage Bucket stores the Domino Docker Registry.

  • The kubernetes.io/gce-pd provisioner creates persistent volumes for Domino executions.

Set up a GKE cluster for Domino

This section describes how to configure a GKE cluster for use with Domino.

Namespaces

You don’t have to configure namespaces before installation. Domino creates the following namespaces in the cluster during installation:

  • platform: Durable Domino application, metadata, and platform services required for platform operation.

  • compute: Ephemeral Domino execution pods launched by user actions in the application.

  • domino-system: Domino installation metadata and secrets.

Node pools

The GKE cluster must have at least two node pools that produce worker nodes with the following specifications and distinct node labels, and it might include an optional GPU pool:

  • platform: 4-6 nodes, n1-standard-8 instances, 128G disk, label dominodatalab.com/node-pool: platform

  • default: 1-20 nodes, n1-standard-8 instances, 400G disk, labels dominodatalab.com/node-pool: default and domino/build-node: true

  • default-gpu (optional): 0-5 nodes, n1-standard-8 instances, 400G disk, label dominodatalab.com/node-pool: default-gpu

If you want to configure the default-gpu pool, you must add a GPU accelerator to the node pool. See the GKE documentation for the available accelerators and for how to deploy a DaemonSet that automatically installs the necessary drivers.

You can add node pools with distinct dominodatalab.com/node-pool labels to make other instance types available for Domino executions. See Manage Compute Resources to learn how the Domino application references these node types by their labels.

Consult the following Terraform snippets for code representations of the required node pools. In each snippet, replace the $YOUR_CLUSTER_ZONE_OR_REGION and $YOUR_CLUSTER_NAME placeholders with the values for your cluster.

Platform pool

resource "google_container_node_pool" "platform" {
  name     = "platform"
  location = $YOUR_CLUSTER_ZONE_OR_REGION
  cluster  = $YOUR_CLUSTER_NAME

  initial_node_count = 3
  autoscaling {
    max_node_count = 3
    min_node_count = 3
  }

  node_config {
    preemptible  = false
    machine_type = "n1-standard-8"

    labels = {
      "dominodatalab.com/node-pool" = "platform"
    }

    disk_size_gb    = 128
    local_ssd_count = 1
  }

  management {
    auto_repair  = true
    auto_upgrade = true
  }

  timeouts {
    delete = "20m"
  }
}

Default compute pool

resource "google_container_node_pool" "compute" {
  name     = "compute"
  location = $YOUR_CLUSTER_ZONE_OR_REGION
  cluster  = $YOUR_CLUSTER_NAME

  initial_node_count = 1
  autoscaling {
    max_node_count = 20
    min_node_count = 1
  }

  node_config {
    preemptible  = false
    machine_type = "n1-standard-8"

    labels = {
      "domino/build-node"            = "true"
      "dominodatalab.com/build-node" = "true"
      "dominodatalab.com/node-pool"  = "default"
    }

    disk_size_gb    = 400
    local_ssd_count = 1
  }

  management {
    auto_repair  = true
    auto_upgrade = true
  }

  timeouts {
    delete = "20m"
  }
}

Optional GPU pool

resource "google_container_node_pool" "gpu" {
  provider = google-beta
  name     = "gpu"
  location = $YOUR_CLUSTER_ZONE_OR_REGION
  cluster  = $YOUR_CLUSTER_NAME

  initial_node_count = 0

  autoscaling {
    max_node_count = 5
    min_node_count = 0
  }

  node_config {
    preemptible  = false
    machine_type = "n1-standard-8"

    guest_accelerator {
      type  = "nvidia-tesla-p100"
      count = 1
    }

    labels = {
      "dominodatalab.com/node-pool" = "default-gpu"
    }

    disk_size_gb    = 400
    local_ssd_count = 1

    workload_metadata_config {
      node_metadata = "GKE_METADATA_SERVER"
    }
  }

  management {
    auto_repair  = true
    auto_upgrade = true
  }

  timeouts {
    delete = "20m"
  }
}

Network policy enforcement

Domino relies on Kubernetes network policies to manage secure communication between pods in the cluster. By default, the network plugin in GKE does not enforce these policies. To run Domino securely on GKE, you must enable network policy enforcement.

See the GKE documentation for instructions about how to enable network policy enforcement for your cluster.
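If you manage your cluster with Terraform, you can enable enforcement on the cluster resource itself. The following is a minimal sketch, assuming a google_container_cluster resource named domino; everything other than the network_policy and addons_config blocks is illustrative.

resource "google_container_cluster" "domino" {
  name     = $YOUR_CLUSTER_NAME
  location = $YOUR_CLUSTER_ZONE_OR_REGION

  # ... other cluster configuration ...

  # Enforce Kubernetes NetworkPolicy resources with the
  # Calico network policy provider.
  network_policy {
    enabled  = true
    provider = "CALICO"
  }

  # Enable the network policy addon on the cluster.
  addons_config {
    network_policy_config {
      disabled = false
    }
  }
}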

Dynamic block storage

The Domino installer automatically creates a storage class like the following example for provisioning GCE persistent disks as execution volumes. No manual setup is necessary for this storage class.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: dominodisk
parameters:
  replication-type: none
  type: pd-standard
provisioner: kubernetes.io/gce-pd
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

Shared storage

A Cloud Filestore instance must be provisioned with at least 10TB of capacity, and it must be configured to allow access from the cluster. You must provide the IP address and mount path of this instance to the Domino installer, which will create an NFS storage class like the following:

allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  labels:
    app.kubernetes.io/instance: nfs-client-provisioner
    app.kubernetes.io/managed-by: Tiller
    app.kubernetes.io/name: nfs-client-provisioner
    helm.sh/chart: nfs-client-provisioner-1.2.6-0.1.4
  name: domino-shared
parameters:
  archiveOnDelete: "false"
provisioner: cluster.local/nfs-client-provisioner
reclaimPolicy: Delete
volumeBindingMode: Immediate
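For reference, a Filestore instance that meets these requirements could be provisioned with Terraform along the following lines. This is a minimal sketch: the instance name, file share name, and network are assumptions to adapt to your environment.

resource "google_filestore_instance" "domino_shared" {
  name = "domino-shared"    # illustrative instance name
  zone = $YOUR_CLUSTER_ZONE # Filestore instances are zonal
  tier = "STANDARD"

  file_shares {
    capacity_gb = 10240    # at least 10TB, per the requirement above
    name        = "domino" # exported share; this is the mount path for the installer
  }

  networks {
    network = "default" # must be reachable from the GKE cluster
    modes   = ["MODE_IPV4"]
  }
}

The instance's IP address and the file share name are the values you pass to the Domino installer as described above.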

Docker registry storage

You need one Cloud Storage bucket that is accessible from your cluster to store the internal Domino Docker Registry.
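A bucket like the following Terraform sketch would satisfy this requirement. The bucket name and location are assumptions; bucket names are globally unique, and you may also need IAM bindings that let the registry's service account read and write the bucket.

resource "google_storage_bucket" "domino_registry" {
  name     = "my-domino-registry" # illustrative; must be globally unique
  location = "US"                 # choose a location near your cluster

  # Protect registry data from accidental deletion with the bucket.
  force_destroy = false
}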

Domain

You must configure Domino to serve from a specific fully qualified domain name (FQDN). To serve Domino securely over HTTPS, you need an SSL certificate that covers the chosen name. Record the FQDN for use when you install Domino.

Important
A Domino install can’t be hosted on a subdomain of another Domino install. For example, if you have Domino deployed at data-science.example.com, you can’t deploy another instance of Domino at acme.data-science.example.com.

After Domino is deployed into your cluster, you must set up DNS for this name to point to an HTTPS Cloud Load Balancer that has an SSL certificate for the chosen name and forwards traffic to port 80 on your platform nodes.
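If you manage DNS with Cloud DNS and Terraform, the record could look like the following sketch. The managed zone, FQDN, and load balancer IP address are all illustrative assumptions.

resource "google_dns_record_set" "domino" {
  managed_zone = "example-zone"        # assumed Cloud DNS managed zone
  name         = "domino.example.com." # your chosen FQDN (note the trailing dot)
  type         = "A"
  ttl          = 300
  rrdatas      = ["203.0.113.10"]      # IP address of your HTTPS load balancer
}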

Next steps