
Domino on GKE

Domino can run on a Kubernetes cluster provided by Google Kubernetes Engine (GKE).

When running on GKE, the Domino architecture uses GCP resources to fulfill the Domino cluster requirements as follows:

  • The Kubernetes control plane is managed by the GKE cluster

  • Domino uses one node pool of three n1-standard-8 worker nodes to host the Domino platform

  • Additional node pools host elastic compute for Domino executions, with optional GPU accelerators

  • Cloud Filestore is used to store user data, backups, logs, and Domino Datasets

  • A Cloud Storage bucket is used to store the Domino Docker Registry

  • The kubernetes.io/gce-pd provisioner is used to create persistent volumes for Domino executions

Set up a GKE cluster for Domino

This section describes how to configure a GKE cluster for use with Domino.

Namespaces

No namespace configuration is necessary prior to installation. Domino creates the following namespaces in the cluster during installation:

Namespace        Contains
platform         Durable Domino application, metadata, platform services required for platform operation
compute          Ephemeral Domino execution pods launched by user actions in the application
domino-system    Domino installation metadata and secrets

Node pools

The GKE cluster must have at least two node pools that produce worker nodes with the following specifications and distinct node labels, and it might include an optional GPU pool:

Pool                     Min-Max   Instance        Disk   Labels
platform                 3-3       n1-standard-8   128G   dominodatalab.com/node-pool: platform
default                  1-20      n1-standard-8   400G   dominodatalab.com/node-pool: default
                                                          domino/build-node: true
default-gpu (optional)   0-5       n1-standard-8   400G   dominodatalab.com/node-pool: default-gpu

If you want to configure the default-gpu pool, you must add a GPU accelerator to the node pool. Read the GKE documentation on available accelerators and on deploying a DaemonSet that automatically installs the necessary drivers.

Additional node pools can be added with distinct dominodatalab.com/node-pool labels to make other instance types available for Domino executions. Read Managing the Domino compute grid to learn how these different node types are referenced by label from the Domino application.

Consult the Terraform snippets below for code representations of the required node pools.

Platform pool

resource "google_container_node_pool" "platform" {
  name     = "platform"
  location = $YOUR_CLUSTER_ZONE_OR_REGION
  cluster  = $YOUR_CLUSTER_NAME

  initial_node_count = 3
  autoscaling {
    max_node_count = 3
    min_node_count = 3
  }

  node_config {
    preemptible  = false
    machine_type = "n1-standard-8"

    labels = {
      "dominodatalab.com/node-pool" = "platform"
    }

    disk_size_gb    = 128
    local_ssd_count = 1
  }

  management {
    auto_repair  = true
    auto_upgrade = true
  }

  timeouts {
    delete = "20m"
  }
}

Default compute pool

resource "google_container_node_pool" "compute" {
  name     = "compute"
  location = $YOUR_CLUSTER_ZONE_OR_REGION
  cluster  = $YOUR_CLUSTER_NAME

  initial_node_count = 1
  autoscaling {
    max_node_count = 20
    min_node_count = 1
  }

  node_config {
    preemptible  = false
    machine_type = "n1-standard-8"

    labels = {
      "domino/build-node"            = "true"
      "dominodatalab.com/build-node" = "true"
      "dominodatalab.com/node-pool"  = "default"
    }

    disk_size_gb    = 400
    local_ssd_count = 1
  }

  management {
    auto_repair  = true
    auto_upgrade = true
  }

  timeouts {
    delete = "20m"
  }
}

Optional GPU pool

resource "google_container_node_pool" "gpu" {
  provider = google-beta
  name     = "gpu"
  location = $YOUR_CLUSTER_ZONE_OR_REGION
  cluster  = $YOUR_CLUSTER_NAME

  initial_node_count = 0

  autoscaling {
    max_node_count = 5
    min_node_count = 0
  }

  node_config {
    preemptible  = false
    machine_type = "n1-standard-8"

    guest_accelerator {
      type  = "nvidia-tesla-p100"
      count = 1
    }

    labels = {
      "dominodatalab.com/node-pool" = "default-gpu"
    }

    disk_size_gb    = 400
    local_ssd_count = 1

    workload_metadata_config {
      node_metadata = "GKE_METADATA_SERVER"
    }
  }

  management {
    auto_repair  = true
    auto_upgrade = true
  }

  timeouts {
    delete = "20m"
  }
}

Network policy enforcement

Domino relies on Kubernetes network policies to manage secure communication between pods in the cluster. By default, the network plugin in GKE will not enforce these policies. To run Domino securely on GKE, you must enable enforcement of network policies.

See the GKE documentation for instructions on enabling network policy enforcement for your cluster.
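If the cluster itself is managed with Terraform, enforcement can also be enabled there. The snippet below is a minimal sketch only, assuming a google_container_cluster resource and the Calico policy provider; other required cluster arguments are omitted, and the placeholders follow the same convention as the node pool snippets above.

resource "google_container_cluster" "domino" {
  name     = $YOUR_CLUSTER_NAME
  location = $YOUR_CLUSTER_ZONE_OR_REGION

  # Enforce Kubernetes NetworkPolicy objects using the Calico provider.
  network_policy {
    enabled  = true
    provider = "CALICO"
  }

  # The network policy addon must also be enabled for enforcement to take effect.
  addons_config {
    network_policy_config {
      disabled = false
    }
  }
}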

Dynamic block storage

The Domino installer automatically creates a storage class like the example below, which is used to provision GCE persistent disks as execution volumes. No manual setup is necessary for this storage class.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: dominodisk
parameters:
  replication-type: none
  type: pd-standard
provisioner: kubernetes.io/gce-pd
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

Shared storage

A Cloud Filestore instance must be provisioned with at least 10 TB of capacity, and it must be configured to allow access from the cluster. You will provide the IP address and mount path of this instance to the Domino installer, which will create an NFS storage class like the following.

allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  labels:
    app.kubernetes.io/instance: nfs-client-provisioner
    app.kubernetes.io/managed-by: Tiller
    app.kubernetes.io/name: nfs-client-provisioner
    helm.sh/chart: nfs-client-provisioner-1.2.6-0.1.4
  name: domino-shared
parameters:
  archiveOnDelete: "false"
provisioner: cluster.local/nfs-client-provisioner
reclaimPolicy: Delete
volumeBindingMode: Immediate
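
For reference, the Filestore instance described above could be provisioned with Terraform along the lines of the sketch below. The instance name, file share name, and network value are illustrative placeholders; adjust the tier and capacity to your requirements.

resource "google_filestore_instance" "domino_shared" {
  name = "domino-shared"
  zone = $YOUR_CLUSTER_ZONE
  tier = "STANDARD"

  # At least 10 TB of capacity for user data, backups, logs, and Domino Datasets.
  file_shares {
    capacity_gb = 10240
    name        = "dominoshared"
  }

  # Attach the instance to the VPC network used by the GKE cluster so nodes can mount it.
  networks {
    network = $YOUR_VPC_NETWORK_NAME
    modes   = ["MODE_IPV4"]
  }
}

Once created, the instance's IP address and file share name correspond to the IP address and mount path that you provide to the Domino installer.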

Docker registry storage

You will need one Cloud Storage bucket, accessible from your cluster, to store the internal Domino Docker Registry.
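
As an illustration, such a bucket could be created with Terraform as in the sketch below; the bucket name and location are placeholders, and your organization's naming and access policies may differ.

resource "google_storage_bucket" "domino_registry" {
  name     = $YOUR_REGISTRY_BUCKET_NAME
  location = $YOUR_GCP_REGION

  # Keep registry contents private and managed solely through IAM.
  uniform_bucket_level_access = true
}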

Domain

Domino must be configured to serve from a specific FQDN, and to serve it securely over HTTPS you will need an SSL certificate that covers that name. Record the FQDN for use when installing Domino. Once Domino is deployed into your cluster, set up DNS so this name points to an HTTPS Cloud Load Balancer that holds the SSL certificate for the chosen name and forwards traffic to port 80 on your platform nodes.
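
If the DNS zone for your domain is hosted in Cloud DNS, the A record could be managed with Terraform as in the hypothetical sketch below; the managed zone, FQDN, and load balancer IP are placeholders for values from your own environment.

resource "google_dns_record_set" "domino" {
  managed_zone = $YOUR_MANAGED_ZONE_NAME
  name         = "domino.example.com."   # the FQDN Domino is served from, with a trailing dot
  type         = "A"
  ttl          = 300

  # IP address of the HTTPS Cloud Load Balancer that holds the SSL certificate
  # and forwards traffic to port 80 on the platform nodes.
  rrdatas = [$YOUR_LOAD_BALANCER_IP]
}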
