There might be times when you have to remove a specific node (or multiple nodes) from service, either temporarily or permanently. This might include cases of troubleshooting nodes that are in a bad state, or retiring nodes after an update to the AMI so that all nodes are using the new AMI.
This topic describes how to temporarily prevent new workloads from being assigned to a node, as well as how to safely remove workloads from a node so that it can be permanently retired.
The kubectl cordon <node> command prevents additional pods from being scheduled onto the node, without disrupting any of the pods currently running on it.
For example, let’s say a new node in your cluster has come up with some problems, and you want to cordon it before launching any new runs to ensure they will not land on that node.
The procedure might look like this:
$ kubectl get nodes
NAME                                          STATUS   ROLES    AGE   VERSION
ip-192-168-0-221.us-east-2.compute.internal   Ready    <none>   12d   v1.14.7-eks-1861c5
ip-192-168-17-8.us-east-2.compute.internal    Ready    <none>   12d   v1.14.7-eks-1861c5
ip-192-168-24-46.us-east-2.compute.internal   Ready    <none>   51m   v1.14.7-eks-1861c5
ip-192-168-3-110.us-east-2.compute.internal   Ready    <none>   12d   v1.14.7-eks-1861c5

$ kubectl cordon ip-192-168-24-46.us-east-2.compute.internal
node/ip-192-168-24-46.us-east-2.compute.internal cordoned

$ kubectl get no
NAME                                          STATUS                     ROLES    AGE   VERSION
ip-192-168-0-221.us-east-2.compute.internal   Ready                      <none>   12d   v1.14.7-eks-1861c5
ip-192-168-17-8.us-east-2.compute.internal    Ready                      <none>   12d   v1.14.7-eks-1861c5
ip-192-168-24-46.us-east-2.compute.internal   Ready,SchedulingDisabled   <none>   53m   v1.14.7-eks-1861c5
ip-192-168-3-110.us-east-2.compute.internal   Ready                      <none>   12d   v1.14.7-eks-1861c5
Note the SchedulingDisabled status on the cordoned node.
You can undo this and return the node to service with the command:
kubectl uncordon <node>
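Continuing the example above, returning the cordoned node to service might look like this:

$ kubectl uncordon ip-192-168-24-46.us-east-2.compute.internal
node/ip-192-168-24-46.us-east-2.compute.internal uncordoned

Afterwards, kubectl get nodes will show the node's STATUS as Ready again, without SchedulingDisabled, and new pods can land on it.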
Identify user workloads
Before removing a node from service permanently, you must ensure there are no workloads still running on it that must not be disrupted. For example, you might see the following workloads running on a node (notice the specification of the compute namespace with -n, and the wide output with -o wide, which includes the node hosting each pod):
$ kubectl get po -n domino-compute -o wide | grep ip-192-168-24-46.us-east-2.compute.internal
run-5e66acf26437fe0008ca1a88-f95mk                2/2   Running   0   23m   192.168.4.206    ip-192-168-24-46.us-east-2.compute.internal   <none>   <none>
run-5e66ad066437fe0008ca1a8f-629p9                3/3   Running   0   24m   192.168.28.87    ip-192-168-24-46.us-east-2.compute.internal   <none>   <none>
run-5e66b65e9c330f0008f70ab8-85f4f5f58c-m46j7     3/3   Running   0   51m   192.168.23.128   ip-192-168-24-46.us-east-2.compute.internal   <none>   <none>
model-5e66ad4a9c330f0008f709e4-86bd9597b7-59fd9   2/2   Running   0   54m   192.168.28.1     ip-192-168-24-46.us-east-2.compute.internal   <none>   <none>
domino-build-5e67c9299c330f0008f70ad1             1/1   Running   0   3s    192.168.13.131   ip-192-168-24-46.us-east-2.compute.internal   <none>   <none>
Different types of workloads must be treated differently.
To see the details of a specific workload, run the following command:
kubectl describe po run-5e66acf26437fe0008ca1a88-f95mk -n domino-compute
The Labels section of the describe output is particularly useful for distinguishing the type of workload, as each of the workloads named run-… will have a label like dominodatalab.com/workload-type=<type of workload>.
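Rather than describing each pod individually, you can surface that label as a column for every pod at once. This is a sketch using kubectl's -L (shorthand for --label-columns) flag, filtered to the same example node:

$ kubectl get po -n domino-compute -o wide -L dominodatalab.com/workload-type | grep ip-192-168-24-46.us-east-2.compute.internal

The final column of the output will show Batch, Workspace, App, or be empty for pods (such as Model APIs) that do not carry the label.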
The previous example contains one each of the major user workloads:
run-5e66acf26437fe0008ca1a88-f95mk is a Job, with label dominodatalab.com/workload-type=Batch. It will stop running on its own once it is finished and disappear from the list of active workloads.
run-5e66ad066437fe0008ca1a8f-629p9 is a Workspace, with label dominodatalab.com/workload-type=Workspace. It will keep running until the user who launched it shuts it down. You can contact users and ask them to shut down their workspaces, wait a day or two for them to do so, or remove the node with the workspaces still running.
run-5e66b65e9c330f0008f70ab8-85f4f5f58c-m46j7 is an App, with label dominodatalab.com/workload-type=App. It is a long-running process governed by a Kubernetes Deployment. It will be recreated automatically if you destroy the node hosting it, but it will experience whatever downtime is required for a new pod to be created and scheduled on another node. See below for methods to proactively move the pod and reduce downtime.
model-5e66ad4a9c330f0008f709e4-86bd9597b7-59fd9 is a Model API. It does not have a dominodatalab.com/workload-type label, and instead is easily identifiable by the pod name. It is also a long-running process governed by a Deployment, similar to an App, with similar concerns. See below for methods to proactively move the pod and reduce downtime.
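Once the node is cordoned, one way to proactively move a Deployment-backed pod (an App or Model API) is to delete it and let the Deployment controller recreate it. Because the node is cordoned, the replacement pod will be scheduled on another node. This is a sketch using the example Model API pod from the listing above:

$ kubectl delete po model-5e66ad4a9c330f0008f709e4-86bd9597b7-59fd9 -n domino-compute
pod "model-5e66ad4a9c330f0008f709e4-86bd9597b7-59fd9" deleted

With a single-replica Deployment there is still a brief gap while the replacement pod starts, but doing this at a time of your choosing is typically less disruptive than the downtime incurred when the node is destroyed.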
domino-build-5e67c9299c330f0008f70ad1 is an Environment build. It will finish on its own and go into a Completed state.
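After you have accounted for the workloads above, you can evict anything remaining and fully clear the node with kubectl drain. This is a sketch; exact flag names vary by kubectl version (--delete-local-data was renamed --delete-emptydir-data in newer releases):

$ kubectl drain ip-192-168-24-46.us-east-2.compute.internal --ignore-daemonsets --delete-local-data

drain cordons the node if it is not already cordoned, then evicts its pods. Once the drain completes, the node hosts only daemonset pods and can be safely terminated or removed from the cluster.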