There might be times when you have to remove a specific node (or multiple nodes) from service, either temporarily or permanently. This might include cases of troubleshooting nodes that are in a bad state, or retiring nodes after an update to the AMI so that all nodes are using the new AMI.
This topic describes how to temporarily prevent new workloads from being assigned to a node, as well as how to safely remove workloads from a node so that it can be permanently retired.
The kubectl cordon <node> command prevents additional pods from being scheduled onto the node, without disrupting any of the pods currently running on it.
For example, let’s say a new node in your cluster has come up with some problems, and you want to cordon it before launching any new runs to ensure they will not land on that node.
The procedure might look like this:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-192-168-0-221.us-east-2.compute.internal Ready <none> 12d v1.14.7-eks-1861c5
ip-192-168-17-8.us-east-2.compute.internal Ready <none> 12d v1.14.7-eks-1861c5
ip-192-168-24-46.us-east-2.compute.internal Ready <none> 51m v1.14.7-eks-1861c5
ip-192-168-3-110.us-east-2.compute.internal Ready <none> 12d v1.14.7-eks-1861c5
$ kubectl cordon ip-192-168-24-46.us-east-2.compute.internal
node/ip-192-168-24-46.us-east-2.compute.internal cordoned
$ kubectl get no
NAME STATUS ROLES AGE VERSION
ip-192-168-0-221.us-east-2.compute.internal Ready <none> 12d v1.14.7-eks-1861c5
ip-192-168-17-8.us-east-2.compute.internal Ready <none> 12d v1.14.7-eks-1861c5
ip-192-168-24-46.us-east-2.compute.internal Ready,SchedulingDisabled <none> 53m v1.14.7-eks-1861c5
ip-192-168-3-110.us-east-2.compute.internal Ready <none> 12d v1.14.7-eks-1861c5
Notice the SchedulingDisabled status on the cordoned node.
You can undo this and return the node to service with the command:
kubectl uncordon <node>
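For example, returning the node cordoned above to service would look something like this:

$ kubectl uncordon ip-192-168-24-46.us-east-2.compute.internal
node/ip-192-168-24-46.us-east-2.compute.internal uncordoned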
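If you need to take several nodes out of service at once, for example when retiring all nodes running an old AMI, kubectl cordon also accepts a label selector in place of a node name. A minimal sketch, assuming the outgoing nodes share a hypothetical node-group=retiring label:

$ kubectl cordon -l node-group=retiring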
Identify user workloads
Before removing a node from service permanently, you must ensure there are no workloads still running on it that must not be disrupted. For example, you might see the following workloads running on a node (notice the -n flag, which specifies the compute namespace, and the -o wide flag, which includes the node hosting each pod in the output):

$ kubectl get po -n domino-compute -o wide | grep ip-192-168-24-46.us-east-2.compute.internal
run-5e66acf26437fe0008ca1a88-f95mk                2/2   Running   0   23m   192.168.4.206    ip-192-168-24-46.us-east-2.compute.internal   <none>   <none>
run-5e66ad066437fe0008ca1a8f-629p9                3/3   Running   0   24m   192.168.28.87    ip-192-168-24-46.us-east-2.compute.internal   <none>   <none>
run-5e66b65e9c330f0008f70ab8-85f4f5f58c-m46j7     3/3   Running   0   51m   192.168.23.128   ip-192-168-24-46.us-east-2.compute.internal   <none>   <none>
model-5e66ad4a9c330f0008f709e4-86bd9597b7-59fd9   2/2   Running   0   54m   192.168.28.1     ip-192-168-24-46.us-east-2.compute.internal   <none>   <none>
domino-build-5e67c9299c330f0008f70ad1             1/1   Running   0   3s    192.168.13.131   ip-192-168-24-46.us-east-2.compute.internal   <none>   <none>
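As an alternative to filtering with grep, kubectl can select pods by node name directly on the server side. A minimal sketch of the equivalent query, assuming the same node and namespace:

$ kubectl get po -n domino-compute -o wide --field-selector spec.nodeName=ip-192-168-24-46.us-east-2.compute.internal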
Different types of workloads must be treated differently.
To see the details of a specific workload, run the following command:
kubectl describe po run-5e66acf26437fe0008ca1a88-f95mk -n domino-compute
The labels section of the describe output is particularly useful for distinguishing the type of workload, as each of the workloads named run-… will have a label like dominodatalab.com/workload-type=<type of workload> (see the sketch after the list below for a one-command view of this label across all pods). The previous example contains one each of the major user workloads:
- run-5e66acf26437fe0008ca1a88-f95mk is a Job, with label dominodatalab.com/workload-type=Batch. It will stop running on its own once it is finished and disappear from the list of active workloads.

- run-5e66ad066437fe0008ca1a8f-629p9 is a Workspace, with label dominodatalab.com/workload-type=Workspace. It will keep running until the user who launched it shuts it down. You can contact users and ask them to shut down their workspaces, wait a day or two for them to do so, or remove the node with the workspaces still running.

  Caution: The last option is not recommended unless you are certain there is no un-synced work in any of the workspaces and you have communicated the interruption to the users.

- run-5e66b65e9c330f0008f70ab8-85f4f5f58c-m46j7 is an App, with label dominodatalab.com/workload-type=App. It is a long-running process governed by a Kubernetes deployment. It will be recreated automatically if you destroy the node hosting it, but it will experience whatever downtime is required for a new pod to be created and scheduled on another node. See below for methods to proactively move the pod and reduce downtime.

- model-5e66ad4a9c330f0008f709e4-86bd9597b7-59fd9 is a Model API. It does not have a dominodatalab.com/workload-type label, and is instead easily identifiable by the pod name. It is also a long-running process, similar to an App, with similar concerns. See below for methods to proactively move the pod and reduce downtime.

- domino-build-5e67c9299c330f0008f70ad1 is an Environment. It will finish on its own and go into a Completed state.
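Rather than describing pods one at a time, you can also ask kubectl to print the workload-type label as an extra column for every pod in the namespace, using the -L (--label-columns) flag. A minimal sketch:

$ kubectl get po -n domino-compute -L dominodatalab.com/workload-type

Pods without the label, such as Model APIs, will simply show an empty column.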