Builds of new compute environments and updates to existing ones are carried out by the Hephaestus build pods in the domino-compute namespace. The first troubleshooting steps are to check the status of the build pods and to describe them to look for issues.
<user-id>$ kubectl get pods -n domino-compute | grep -i build
hephaestus-buildkit-0 1/1 Running 0 2m59s
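The status check above can be followed by describing each build pod. This is a minimal sketch, assuming kubectl is already configured for the cluster. The helper names list_build_pods and describe_build_pods are illustrative and not part of Domino.

```shell
# Hypothetical helpers; assumes kubectl points at the Domino cluster.
NAMESPACE="domino-compute"

# Print the names of pods whose listing line mentions "build"
# (same filter as the grep in the command above).
list_build_pods() {
  kubectl get pods -n "$NAMESPACE" --no-headers | grep -i build | awk '{print $1}'
}

# Describe each build pod; the Events section at the end of the
# output is usually where scheduling or image problems show up.
describe_build_pods() {
  for pod in $(list_build_pods); do
    kubectl describe pod "$pod" -n "$NAMESPACE"
  done
}
```

Calling describe_build_pods after sourcing the snippet prints one description per build pod.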
The following shows a failure of a compute environment revision.
The build logs will give you the reason for the failure.
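One possible way to pull recent log output from a build pod on the command line is sketched below. It assumes the relevant messages appear in the pod's container logs, which the text above does not confirm. The helper name build_pod_logs and the default of 200 lines are illustrative choices.

```shell
# Hypothetical helper; assumes kubectl is configured for the cluster.
# Usage: build_pod_logs <pod-name> [tail-lines]
build_pod_logs() {
  pod="$1"
  lines="${2:-200}"   # default to the last 200 lines if no count is given
  kubectl logs "$pod" -n domino-compute --tail="$lines"
}
```

For example, build_pod_logs hephaestus-buildkit-0 50 would request the last 50 log lines of the pod shown earlier.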
The following sections provide useful steps to troubleshoot:
- Basic Domino health
- Connectivity and latency
- Workspaces and Jobs issues
- Model API issues
- Distributed model monitoring issues
- Data sources issues