See also the fleetcommand-agent Release Notes.
The following versions have been validated with Domino 5.8.0. Other versions might be compatible but are not guaranteed.
-
Kubernetes - see the Kubernetes compatibility chart
-
Ray - 2.4.0
-
Spark - 3.4.0
-
Dask - 2023.6.0
-
MPI - 4.1.4
AI Hub (preview)
The new AI Hub lets you quickly build AI applications from prebuilt, enterprise-ready solutions curated from the best of open source. You and your team can discover and reuse templates for several common ML use cases and industry-specific patterns, with access to best practices and “art of the possible” inspiration with Domino.
FinOps GA
Gain insight and control over your cloud infrastructure costs with FinOps.
-
Track - FinOps can integrate directly with cloud provider billing APIs, so you can track your actual cloud bills, including any discounts or special agreements you may have.
-
Aggregate - Group costs by important dimensions like user, project, organization, and hardware tiers to identify cost contributors and allocate expenses.
-
Control - Set budgets and alerts for Projects and Organizations to proactively control costs.
Feature store GA
Domino’s feature store is now Generally Available (GA). The centralized feature store lets you store, catalog, search, share, and re-use features across your organization, enabling you to develop and deploy your models faster and more consistently. Access features in both the online store for real-time inferencing, and in the offline store for model training and batch inferencing.
Model registry and governance updates
You now have more flexibility and tighter control over model governance review processes.
-
Custom review stages - Define custom model review stages to mold the review process to your needs - add stages like "Pre-production" and "Staging-2" to align with your development workflows.
-
Model discoverability settings - Decide who in your organization gets to see your models and protect sensitive information with enhanced discoverability settings.
-
Model activity log - Audit model review activity with a new log containing model stage transitions, review requests, and review responses.
Data Source support for Databricks, Trino, and Starburst JDBC
Access more of your data with new Data Source connectors for the following sources:
Connect to any Starburst-supported JDBC data entities including:
-
ClickHouse
-
Druid
-
Greenplum
-
MariaDB
-
Ignite
-
SingleStore (MemSQL)
-
Synapse
-
Vertica
-
Generic JDBC connectivity
-
It’s now easier than ever to see who accessed Data Sources and when they accessed them with audit log information now available from the model card, Workspaces, Jobs, and Experiments results pages.
-
Enhanced Grafana alerting now includes real-time alerts with links to troubleshooting runbooks to improve your MTTR (Mean Time to Resolution).
-
Real-time health notifications: Receive real-time alerts from the cluster at the onset of any anomaly or issue for even faster response times.
-
Guided troubleshooting runbooks: Alerts now come with a detailed runbook offering step-by-step solutions to quickly resolve issues.
-
Revamped Grafana dashboards: Overhauled the Grafana dashboard experience for a clearer view into the health of your Domino.
-
Experiment manager usability and quality of life improvements:
-
MLflow upgraded to version 2.6.0 (previously on 2.3.2).
-
Improved charting experience, including persisting customizations on the user level.
-
Search for parameters and metrics in the column selector.
-
Edit experiments and run names in the UI.
-
Archive and restore experiments in the UI.
-
Run details now include lineage to the registered model and data used.
-
You can now configure Zendesk support for your Domino instance during installation. See Configuration reference for more information.
-
The terraform-aws-eks module has undergone usability and stability enhancements. It now sets up infrastructure, cluster, and nodes independently, each with its own state, for more precise control, lower risk of disruptions, and faster iteration. We’ve also introduced a script to manage multiple modules, simplifying Terraform commands across sectors.
-
Keycloak pod replicas no longer share jar files as a stability enhancement. Since sharing the same file with multiple Keycloak replicas could lead to unexpected behavior, you must now upload jar files with custom JavaScript providers to each Keycloak pod replica individually. See Single sign-on (SSO) configuration for more information.
-
Hardware tier dropdown is now sorted alphabetically. It also includes keyword searching capability and provides information about the number of GPUs when relevant.
-
You can now hide the Reset Password button in the Admin user management page by setting the Central Configuration key com.cerebro.domino.userManagement.passwordResetEnabled to false.
Non-root executions
You can now run user executions in Kubernetes without root privileges for increased security in Domino. To enable non-root executions, an admin should set com.cerebro.domino.computegrid.kubernetes.nonRootExecutions.enabled to true in Central Configuration. Admins of Domino instances using non-root executions should be aware of the following differences compared to standard instances:
-
Environment images have different requirements. See Manually create an Environment with a pre-built image for more details.
-
Privileged capabilities have been removed from containers and the execution pod spec.
-
The Domino user has been removed from the sudoers file; therefore, the sudo command can no longer be used in Workspaces. Domino recommends building environment images with the necessary packages instead of installing them at runtime for better reproducibility and increased security.
-
Mounted volumes are configured to use the well-known non-root gid 12574, via Kubernetes fsGroup.
-
The reverse-proxy container inside the run pod listens on port 8765 instead of port 80.
-
Due to the non-root changes, the keys com.cerebro.domino.computegrid.dominoGroupId and com.cerebro.domino.computegrid.dominoUserId have been deprecated and removed from Central Configuration.
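To make the non-root differences concrete, here is a minimal Python sketch that could be run inside an execution to report its identity. It relies on the general Kubernetes behavior that an fsGroup value is added to the supplemental groups of container processes. The function name and the returned keys are illustrative assumptions, not part of Domino.

```python
import os

# gid quoted in the notes above for mounted volumes in non-root executions.
NON_ROOT_VOLUME_GID = 12574


def describe_identity(uid, groups):
    """Summarize whether a process runs as root and holds the volume group.

    The dictionary keys are hypothetical names chosen for this sketch.
    """
    return {
        "runs_as_root": uid == 0,
        "has_volume_group": NON_ROOT_VOLUME_GID in groups,
    }


if __name__ == "__main__":
    # os.getuid() and os.getgroups() are available on Unix-like systems only.
    print(describe_identity(os.getuid(), os.getgroups()))
```

In a non-root execution you would expect runs_as_root to be False. Whether the group appears in the list depends on how the pod is configured.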
-
You can now programmatically set your Kubecost License key with the new Kubecost License Key API endpoint.
-
The paths under /u/{ownerUsername}/{projectName}/scheduledruns, which were deprecated in Domino 5.5.0, have been removed.
-
You can now see the Status, Active Version, and Owner columns in the Model API list.
-
App authors can now pass their Domino username in the header for Apps.
-
S3 buckets must have CORS enabled to use the "View Latest Raw File" button in the code file browser if the file is larger than 5 MB (com.cerebro.domino.frontend.defaultMaxFileSizeToRenderInBytes). As a workaround, use the Download button to download larger files and view them on your computer.
-
In Azure Blob Store deployments, projects with many files may fail to sync through the Domino CLI. To work around this issue, do not disable file locking when prompted by Domino.
-
Clicking View Latest Raw File does not display the latest raw file. As a workaround, go to Files in the navigation pane and click a file to view its details.
-
When uploading a large file to the Azure blob store by syncing a Workspace, you may encounter a Java Out of Memory error from Azure if the file/blob already exists. To work around this issue, use the Domino CLI to upload the file to the project.
-
Model Monitoring data sources aren’t validated. If you enter an invalid bucket name and attempt to save, the entry will go through. However, you won’t be able to see metrics for that entry because the name points to an invalid bucket.
-
Domino instances that make use of Azure Blob Storage may experience stalled Jobs within projects with many large files.
-
If you attach a Git repository to a DFS project that points to a tagged release, the tag won’t be honored when building a Model API in that project. The build log will show an error similar to the following, and the model will be built using the default branch of your Git repository instead of the tagged branch:
Jul 05 2023 14:36:27 -0500 #10 6.481 WARN [d.r.d.GitRepoUpdater] could not parse ref: v1.3.0 checking out default branch correlationId="iA2qWrYSLQ" thread="main"
To work around this issue, use the branch name when building Model APIs instead of the release tag.
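If you want to check a saved build log for this fallback, a small Python sketch like the following can pull out the affected refs. The pattern is derived only from the single log line quoted above, so treat it as an assumption about the message format; the function name is hypothetical.

```python
import re
from typing import List

# Pattern based on the warning text shown in the build log excerpt above.
REF_WARNING = re.compile(r"could not parse ref: (\S+) checking out default branch")


def find_ignored_refs(build_log: str) -> List[str]:
    """Return every Git ref the build could not parse, in log order."""
    return REF_WARNING.findall(build_log)
```

An empty list means the log contains no such warning in that format. It does not prove the tag was honored.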
-
If an admin resets a user’s password, it invalidates all the user’s authentication tokens, including tokens used for long-running tasks like Jobs, Workspaces, or Apps. The user must create a new password, log back into Domino, and restart all executions. This also applies to CLI authentication; the user must re-login to their Domino CLI.
-
In Domino 5.6, the cost analyzer pod (inactive unless Kubecost is enabled) defaults to a different storageClass compared to Domino 5.7. As a result, the pod won’t run after upgrading to 5.7, breaking Kubecost functionality. However, data will continue to persist in Prometheus (or custom storage if using Kubecost Enterprise).
To prevent this issue while still in Domino 5.6, override the default storageClass gp2 with the one expected in 5.7, dominodisk, during Kubecost installation by setting release_overrides.cost-analyzer.chart_values.persistentVolume.storageClass to dominodisk in the agent YAML configuration file before installing Kubecost.
If you’ve already installed Kubecost on Domino 5.6, avoid the upgrade error by setting release_overrides.cost-analyzer.chart_values.persistentVolume.storageClass to gp2 in the agent YAML configuration file before upgrading to 5.7.
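The dotted key above is easier to place in the agent YAML once it is expanded into nested keys. The Python sketch below does that expansion, assuming each dot-separated segment corresponds to one level of nesting in the file. The helper name is hypothetical, and the example uses gp2, the value for an existing Kubecost installation on 5.6.

```python
def expand_dotted_key(dotted_key, value):
    """Turn 'a.b.c' plus a value into {'a': {'b': {'c': value}}}."""
    nested = value
    # Wrap from the innermost segment outward.
    for part in reversed(dotted_key.split(".")):
        nested = {part: nested}
    return nested


KEY = "release_overrides.cost-analyzer.chart_values.persistentVolume.storageClass"

# Use "dominodisk" instead when setting the value before a fresh Kubecost install.
override = expand_dotted_key(KEY, "gp2")
print(override)
```

The resulting structure still has to be merged by hand into the existing agent configuration file; this sketch only shows the shape of the override.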
-
The button for renaming a dataset’s file is not available when the user navigates to the dataset from the global dataset page.
To work around this issue, navigate to the dataset from the project’s page.
-
The sample script for making asynchronous Model API requests contains an extra / at the end of the DOMINO_URL variable. As a result, running the script will show an error similar to the following:
{'requestId': 'key not found: HandlerDef', 'errors': ['java.util.NoSuchElementException: key not found: HandlerDef']}
To work around this issue, remove the trailing / at the end of the DOMINO_URL variable.
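One defensive way to apply this workaround is to normalize the base URL before building request paths, so a trailing slash never produces a double slash. The sketch below is a minimal Python illustration. The host name and the path are placeholders, not real Domino endpoints, and the helper name is hypothetical.

```python
def join_url(base_url, path):
    """Join a base URL and a path with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


# Placeholder values for illustration only.
DOMINO_URL = "https://domino.example.com/"
print(join_url(DOMINO_URL, "/example/path"))
```

With this helper, the request URL is the same whether or not DOMINO_URL ends with a slash.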
-
The Jobs REST API uses GitRefV1 to reference Git objects (commits, branches, and tags). Not all examples in the API spec worked, so they’ve been updated to reflect the actual valid values. This change doesn’t affect API functionality; it’s just a fix to the documentation.
-
Links to Stack Trace and CPU Flame Graph in the Ray Cluster UI’s Cluster tab are broken due to an issue in Ray 2.4 not supporting links when hosted behind a reverse proxy. This problem is specific to the Cluster tab; links correctly function in other tabs. The issue is fixed in Ray 2.7 and will be updated in future Domino Ray image releases.
-
The Model API Management API documentation is missing 3 new attributes for the endpoint Publish a new model. You can publish a Model API using the registered name and model version by using the following attributes:
-
modelSource: Can be either File (default) or Registry. Use Registry for models registered in MLflow.
-
registeredModelName: The name of the model registered in MLflow (required if modelSource is set to Registry).
-
registeredModelVersion: The model’s version registered in MLflow (required if modelSource is set to Registry).
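The rules for these three attributes can be checked before sending a publish request. The following Python sketch validates only the fields described above. It makes no assumptions about the rest of the request body or the endpoint URL, and the function name is hypothetical.

```python
def validate_model_source(fields):
    """Check the modelSource rules described above; raise ValueError if broken."""
    # File is the documented default when modelSource is omitted.
    source = fields.get("modelSource", "File")
    if source not in ("File", "Registry"):
        raise ValueError("modelSource must be 'File' or 'Registry'")
    if source == "Registry":
        required = ("registeredModelName", "registeredModelVersion")
        missing = [key for key in required if fields.get(key) in (None, "")]
        if missing:
            raise ValueError("missing for Registry source: " + ", ".join(missing))
    return source


# Example values are placeholders for illustration.
registry_fields = {
    "modelSource": "Registry",
    "registeredModelName": "example-model",
    "registeredModelVersion": "1",
}
print(validate_model_source(registry_fields))
```

The version is passed as a string here purely for illustration; the notes above do not state the expected type.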
-
Deleting all R variables from memory using rm(list = ls(all = TRUE)) also deletes variables that Domino uses for internal processes. To safely delete variables, use rm(list = ls(all = TRUE)[!grepl("^.domino", ls(all = TRUE))]) instead.
-
When restarting a Workspace through the Update Settings modal, External Data Volumes are not mounted in the new Workspace. Follow the steps to mount External Data Volumes. This issue is fixed in Domino 5.9.0.
-
Downloading single files from Datasets using the Download Selected Items button will fail if the filename contains special characters, including + and &. As a workaround, you can download these types of files via the action menu, located to the right of the filename. This issue is fixed in Domino 5.10.0.
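To spot files that are likely to hit this issue, you could screen filenames ahead of time. The Python sketch below checks only for the two characters named above. The full set of affected characters is not listed in these notes, so the set here is an assumption and may be incomplete; the names are hypothetical.

```python
# Only the two characters named in the note above; other characters may also be affected.
KNOWN_PROBLEM_CHARS = {"+", "&"}


def needs_action_menu(filename):
    """Return True if the filename contains a character known to break the download."""
    return any(char in KNOWN_PROBLEM_CHARS for char in filename)


for name in ["report.csv", "q1+q2.csv", "sales&returns.csv"]:
    print(name, needs_action_menu(name))
```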
-
Anonymous users cannot run launchers or view public GBP projects due to the Git credentials migration to Vault. This issue is fixed in Domino 5.9.1.
-
Spaces in ADLS filenames are not allowed when getting and putting objects in Azure Data Sources with DominoDataR. As a workaround, upgrade to DominoDataR version 0.2.4. This issue is fixed in Domino 5.10.0.
-
Viewing dataset files in an Azure-based Domino cluster may lock files, preventing them from being deleted or modified. Restarting Nucleus frontend pods will release the lock. This issue is fixed in Domino 5.11.1.
-
GKE users that provisioned their infrastructure with Domino’s terraform-gcp-gke module must apply the changes introduced for 5.7.0 as of terraform-gcp-gke v2.5.0 when upgrading to ensure firewall rules work properly.
-
VPN support from within executions is now disabled by default. Support can be enabled by setting the global config value com.cerebro.domino.computegrid.executions.allowVpn = true.
-
MongoDB is no longer the authoritative source of truth for User Roles. Keycloak has taken over the role. User Groups in Keycloak now correspond to Domino Global Roles, and a user’s membership status in these groups defines their Domino roles. The Central Config key authentication.oidc.externalRolesEnabled has been retired and no longer has any effect. Any edits made to roles in MongoDB will be overridden by the data from Keycloak.
-
EKS users should update the AWS VPC CNI settings to enable ANNOTATE_POD_IP to prevent execution timeout errors when an image pull takes longer than 10 minutes. To bypass the validation check during an upgrade, pass --warn-only as a command line option to the installer.