Domino 5.7.0 (August 2023)

Validated frameworks

The following versions have been validated with Domino 5.7.0. Other versions might be compatible but are not guaranteed.

New features

AutoML with Domino Code Assist

Domino Code Assist now uses AutoML to generate experiment training code automatically, helping you find the best model for your dataset. You can prototype machine learning models in a few clicks without leaving your Workspace.

Model registry (preview)

Models trained in Domino and tracked with Domino experiment management can be registered to the Domino model registry, which provides a project-scoped and deployment-scoped catalog of models. With model registry, Domino automatically creates a model card with a customizable description, lineage tracking, model deployment, and version tracking. Enable the feature flag ShortLived.ModelRegistry to view the model registry in your Domino instance.

See Manage and govern models for details.

Dataset file viewer

Now you can preview files stored in Domino datasets directly in the Data page. Click any filename in a dataset to see the file rendered in the web interface.

Restricted environments and projects

Admins can configure restricted compute environments to prevent the use of unauthorized libraries and packages for sensitive workloads that require additional oversight.

Storage threshold limits

Administrators can set global and per-user quotas on dataset storage for more control over storage-related expenses. See Set limits on dataset usage.

Customizable HTML banners

Create a custom HTML banner for every page on Domino to notify users of important announcements and information. See White labeling for more information.

FinOps (preview)

FinOps is now integrated with Domino so you can manage and monitor cloud infrastructure spend for cost savings and budgeting.

Toolkit report scheduling

Access control for model monitoring

Access to a model monitored in Domino can now be limited to explicitly added collaborators. See Configure model permissions for more details.

New connector for SAS

Now you can access Domino Data Sources from SAS Workspaces. See Access Data Sources from SAS Workspaces for more details, including a code snippet you can copy to get started.

Improvements

  • Web UI improvements:

    • In Domino Nexus deployments, the Model APIs page now displays the data plane in which each Model API resides.

    • A variety of minor UI bugs are fixed and performance is improved.

  • ImageBuilder supports Azure Workload Identity as a cloud registry authentication mechanism.

  • The manage-users Keycloak role is no longer needed to activate or deactivate users in Domino.

  • The Model Monitoring API Reference now documents previously undocumented functionality and has improved descriptions of existing functionality.

  • You can now publish a Model API with the model’s registered name and model version using the Model API Management API.

API changes

  • New Cost API endpoints let you programmatically retrieve cost data from Kubecost.

  • New Model Registry API endpoints let you programmatically register models to the model registry. This feature is a public preview.

Bug fixes

  • Upgrading Domino on Azure AFS from a version below 5.5.0 to version 5.7.0 will now succeed if the Keycloak script providers were previously configured on your instance.

  • Opening the Domino login URL in multiple browser tabs no longer causes login issues.

  • When creating Model APIs, Domino no longer runs pip install requests==2.19.0 and pip install prometheus-client==0.11.0 during the model building phase. This fixes previous issues where these commands could fail or downgrade the requests and urllib3 libraries in the image, leading to run-time errors in Model APIs. If you are deploying a new Model API, or a new version of a Model API, using a custom environment (rather than a Domino Standard Environment), see our instructions for customizing images when configuring your environment.

  • If the nucleus-dispatcher Kubernetes pod is restarted (during Domino upgrade, after restarting Nucleus services via the admin central configuration page, after the previous pod crashes, or for some other reason), then existing executions (including Workspaces and Jobs) continue to run.

  • Users can see raw files whose size is ≤ 5 MB (com.cerebro.domino.frontend.defaultMaxFileSizeToRenderInBytes) when they click on the "View Latest Raw File" button in the code file browser, even if their S3 buckets don’t have CORS enabled.

Known issues

  • The Model API Management API documentation is missing 3 new attributes for the endpoint Publish a new model. You can publish a Model API using the registered name and model version by using the following attributes:

    • modelSource: Can be either File (default) or Registry. Use Registry for models registered in MLflow.

    • registeredModelName: The name of the model registered in MLflow (required if modelSource is set to Registry).

    • registeredModelVersion: The model’s version registered in MLflow (required if modelSource is set to Registry).
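As a rough illustration of how the three attributes fit together, the Python sketch below builds only these fields and enforces the "required if modelSource is set to Registry" rule. The helper name registry_publish_fields and the example model name and version are hypothetical. The rest of the Publish a new model request body, the endpoint URL, and authentication are omitted because they are not described here, and the expected type of the version value (number or string) is an assumption left open.

```python
def registry_publish_fields(model_source="File", registered_model_name=None,
                            registered_model_version=None):
    """Return the model-source attributes for a publish request body.

    Hypothetical helper: only the three attribute names and the
    File/Registry values come from the release note above.
    """
    if model_source not in ("File", "Registry"):
        raise ValueError("modelSource must be 'File' or 'Registry'")

    fields = {"modelSource": model_source}
    if model_source == "Registry":
        # Both registry attributes are required when modelSource is Registry.
        if not registered_model_name or registered_model_version is None:
            raise ValueError(
                "registeredModelName and registeredModelVersion are required "
                "when modelSource is 'Registry'"
            )
        fields["registeredModelName"] = registered_model_name
        fields["registeredModelVersion"] = registered_model_version
    return fields


# Placeholder values, not a real registered model.
example_fields = registry_publish_fields("Registry", "churn-classifier", 3)
```

Merge the returned dictionary into the request body you already send to the Publish a new model endpoint.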

  • In Azure Blob Store deployments, projects with many files may fail to sync through the Domino CLI. To work around this issue, do not disable file locking when prompted by Domino.

  • You cannot view the latest raw file. In the navigation pane, go to Files and click a file to view its details. If you click View Latest Raw File, a blank page opens.

  • When uploading a large file to the Azure blob store by syncing a Workspace, you may encounter a Java Out of Memory error from Azure if the file/blob already exists. To work around this issue, use the Domino CLI to upload the file to the project.

  • Model Monitoring Data Sources aren’t validated. If you enter an invalid bucket name and attempt to save, the entry will go through. However, you won’t be able to see metrics for that entry because the name points to an invalid bucket.

  • Domino instances that make use of Azure Blob Storage may experience stalled Jobs within projects with many large files.

  • If you attach a Git repository to a DFS project that points to a tagged release, the tag won’t be honored when building a Model API in that project. The build log will show an error similar to the following, and the model will be built using the default branch of your Git repository instead of the tagged branch:

    Jul 05 2023 14:36:27 -0500 #10 6.481 WARN [d.r.d.GitRepoUpdater] could not parse ref: v1.3.0 checking out default branch correlationId="iA2qWrYSLQ" thread="main"

    To work around this issue, use the branch name when building Model APIs instead of the release tag.

  • If an admin resets a user’s password, it invalidates all the user’s authentication tokens, including tokens used for long-running tasks like Jobs, Workspaces, or Apps. The user must create a new password, log back into Domino, and restart all executions. This also applies to CLI authentication; the user must log in to the Domino CLI again.

  • In Domino 5.6, the cost analyzer pod (inactive unless Kubecost is enabled) defaults to a different storageClass compared to Domino 5.7. As a result, the pod won’t run after upgrading to 5.7, breaking Kubecost functionality. However, data will continue to persist in Prometheus (or custom storage if using Kubecost Enterprise).

    To prevent this issue while still on Domino 5.6, override the default storageClass, gp2, with the one expected in 5.7, dominodisk. To do so, set release_overrides.cost-analyzer.chart_values.persistentVolume.storageClass to dominodisk in the agent YAML configuration file before installing Kubecost.

    If you’ve already installed Kubecost on Domino 5.6, avoid the upgrade error by setting release_overrides.cost-analyzer.chart_values.persistentVolume.storageClass to gp2 in the agent YAML configuration file before upgrading to 5.7.
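To show where this setting sits in the agent YAML, here is a minimal Python sketch that assumes the dotted key maps to ordinary nested YAML keys. The helpers expand_dotted_key and to_yaml_lines are hypothetical and are not part of any Domino tooling. They only print the nested structure for you to compare against your own configuration file.

```python
def expand_dotted_key(dotted_key, value):
    """Turn 'a.b.c' and a value into {'a': {'b': {'c': value}}}."""
    nested = value
    for part in reversed(dotted_key.split(".")):
        nested = {part: nested}
    return nested


def to_yaml_lines(mapping, indent=0):
    """Render a nested dict of strings as indented YAML-style lines."""
    lines = []
    for key, val in mapping.items():
        if isinstance(val, dict):
            lines.append(" " * indent + f"{key}:")
            lines.extend(to_yaml_lines(val, indent + 2))
        else:
            lines.append(" " * indent + f"{key}: {val}")
    return lines


KEY = "release_overrides.cost-analyzer.chart_values.persistentVolume.storageClass"

# Use "dominodisk" before a fresh Kubecost install on 5.6,
# or "gp2" if Kubecost is already installed on 5.6.
override = expand_dotted_key(KEY, "dominodisk")
yaml_text = "\n".join(to_yaml_lines(override))
print(yaml_text)
```

If your agent YAML already contains a release_overrides section, add the nested keys to it rather than creating a second one.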

  • The button for renaming a dataset file is not available when you navigate to the dataset from the global datasets page.

    To work around this issue, navigate to the dataset from the project’s page.

  • The sample script for making asynchronous Model API requests contains an extra / at the end of the DOMINO_URL variable. As a result, running the script shows an error similar to the following:

    {'requestId': 'key not found: HandlerDef', 'errors': ['java.util.NoSuchElementException: key not found: HandlerDef']}

    To work around this issue, remove the trailing / at the end of the DOMINO_URL variable.
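One defensive way to avoid this class of error is to normalize the base URL before appending a path, so that a trailing / in DOMINO_URL cannot produce a double slash. The sketch below is illustrative only. The helper build_url, the host name, and the request path are placeholders, and the sample script itself is not reproduced here.

```python
def build_url(domino_url, path):
    """Join a base URL and a path with exactly one slash between them."""
    return domino_url.rstrip("/") + "/" + path.lstrip("/")


# Placeholder values; substitute your own deployment URL and request path.
DOMINO_URL = "https://domino.example.com/"
url = build_url(DOMINO_URL, "/example/path")
```

With this helper, the result is the same whether or not DOMINO_URL ends with a slash.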

  • Links to Stack Trace and CPU Flame Graph from the Cluster tab in the Ray Cluster UI don’t work due to an upstream issue with Ray not supporting these links when hosted behind a reverse proxy. This issue is isolated to the Cluster tab, and these links work from other tabs in the same dashboard. The issue is tracked in the upstream Ray project and will likely be resolved in a future release of Ray.

  • If SSO authentication is enabled but external role synchronization is disabled, a user’s roles can be lost when the user logs in through SSO.

  • When creating new users through the Add User button in the Admin view, the user’s email address is used as the username. This creates errors when trying to view that user’s projects because the username isn’t encoded in the URL.

    To work around this issue, always use Keycloak to manage users.

  • Data values returned from querying the following Data Sources are returned only as byte arrays. This issue was fixed in Domino 5.7.2.

    • MongoDB

    • Palantir

    • Tabular S3 with AWS Glue

    • Teradata

    • Trino

    Programmatic type casting is necessary to convert values to the desired types.
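A minimal sketch of such casting in Python, assuming each affected value arrives as a bytes object holding UTF-8 text. The actual encoding depends on the Data Source and is not stated here. The helper cast_bytes, the sample row, and the column types are hypothetical.

```python
def cast_bytes(value, target_type=str, encoding="utf-8"):
    """Decode a bytes value and convert it to target_type.

    Values that are not bytes-like are returned unchanged.
    """
    if isinstance(value, (bytes, bytearray)):
        text = bytes(value).decode(encoding)
        return text if target_type is str else target_type(text)
    return value


# Hypothetical row as it might come back from an affected Data Source.
raw_row = {"id": b"42", "score": b"0.75", "name": b"alice"}
column_types = {"id": int, "score": float, "name": str}

typed_row = {col: cast_bytes(val, column_types[col]) for col, val in raw_row.items()}
```

If the values use a binary encoding rather than text, replace the decode step with the parsing appropriate to that format.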

  • The Status, Active Version, and Owner columns do not appear in the Model API list. This issue is fixed in Domino 5.8.0.

  • Deleting all R variables from memory using rm(list = ls(al = TRUE)) also deletes variables that Domino uses for internal processes. To safely delete variables, use rm(list = ls(all = TRUE)[!grepl("^.domino", ls(all = TRUE))]) instead.

  • When restarting a Workspace through the Update Settings modal, External Data Volumes are not mounted in the new Workspace. Follow the steps to mount External Data Volumes. This issue is fixed in Domino 5.9.0.

  • Downloading single files from Datasets will fail if the filename contains special characters, including + and &. As a workaround, remove the mentioned special characters by renaming the file. This issue is fixed in Domino 5.10.0.
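If many files are affected, you could compute the new names in bulk before renaming. The Python sketch below only derives a cleaned filename and does not perform the rename in the dataset. The helper strip_special_chars and the sample filename are hypothetical, and the default character set covers only the two characters named above, so extend it if other characters cause failures.

```python
def strip_special_chars(filename, chars="+&"):
    """Return filename with every character listed in chars removed."""
    return "".join(ch for ch in filename if ch not in chars)


# Sample filename for illustration only.
new_name = strip_special_chars("sales+returns&q3.csv")
```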

  • Spaces in ADLS filenames are not allowed when getting and putting objects in Azure Data Sources with DominoDataR. As a workaround, upgrade to DominoDataR version 0.2.4. This issue is fixed in Domino 5.10.0.

  • Viewing dataset files in an Azure-based Domino cluster may lock files, preventing them from being deleted or modified. Restarting Nucleus frontend pods will release the lock. This issue is fixed in Domino 5.11.1.

Upgrade notes

  • GKE users who provisioned their infrastructure with Domino’s terraform-gcp-gke module must apply the changes introduced for 5.7.0 as of terraform-gcp-gke v2.5.0 when upgrading to ensure firewall rules work properly.

  • VPN support from within executions is now disabled by default. To enable it, set the global config value com.cerebro.domino.computegrid.executions.allowVpn = true.

  • MongoDB is no longer the authoritative source of truth for User Roles; Keycloak is. User Groups in Keycloak now correspond to Domino Global Roles, and a user’s membership in these groups defines their Domino roles. The Central Config key authentication.oidc.externalRolesEnabled has been retired and no longer has any effect. Any edits made to roles in MongoDB will be overridden by the data from Keycloak.

  • Domino CLI clients version 1.x (released in 2017 or earlier) are no longer supported. Upgrade to Domino CLI version 6.0.