Converting legacy Data Sets to Domino Datasets
Overview
This article describes how to convert legacy Data Set workflows to use Domino Datasets. This is a two-step process: first move your data into a new Domino Dataset, then update all projects and artifacts that consume the data to retrieve it from the new location.
Migrating data from a legacy Data Set into a Domino Dataset
Legacy Data Sets are semantically similar to Domino Projects. If your deployment is running a version of Domino with the new Domino Datasets feature, you can create Domino Datasets inside legacy Data Sets. This will allow for a very simple migration path for a legacy Data Set, where all of the existing data is added to a single Domino Dataset owned by the legacy Data Set, and the entire file structure is preserved.
The long term deprecation plan for legacy Data Sets is to transform them into ordinary Domino Projects, which will continue to contain and share any Domino Datasets you created in them.
To get started, you need to add a script to the contents of your legacy Data Set that can transfer all of your data into a Domino Dataset output mount. From the Files page of your legacy Data Set, click Add File:
Name the file migrate.sh, and paste in the example command provided below.
cp -R $DOMINO_WORKING_DIR/. /domino/datasets/output/main
This example migration script copies the contents of $DOMINO_WORKING_DIR, which is a default Domino environment variable that always points to the root of your project, to a Domino Dataset output mount path. The directory named main in the path above is derived from the name of the Domino Dataset that will be created to store the files from this legacy Data Set.
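If you want the script to fail loudly when the output mount is missing, the one-line command could be wrapped as in the sketch below. The function name migrate_to_dataset and the existence check are illustrative additions, not part of Domino; only the cp command and the two paths come from this article.

```shell
#!/usr/bin/env bash
# Sketch of a more defensive migrate.sh.
# migrate_to_dataset is a hypothetical helper name, not a Domino command.

migrate_to_dataset() {
  local src="$1" dest="$2"
  # Stop early if the output mount is missing, so that cp does not
  # silently write into an ordinary directory instead of the Dataset.
  if [ ! -d "$dest" ]; then
    echo "Output directory not found: $dest" >&2
    return 1
  fi
  # The trailing "/." copies the contents of src, including dotfiles,
  # rather than nesting src itself inside dest.
  cp -R "$src/." "$dest"
}

# Run the copy only when the Domino variable is set (for example, in a Job).
if [ -n "${DOMINO_WORKING_DIR:-}" ]; then
  migrate_to_dataset "$DOMINO_WORKING_DIR" /domino/datasets/output/main
fi
```

If you rename the Dataset, change the last path segment (main) to match.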
Click Save when finished. Your script should look like this:
Next, click Datasets from the project menu, then click Create New Dataset.
Be sure to name this Dataset to match the path to the output mount in the migration script. If you copied the command above and added it to your script without modification, you should name this Dataset main.
You can supply an optional description, then click Upload Contents.
On the upload page, click to expand the Create by Running Script section.
Double-check to make sure the listed Output Directory matches the path from your migration script, then enter the name of your script and click Start. A Job will be launched that mounts the new Dataset for output and executes your script. If the Job finishes successfully, you can return to the Datasets page from the project menu and click the name of your new Dataset to see its contents.
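In addition to browsing the Dataset contents, you could append a check to the end of the migration script so that the Job fails if the copy is incomplete. The following is a minimal sketch that compares the two directory trees with diff; verify_copy is a hypothetical helper name, and the check assumes nothing else writes to the project root while the Job runs.

```shell
#!/usr/bin/env bash
# Sketch of a post-copy check. verify_copy is a hypothetical helper name.

verify_copy() {
  local src="$1" dest="$2"
  # diff -r walks both trees recursively; -q reports only which files
  # differ or are missing instead of printing their contents.
  if diff -rq "$src" "$dest"; then
    echo "OK: $dest matches $src"
  else
    echo "Mismatch between $src and $dest" >&2
    return 1
  fi
}

# Run the check only when the Domino variable is set (for example, in a Job).
if [ -n "${DOMINO_WORKING_DIR:-}" ]; then
  verify_copy "$DOMINO_WORKING_DIR" /domino/datasets/output/main
fi
```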
You now have all of the data from your legacy Data Set loaded into a Domino Dataset. This method preserves the file structure of the legacy Data Set, which is useful for the next step: updating consumers to use the new Dataset.
Updating data consumers to use the new Domino Dataset
Potential consumers of your legacy Data Set are those users to whom you granted Project Importer, Results Consumer, or Contributor permissions. As the project Owner, you also may have other projects consuming the contents of your legacy Data Set. This same set of permissions will grant access to your new Domino Dataset.
A project consuming data from your legacy Data Set will import it as a project dependency, and it will be visible on the Other Projects tab of the Files page.
In the example above, the global-power project imports the data-quick-start legacy Data Set. The contents of data-quick-start are then available in global-power Runs and Workspaces at the path shown in the Location column. Any code for batch Runs, scheduled Runs, or Apps that refers to that path must be updated to point to the new Domino Dataset.
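One way to locate those references is to search the consuming project's files for the old path. The sketch below uses grep; find_path_refs is a hypothetical helper name, and the path in the usage comment is a placeholder for whatever the Location column shows in your project.

```shell
#!/usr/bin/env bash
# Sketch: list every line in a directory tree that mentions a given path.
# find_path_refs is a hypothetical helper name.

find_path_refs() {
  local old_path="$1" code_dir="$2"
  # -r  search the directory recursively
  # -n  print the line number of each match
  # -F  treat the path as a literal string, not a regular expression
  grep -rnF -- "$old_path" "$code_dir"
}

# Example usage (placeholder path; substitute the value from the Location column):
# find_path_refs "/path/from/location/column" "$DOMINO_WORKING_DIR"
```

Each match is printed as file:line:content, and the function returns a non-zero status when nothing matches.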
To determine the new path and set up access to the Domino Dataset, you need to mount the Dataset. With the consuming project open, click Datasets from the project menu, then click Mount Shared Dataset. The Dataset to Mount field is a dropdown menu that will show shared Datasets you have access to. In the above example, the main Dataset from the data-quick-start project will be mounted at the latest snapshot. Select the Dataset that you migrated your data into earlier, then click Mount.
When finished, you will see the Dataset you added listed under Shared Datasets. The Path column shows the path where the contents of the Dataset will be mounted in this project’s Runs and Workspaces.
Remember that if you used the migration script shown earlier, the file structure at that path will be identical to the file structure of the imported legacy Data Set location. All you need to do to access the same data is change the path to this new Domino Dataset mount.
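Because the file structure is unchanged, the update amounts to swapping one path prefix for another. The sketch below does this for a single file with sed; rewrite_path is a hypothetical helper name, both paths in the usage comment are placeholders, and the function assumes neither path contains the characters |, & or \ (which are special in the sed expression used here).

```shell
#!/usr/bin/env bash
# Sketch: replace an old path prefix with a new one inside a file.
# rewrite_path is a hypothetical helper name.

rewrite_path() {
  local old_path="$1" new_path="$2" file="$3"
  local tmp status=0
  tmp="$(mktemp)" || return 1
  # "|" is used as the sed delimiter so slashes in the paths need no escaping.
  # Writing back with cat (not mv) keeps the original file's permissions.
  sed "s|$old_path|$new_path|g" "$file" > "$tmp" && cat "$tmp" > "$file" || status=$?
  rm -f "$tmp"
  return "$status"
}

# Example usage (placeholder paths; take the real values from the
# Location column and from the Path column under Shared Datasets):
# rewrite_path "/old/import/location" "/new/dataset/mount" train.py
```

Review the changes before committing them, since the old path string might also appear in comments or documentation that you want to keep.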
Be sure to contact other users who are consuming your legacy Data Set and provide them with information about the new Domino Dataset.