Manage Domino Datasets

When users work with large quantities of data, such as collections of many files or large individual files, Domino recommends importing that data into a Domino Dataset.

How data is stored in Domino Datasets

Datasets are collections of Snapshots, where each Snapshot is an immutable image of a filesystem directory from the time when the Snapshot was created. These directories are stored in a network filesystem managed by Kubernetes as a shared Persistent Volume.
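
The relationship between a Dataset and its Snapshots can be pictured with a small in-memory model. This is only an illustrative sketch: the `Dataset` and `Snapshot` classes, the `take_snapshot` method, and the file names are hypothetical and are not part of any Domino API. A real Snapshot is a directory on the shared volume, not a Python object.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical model: a read-only image of a directory at creation time."""
    created_at: datetime
    files: Mapping[str, bytes]  # relative path -> file contents


@dataclass
class Dataset:
    """Hypothetical model: a named collection of Snapshots."""
    name: str
    snapshots: list = field(default_factory=list)

    def take_snapshot(self, working_dir: dict) -> Snapshot:
        # Copy the directory state so later edits to working_dir
        # cannot change the Snapshot, then wrap it in a read-only view.
        frozen_files = MappingProxyType(dict(working_dir))
        snap = Snapshot(datetime.now(timezone.utc), frozen_files)
        self.snapshots.append(snap)
        return snap


working_dir = {"train.csv": b"a,b\n1,2\n"}
dataset = Dataset("sales-data")
first = dataset.take_snapshot(working_dir)

working_dir["train.csv"] = b"a,b\n3,4\n"  # edit made after the first Snapshot
second = dataset.take_snapshot(working_dir)
```

In this sketch, `first` still holds the original file contents after `working_dir` changes, and any attempt to assign into `first.files` raises a `TypeError`, which mirrors the idea that a Snapshot does not change once it is created.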

To view all the Datasets in your Domino deployment, go to Admin > Data > Datasets. From here, you can manage permissions and rename or delete Datasets.

Because Snapshot directories reside on a network filesystem such as Amazon EFS or a local NFS, they can be attached to executions for read-only use without transferring their contents into the Domino File System. This lets users start working with large data in Domino quickly.

Each Snapshot of a Domino Dataset is an independent state, and its membership in a Dataset is an organizational convenience for working on, sharing, and permissioning related data. Domino supports running scheduled Jobs that create Snapshots, so users can write or import data into a Dataset as part of an ongoing pipeline.

You can permanently delete Dataset Snapshots. To avoid accidental data loss, deletion is a two-step process: users first mark Snapshots for deletion, and then you confirm the deletion, if appropriate. This capability makes Datasets the right choice for storing data in Domino that has regulatory requirements for expiration.
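
The two-step deletion flow can be thought of as a small state machine. The sketch below is a minimal illustration under stated assumptions: `SnapshotState`, `SnapshotRecord`, and both method names are hypothetical and do not correspond to Domino's actual implementation or API. It only shows why a Snapshot cannot be removed in a single action.

```python
from enum import Enum


class SnapshotState(Enum):
    ACTIVE = "active"
    MARKED_FOR_DELETION = "marked_for_deletion"
    DELETED = "deleted"


class SnapshotRecord:
    """Hypothetical record tracking where a Snapshot is in the deletion flow."""

    def __init__(self, snapshot_id: str) -> None:
        self.snapshot_id = snapshot_id
        self.state = SnapshotState.ACTIVE

    def mark_for_deletion(self) -> None:
        # Step 1: a user flags the Snapshot; no data is removed yet.
        if self.state is not SnapshotState.ACTIVE:
            raise ValueError(f"cannot mark a Snapshot in state {self.state.value}")
        self.state = SnapshotState.MARKED_FOR_DELETION

    def confirm_deletion(self) -> None:
        # Step 2: deletion is allowed only after the Snapshot was marked.
        if self.state is not SnapshotState.MARKED_FOR_DELETION:
            raise ValueError("Snapshot must be marked for deletion first")
        self.state = SnapshotState.DELETED


record = SnapshotRecord("snapshot-1")
try:
    record.confirm_deletion()  # rejected: step 1 has not happened
except ValueError as err:
    rejected = str(err)

record.mark_for_deletion()
record.confirm_deletion()
```

Here the first call to `confirm_deletion` is refused because the Snapshot was never marked, and the Snapshot reaches the `DELETED` state only after both steps run in order.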

Access to data in Domino Datasets

Datasets in Domino belong to projects, and access is granted to users who hold roles on the project. See Sharing and collaboration for details.

Users can also inherit roles from membership in Domino Organizations. Domino users with some administrative system roles are granted additional access to Datasets across the Domino deployment they administer. See Roles for more information.