Data in Domino




Overview

This article describes how Domino stores and handles data that users upload, import, or create in Domino. There are two systems that store user data in Domino:

  • Domino project files
  • Domino Datasets

Additionally, Domino supports connecting to many external data stores. Users can import data from external stores into Domino, export data from Domino to external stores, or run code in Domino that reads and writes from external stores without saving data in Domino itself.




About Domino project files


How is the data in project files stored?

Work in Domino happens in projects. Every Domino project has a corresponding collection of project files. While at rest, project files are stored in a durable object storage system, referred to as the Domino Blob Store.

Domino has native support for backing the Domino Blob Store with the following cloud storage services:

  • Amazon S3
  • Azure File Storage
  • Google Cloud Storage

Alternatively, the Domino Blob Store can be backed with a shared Kubernetes Persistent Volume from a compatible storage class. If desired, you can provide an NFS storage service, and Domino installation utilities can deploy the nfs-client-provisioner and configure a compatible storage class backed by the provided NFS system.


Is project file data encrypted?

Domino supports customer-supplied encryption keys for:

  • Amazon S3

Domino supports default encryption keys for:

  • Amazon S3
  • Azure File Storage
  • Google Cloud Storage

Domino does not provide pre-write encryption for nfs-client-provisioner volumes.


How does data get stored in project files?

When a user starts a Run in Domino, the project's files are fetched from the Domino Blob Store and loaded into the working directory of the Domino service filesystem. When the Run finishes, or when the user initiates a manual sync in an interactive Workspace session, any changes to the contents of the working directory are written back to Domino as a new revision of the project files. Domino's versioning system tracks file-level changes and can provide rich file difference information between revisions.
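For example, code running in a Run only needs to write to its working directory for the output to be captured in the next sync. The sketch below assumes nothing Domino-specific: it takes the working directory path as a parameter rather than hard-coding a deployment-dependent mount point.

```python
import csv
import os


def write_results(working_dir, rows):
    """Write a CSV of results into the run's working directory.

    Anything written here is picked up by the next file sync and
    becomes part of a new revision of the project's files.
    """
    out_path = os.path.join(working_dir, "results.csv")
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "score"])
        writer.writeheader()
        writer.writerows(rows)
    return out_path
```

Because the sync is driven by the working directory's contents, no Domino API call is needed here; the platform detects and versions the new file on its own.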

Domino also has several features that make it easy for users to quickly initiate a file sync. The following events in Domino can trigger a file sync and the subsequent creation of a new revision of a project's files:

  • User uploads files from the Domino web application upload interface
  • User authors or edits a file in the Domino web application file editor
  • User syncs their local files to Domino from the Domino Command Line Interface
  • User uploads files to Domino via the Domino API
  • User executes code in a Domino Job that writes files to the working directory
  • User writes files to the working directory during an interactive Workspace session, and then initiates a manual sync or chooses to commit those files when the session finishes

By default, all revisions of project files that Domino creates are kept indefinitely, since project files are a component in the Domino Reproducibility Engine. It is always possible to return to and work with past revisions of project files, with the exception of files that have been subjected to a full delete by a system administrator.


Who can access the data in project files?

Users can read and write files in the projects they create, on which they are automatically granted an Owner role. Owners can add collaborators to their projects with additional roles and associated file permissions.

The permissions available to each role are described in more detail in Sharing and collaboration.

Users can also inherit roles from membership in Domino Organizations. Learn more in the Organizations overview.

Domino users with some administrative system roles are granted additional access to project files across the Domino deployment they administer. Learn more in System roles.




About Domino Datasets


How is the data in Domino Datasets stored?

When users have large quantities of data, including collections of many files and large individual files, Domino recommends storing the data in a Domino Dataset. Datasets are collections of Snapshots, where each Snapshot is an immutable image of a filesystem directory from the time when the Snapshot was created. These directories are stored in a network filesystem managed by Kubernetes as a shared Persistent Volume.

Domino has native support for backing Domino Datasets with the following cloud storage services:

  • Amazon EFS
  • Azure File Storage
  • Google Cloud Filestore

Alternatively, Domino Datasets can be backed with a shared Kubernetes Persistent Volume from a compatible storage class. If desired, you can provide an NFS storage service, and Domino installation utilities can deploy the nfs-client-provisioner and configure a compatible storage class backed by the provided NFS system.

Each Snapshot of a Domino Dataset is an independent state, and its membership in a Dataset is an organizational convenience for working on, sharing, and permissioning related data. Domino supports running scheduled Jobs that create Snapshots, enabling users to write or import data into a Dataset as part of an ongoing pipeline.
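As an illustration of that pipeline pattern, a scheduled Job might append one record per run into a writable Dataset directory, with the Snapshot taken on completion capturing the accumulated state. The directory path and file-naming scheme below are assumptions for the sketch, not fixed Domino conventions.

```python
import json
import os
from datetime import datetime, timezone


def append_record(dataset_dir, record):
    """Write one timestamped JSON file into a writable Dataset directory.

    In a scheduled Job, each run adds a new file; the Snapshot created
    when the Job completes preserves the directory's state at that time.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    out_path = os.path.join(dataset_dir, f"record-{stamp}.json")
    with open(out_path, "w") as f:
        json.dump(record, f)
    return out_path
```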

Dataset Snapshots can be permanently deleted by Domino system administrators. Snapshot deletion is designed as a two-step process to avoid data loss, where users mark Snapshots they believe can be deleted, and admins then confirm the deletion if appropriate. This permanent deletion capability makes Datasets the right choice for storing data in Domino that has regulatory requirements for expiration.


Who can access the data in Domino Datasets?

Datasets in Domino belong to projects, and access is granted to users according to the roles they hold on the containing project.

The permissions available to each role are described in more detail in Sharing and collaboration.

Users can also inherit roles from membership in Domino Organizations. Learn more in the Organizations overview.

Domino users with administrative system roles are granted additional access to Datasets across the Domino deployment they administer. Learn more in System roles.




Integrating Domino with other data stores and databases

Domino can be configured to connect to external data stores and databases. This process involves loading the required client software and drivers for the external service into a Domino environment, and loading any credentials or connection details into Domino environment variables. Users can then interact with the external service in their Runs.
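For instance, code in a Run can read connection details from those environment variables at execution time rather than committing credentials to project files. The variable names below are hypothetical placeholders for whatever names were configured in the Domino environment:

```python
import os


def build_connection_url(env=os.environ):
    """Assemble a PostgreSQL-style connection URL from environment
    variables.

    The variable names used here (DB_HOST, DB_USER, DB_PASSWORD, ...)
    are illustrative; substitute the names configured in your Domino
    environment.
    """
    host = env["DB_HOST"]
    user = env["DB_USER"]
    password = env["DB_PASSWORD"]
    database = env.get("DB_NAME", "analytics")
    port = env.get("DB_PORT", "5432")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"
```

Keeping credentials in environment variables means they never land in the versioned project files, while every Run in the environment can still reach the external service.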

Users can import data from the external service into their project files by writing the data to the working directory of the Domino service filesystem, and they can write data from the external service to Dataset Snapshots. Alternatively, it is possible to construct workflows in Domino that save no data to Domino itself, but instead pull data from an external service, do work on the data, then push it to an external service.

Learn more in the Data sources overview and read our detailed Data source connection guides.


Tracking and auditing data interactions in Domino

Domino system administrators can set up audit logs for user activity in the platform. These logs record events whenever users:

  • Create files
  • Edit files
  • Upload files
  • View files
  • Sync file changes from a Run
  • Mount Dataset Snapshots
  • Write Dataset Snapshots

This list is not exhaustive, and will expand as Domino adds new features and capabilities.
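Once exported, these logs can be processed like any other event stream. The sketch below assumes a JSON-lines export with an "event" field per record; that schema is an assumption for illustration, since the actual format depends on how audit logging was configured.

```python
import json


def filter_events(lines, event_types):
    """Return parsed audit records whose 'event' field is in event_types.

    Assumes one JSON object per line with an 'event' key; the real
    export format depends on the deployment's logging configuration.
    """
    matches = []
    for line in lines:
        record = json.loads(line)
        if record.get("event") in event_types:
            matches.append(record)
    return matches
```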

Domino administrators can contact support@dominodatalab.com for assistance enabling, accessing, and processing these logs.