Domino Data Sources provide a mechanism to create and manage connection properties for a supported external data service. Both Domino administrators and users can create Data Sources and share them with collaborators. Connection properties are stored securely, and there is no need to install data-source-specific drivers or libraries. A tightly coupled library provides a consistent access pattern for both tabular and file-based data.
To learn more about data sources, see Data Source Use Cases.
One common configuration pattern is for Domino administrators to create, configure, and manage broadly used Data Sources, which are then exposed to all or a subset of users. To create a Data Source as an admin, click Create a Data Source on the Admin > Data > Data Source page.
Alternatively, individual users can create Data Sources directly when they need access to a more specific Data Source than those an admin has seeded on the deployment. To create a Data Source as a user, click Create a Data Source on the top-level Data page or on the Data page for a project. Regardless of where creation is initiated, the resulting Data Source can be used in any project by users with the appropriate permissions.
The actual configuration settings will depend on the specific Data Source type that is selected.
While permissions determine who can use a data source, the actual data service connection requires credentials to authenticate. Domino allows two options for configuring credentials:
- Individual: each user is required to provide their own credentials before using a data source.
- Service Account: Domino administrators provide a set of credentials that will be automatically applied on behalf of users with permissions to a given data source. End users cannot access or extract the credentials.
Data sources have global scope in a Domino deployment and are accessible to any user with the appropriate permissions in any project. Users can add data sources to a project explicitly (Add a data source on project Data page) or implicitly when a data source is used directly in code from a project. This allows users to have visibility into which data sources are used in each of their projects.
When multiple users collaborate on a project, it is possible that a data source used by one user is not properly configured for another. Domino proactively surfaces such problems both on the project Data page and on the Data tab in Domino Workspaces.
Domino notifies users when they do not have permission to use a data source. When a user sees this message, they should request access from the data source owner.
When users have access but have not configured their individual credentials, they will also see a notification and will be able to add their credentials. For a given data source, individual credentials need to be added only once.
After a data source is properly configured, the Domino Data API allows users to retrieve data using a uniform interface without having to install drivers or data source specific libraries.
There are two types of data sources, Tabular and File-based, and each exposes a slightly different mechanism for retrieving data. The simplest way to get started is with the automatically generated code snippet example.
For more detailed information, see the Domino Data API reference.
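As a rough illustration of the two access patterns, the sketch below wraps a tabular query and a file-based read in small helper functions. It assumes the `domino_data` Python package (the Domino Data API client) is available in the execution and that `DataSourceClient().get_datasource(...)`, `query(...).to_pandas()`, and `get(...)` behave as in the auto-generated snippets. The data source names, the SQL text, and the object key are hypothetical placeholders.

```python
def get_datasource(name):
    """Look up a configured Domino data source by name."""
    # Imported lazily so this module can be imported outside a Domino execution.
    from domino_data.data_sources import DataSourceClient

    return DataSourceClient().get_datasource(name)


def query_to_dataframe(datasource, sql):
    """Tabular pattern: run a SQL query and return the result as a pandas DataFrame."""
    result = datasource.query(sql)
    return result.to_pandas()


def read_object(datasource, key):
    """File-based pattern: return the raw bytes stored under the given object key."""
    return datasource.get(key)


def main():
    # Hypothetical names; replace them with data sources configured in your deployment.
    warehouse = get_datasource("my-warehouse")
    df = query_to_dataframe(warehouse, "SELECT * FROM my_table LIMIT 10")
    print(df.head())

    bucket = get_datasource("my-bucket")
    payload = read_object(bucket, "path/to/file.csv")
    print(len(payload), "bytes read")
```

Calling `main()` from a workspace or job exercises both patterns. The helpers take the data source object as an argument so that they can be tried with a stand-in object outside Domino.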
When using the Domino Data API from a Domino execution, user identity verification for the purposes of enforcing Domino permissions happens automatically. The library first attempts to use a Domino JWT token; if one is not available, it falls back to a user API key.
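The precedence between the two credentials can be pictured with a small, self-contained function. This only illustrates the order, and is not the library's implementation. The environment variable names `DOMINO_TOKEN_FILE` and `DOMINO_USER_API_KEY` are assumptions about how a Domino execution exposes the two credentials, so verify them against your deployment.

```python
import os


def resolve_identity_source(env=None):
    """Return which credential would be tried first: 'jwt', 'api_key', or None.

    Illustrative only: mirrors the documented precedence (JWT first, then API key).
    """
    env = os.environ if env is None else env
    # Assumed variable names; they are not confirmed by this page.
    if env.get("DOMINO_TOKEN_FILE"):
        return "jwt"
    if env.get("DOMINO_USER_API_KEY"):
        return "api_key"
    return None
```

Passing a plain dictionary instead of the real environment makes it easy to try each case by hand.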
The following is a summary of the user identity that will be used for data source access based on Domino execution type.
- Workspaces and Jobs: the user who started the execution.
- Launchers: the user who started the launcher, regardless of who created the launcher.
- Domino Apps: the user who published the app, regardless of who is accessing the app.
- Model API: no user identity.
For Model APIs and other advanced use cases that require establishing a different user identity, it is possible to inject an API key into an execution through an environment variable and then use it explicitly when retrieving a data source.
For more detailed information, see Custom Authentication from the API documentation.
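A minimal sketch of that pattern follows, assuming `DataSourceClient` accepts an `api_key` keyword argument as covered in the Custom Authentication topic. The environment variable name `SERVICE_USER_API_KEY` and the data source name in the usage note are hypothetical.

```python
import os


def read_injected_api_key(var_name="SERVICE_USER_API_KEY", env=None):
    """Fetch an API key injected into the execution through an environment variable."""
    env = os.environ if env is None else env
    api_key = env.get(var_name)
    if not api_key:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return api_key


def get_datasource_as(api_key, datasource_name):
    """Retrieve a data source under the identity tied to the given API key."""
    # Lazy import: the domino_data package is only needed when this is called.
    from domino_data.data_sources import DataSourceClient

    client = DataSourceClient(api_key=api_key)  # assumed keyword argument
    return client.get_datasource(datasource_name)
```

Inside a Model API, this might be used as `get_datasource_as(read_injected_api_key(), "my-warehouse")`. Failing fast when the variable is missing avoids a less obvious permission error later.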
Domino Training Sets allow you to persist dataframes for model training and other analysis. You can store and load multiple versions of a given dataframe from a training set, allowing you to connect a model to the specific version of a dataframe that was used to train it.
The dataframe used as the basis for a Training Set can be constructed using the result of a Domino Data Source query (as described above) or through any other construction method.
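To make the versioning workflow concrete, here is a hedged sketch that stores a dataframe as a new Training Set version and later loads a specific version. It assumes the `domino_data.training_sets` module exposes a `client` with `create_training_set_version` and `get_training_set_version`, and that a version object offers `load_training_pandas()`. The helper names, the optional `ts_client` argument, and the example values in the usage note are hypothetical.

```python
def _default_client():
    """Return the Training Sets client from the Domino Data API."""
    # Lazy import so the helpers can be imported outside a Domino execution.
    from domino_data.training_sets import client

    return client


def save_version(df, training_set_name, key_columns, target_columns, ts_client=None):
    """Persist a dataframe as a new version of the named training set."""
    if ts_client is None:
        ts_client = _default_client()
    return ts_client.create_training_set_version(
        training_set_name=training_set_name,
        df=df,
        key_columns=key_columns,
        target_columns=target_columns,
    )


def load_version(training_set_name, number, ts_client=None):
    """Load the dataframe stored in a specific version of a training set."""
    if ts_client is None:
        ts_client = _default_client()
    version = ts_client.get_training_set_version(
        training_set_name=training_set_name,
        number=number,
    )
    return version.load_training_pandas()
```

For example, `save_version(df, "churn-training", ["customer_id"], ["churned"])` followed later by `load_version("churn-training", 1)` would retrieve the dataframe stored in version 1, which is how a model can be tied back to the data it was trained on.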
In addition to storing the underlying dataframe, training sets can be used to capture additional monitoring metadata. When this additional data is present, Training Set versions will be available as sources of baseline training data when publishing model APIs in Domino.
Training sets are scoped to the project in which they are created. Users with project contributor permissions can create, load, and delete Training Set versions in that project.
Training sets are available as an API-only feature. For information on how to use the API, see the Domino Data API documentation, especially these topics: