Domino Data Importer¶
Overview¶
The Data Importer is a containerized tool that loads data from a source Domino installation into a different target Domino installation. It can be used to load backups into a recovery environment, or to migrate data from one environment to another. The Data Importer can load all of the critical data stores that are automatically backed up by Domino:
- Projects
- Logs
- Docker registry
- Datasets
- MongoDB
- Domino Git
- Postgres
The Data Importer also performs the transformations and schema changes necessary when migrating data across different Domino versions.
For migrations, the Data Importer performs incremental transfers. It can be run multiple times on the same source and target installations, and will synchronize them.
Configuration¶
The Data Importer is controlled by a YAML configuration file called importer.yaml. See the bottom of this page for complete examples, and see below for detailed schema information.
Throughout this document, references to “source Domino” will mean the Domino installation that the data originated from, and “target Domino” is the installation you want to load the data into.
domino¶
The domino object contains configuration and credential information for the target Domino.
Key | Type | Example | Description |
---|---|---|---|
install_configmap | String | fleetcommand-agent-config | Name of the Kubernetes configmap that stores the target Domino installer configuration. This will exist in the default namespace. |
install_secret | String | credential-store-domino-platform | Name of the Kubernetes secret that stores credentials for the target Domino. This will exist in the system namespace. |
system_namespace | String | domino-system | Name of the Kubernetes namespace that contains Domino system resources. This name may be domino-system or $NAME-system. |
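Putting the three keys together, a minimal domino block looks like the following sketch. The names shown are the example values from the table; substitute the configmap, secret, and namespace names from your own target installation:

```yaml
domino:
  install_configmap: "fleetcommand-agent-config"       # configmap in the default namespace
  install_secret: "credential-store-domino-platform"   # secret in the system namespace
  system_namespace: "domino-system"
```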
configData¶
The configData object describes connection information to the source Domino, and specifies the method for import operations.
configData.remote_ssh¶
The remote_ssh object defines how the Data Importer will connect to the source Domino. This is only necessary for legacy Domino installs (Domino version < 4.0). The values used here will be used for all migrations that need remote SSH unless an explicit override is added to the migration configuration.
Key | Type | Example | Description |
---|---|---|---|
bastion_host | String hostname | domino-bastion.domain.com | Hostname of the bastion host in the source Domino. This is a machine from which you can SSH to the rest of the Domino infrastructure. |
bastion_user | String | ubuntu | User on the bastion_host that can be authenticated via SSH. |
ssh_host | String hostname | domino-central.domain.com | Hostname of the central host in the source Domino. This is where the Domino central server is running. |
ssh_user | String | ubuntu | User on the ssh_host that can be authenticated via SSH. |
ssh_key_path | String file path | /opt/sshkeys/domino | Filesystem path to an SSH key provided in configData.sshKeys that can be used to SSH to the bastion_host and ssh_host. |
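A sketch of a remote_ssh block using the example hostnames from the table (replace with your own source infrastructure):

```yaml
configData:
  remote_ssh:
    bastion_host: domino-bastion.domain.com
    bastion_user: ubuntu
    ssh_host: domino-central.domain.com
    ssh_user: ubuntu
    ssh_key_path: /opt/sshkeys/domino   # must match a key supplied in configData.sshKeys
```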
configData.migrations¶
The migrations object should contain a list of migration objects that describe the data migrations to execute. These objects are treated as an _ordered_ list of migrations to perform. The services should be migrated in the following order:
1. k8s_secrets (not required for legacy migration)
2. mongo
3. postgres (not required for legacy migration)
4. git
5. logjam
6. blobs
7. registry
8. datasets
See the bottom of this page for complete examples.
Key | Type | Example | Description |
---|---|---|---|
method | Domino migration method | mongo | For each migration, specify one of the Domino migration methods. |
name | String | mongo | Name for the migration. |
service | String | mongo | One of mongo, git, datasets, logjam, registry, blobs, k8s_secrets, postgres. |
config | Object with method configurations | migrate_legacy_users: true, reset_keycloak: true | Configuration object specific to the chosen method. |
configData.sshKeys¶
The sshKeys object is used to provide the Data Importer with SSH keys for connecting to hosts that have data to import. When migrating from older Domino versions, it is typically necessary to supply an SSH key for the source Domino’s bastion server.
A YAML key with a given $NAME in this object will have a corresponding file written to /opt/sshkeys/$NAME inside the Data Importer container with the string literal contents provided. When configuring a method that requires an ssh_host and ssh_user, you should supply the /opt/sshkeys/$NAME path that points to a file containing the correct key for the target user and host.
Key | Type | Example | Description |
---|---|---|---|
$NAME | String literal RSA private key | -----BEGIN RSA PRIVATE KEY----- $SSH_KEY_DATA -----END RSA PRIVATE KEY----- | Each of these objects produces a file at /opt/sshkeys/$NAME inside the Data Importer container with the supplied key contents. |
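For example, a key named domino in this object is written to /opt/sshkeys/domino. Note the YAML literal block scalar (|), which preserves the key's line breaks exactly:

```yaml
configData:
  sshKeys:
    domino: |
      -----BEGIN RSA PRIVATE KEY-----
      <ssh key data here>
      -----END RSA PRIVATE KEY-----
```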
Import methods¶
mongo¶
The mongo method uses mongorestore to load MongoDB data into the target Domino. This data can be retrieved from a legacy deployment via SSH to the central server via bastion, or the data can be loaded from a local MongoDB backup in a .tar archive, like the ones produced by Domino automated backups.
The default mode is to connect to the ssh_host, dump MongoDB data, then transfer it into the container and automatically set backup_path to point to it. If you already have a .tar backup of MongoDB data, you can mount or pull it into the Data Importer container in /opt/scratch and provide a path to it in backup_path. If there is a user-defined value in backup_path, the SSH transfer is skipped and the file at the user-defined path is used. This process always excludes central configuration and feature flag collections.
Configuration options
Key | Default | Description |
---|---|---|
bastion_host | Defaults to the value of configData.remote_ssh.bastion_host | Overrides the bastion_host value in configData.remote_ssh for just this method. |
bastion_user | Defaults to the value of configData.remote_ssh.bastion_user | User on the bastion_host that can be authenticated via SSH. |
ssh_host | Defaults to the value of configData.remote_ssh.ssh_host | Hostname of the central host in the source Domino. |
ssh_user | Defaults to the value of configData.remote_ssh.ssh_user | Overrides the ssh_user value in configData.remote_ssh for just this method. |
ssh_key_path | Defaults to the value of configData.remote_ssh.ssh_key_path | Filesystem path to an SSH key provided in configData.sshKeys that can be used to SSH to the bastion_host and ssh_host. |
ssh_port | 22 | Port to use for SSH to the ssh_host. |
backup_path | Null | Filesystem path to a .tar archive in the Data Importer container with MongoDB backups to load. This should typically be a .tar you have pulled into the Data Importer container at /opt/scratch/. Supplying a value for this option overrides the normal SSH mode of Mongo data retrieval. |
migrate_legacy_users | False | Set to True if the source Domino is running a legacy version (version < 4.0). |
reset_keycloak | False | If set to True, this deletes all users in the target Domino prior to migration of users from the source Domino. |
excluded_collections | ["domino.config", "domino.feature_flag_overrides", "domino.scheduler_locks", "domino.cache"] | This advanced option can specify MongoDB collections to exclude from migration. By default, central configuration settings, feature flag settings, scheduler locks, and cache collections are excluded. |
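A sketch of a mongo migration entry that restores from a local backup instead of using the SSH mode (the backup filename is a placeholder for an archive you have pulled into /opt/scratch):

```yaml
- method: mongo
  name: mongo
  service: mongo
  config:
    backup_path: /opt/scratch/restore/mongo/20200307-0000.tar.gz  # setting this skips the SSH transfer
    migrate_legacy_users: false
    reset_keycloak: false
```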
s3_to_s3¶
This method syncs the contents of a source S3 bucket to a destination S3 bucket. It is suitable for migrating blobs and logs; in cases where the Docker registry in the source Domino is backed by S3, it can also migrate registry data.
Key | Default | Description |
---|---|---|
source_bucket_name | None (Required) | Name of the S3 bucket containing the desired data from the source Domino, for example deployment1-domino-project-data. A user-defined bucket name must be supplied. |
dest_bucket_name | By default this will automatically discover and use the name of the bucket used by the chosen service in the target deployment | Name of the S3 bucket to copy data into. |
access_key | None | AWS access key to use for access to the buckets. This is not required if the worker node running the Data Importer has an AWS instance role that grants access to both buckets. |
secret_key | None | AWS secret key to use for access to the buckets. This is not required if the worker node running the Data Importer has an AWS instance role that grants access to both buckets. |
session_token | None | Session token where needed for AWS temporary credentials. |
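A sketch of an s3_to_s3 migration for blob data with explicit AWS credentials. The bucket name and credential values are placeholders; omit the credentials if the worker node's instance role already grants access to both buckets:

```yaml
- method: s3_to_s3
  name: blobs
  service: blobs
  config:
    source_bucket_name: deployment1-domino-project-data
    access_key: <AWS access key>
    secret_key: <AWS secret key>
    # dest_bucket_name is discovered automatically from the target deployment
```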
disk_to_s3¶
This method syncs the contents of a filesystem directory on a remote ssh_host machine to an S3 bucket. This is suitable for migrating blob data stored in source on-premises Domino NFS systems to target AWS Domino S3 buckets, or for migrating Docker registry data from legacy source deployments into target deployments that use an S3-backed Docker registry.
Key | Default | Description |
---|---|---|
bastion_host | Defaults to the value of configData.remote_ssh.bastion_host | Overrides the bastion_host value in configData.remote_ssh for just this method. |
bastion_user | Defaults to the value of configData.remote_ssh.bastion_user | User on the bastion_host that can be authenticated via SSH. |
ssh_host | Defaults to the value of configData.remote_ssh.ssh_host | Hostname of the central host in the source Domino. |
ssh_user | Defaults to the value of configData.remote_ssh.ssh_user | Overrides the ssh_user value in configData.remote_ssh for just this method. |
ssh_key_path | Defaults to the value of configData.remote_ssh.ssh_key_path | Filesystem path to an SSH key provided in configData.sshKeys that can be used to SSH to the bastion_host and ssh_host. |
ssh_port | 22 | Port to use for SSH to the ssh_host. |
remote_path | Uses service-defined defaults | Filesystem path on the remote ssh_host containing files to copy to the dest_bucket_name. |
dest_bucket_name | By default this will automatically discover and use the name of the bucket used by the chosen service in the target deployment | Name of the S3 bucket to copy data into. |
iam_access_key_id | None | AWS access key to use for access to the buckets. This is not required if the worker node running the Data Importer has an AWS instance role that grants access to both buckets. |
iam_secret_key | None | AWS secret key to use for access to the buckets. This is not required if the worker node running the Data Importer has an AWS instance role that grants access to both buckets. |
iam_session_token | None | Session token where needed for AWS temporary credentials. |
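A sketch of a disk_to_s3 migration that copies registry data from an on-premises source host into the target's S3-backed registry. The remote_path shown is a hypothetical override; if you omit it, the service-defined default is used:

```yaml
- method: disk_to_s3
  name: registry
  service: registry
  config:
    remote_path: /domino/docker-registry/   # hypothetical path on the source ssh_host
    # dest_bucket_name is discovered automatically from the target deployment
```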
tar¶
This method extracts and loads data from a .tar
archive on a local path in the Data Importer.
Key | Default | Description |
---|---|---|
source_path | None (Required) | Provide a path to a .tar archive with the required data for the chosen service. This must be a file system path inside the Data Importer container. |
dest_path | Uses service-defined defaults | This is the path data from the archive in the source_path is extracted to. This will default to the correct path for the chosen service. |
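A sketch of a tar migration that restores Git data from a local archive (the filename is a placeholder for an archive you have pulled into /opt/scratch):

```yaml
- method: tar
  name: git
  service: git
  config:
    source_path: /opt/scratch/restore/git/20200307-0000.tar.gz
    # dest_path defaults to the correct path for the chosen service
```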
rsync¶
This method transfer data from a remote host path to a local path via rsync
. Both paths will use service-defined
defaults, where the remote path with be the standard path to legacy data, and the local path will be a mounted volume
for the correct destination service.
Key | Default | Description |
---|---|---|
bastion_host | Defaults to the value of configData.remote_ssh.bastion_host | Overrides the bastion_host value in configData.remote_ssh for just this method. |
bastion_user | Defaults to the value of configData.remote_ssh.bastion_user | User on the bastion_host that can be authenticated via SSH. |
ssh_host | Defaults to the value of configData.remote_ssh.ssh_host | Hostname of the central host in the source Domino. |
ssh_user | Defaults to the value of configData.remote_ssh.ssh_user | Overrides the ssh_user value in configData.remote_ssh for just this method. |
ssh_key_path | Defaults to the value of configData.remote_ssh.ssh_key_path | Filesystem path to an SSH key provided in configData.sshKeys that can be used to SSH to the bastion_host and ssh_host. |
ssh_port | 22 | Port to use for SSH to the ssh_host. |
remote_path | Uses service-defined defaults | Filesystem path to a directory on the remote host containing source data. |
local_path | Uses service-defined defaults | Filesystem path in the local Data Importer container to sync data to. |
backup_dir | None | If there is any existing data in the local_path, it will be erased by the migration. If you supply a local filesystem path (/opt/scratch/$SERVICE_NAME is recommended) the local data will be backed up there prior to migration. |
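A sketch of an rsync migration for datasets that backs up any existing local data before syncing. Hosts, users, and paths are placeholders:

```yaml
- method: rsync
  name: datasets
  service: datasets
  config:
    ssh_host: 5.6.7.8                    # host with the datasets NFS mounted
    ssh_user: host_user
    ssh_key_path: /opt/sshkeys/domino
    remote_path: /efs/datasets/mount/root
    backup_dir: /opt/scratch/datasets    # existing data in local_path is saved here first
```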
postgres¶
This method ingests a local .sql
backup of Postgres data and loads it into the target Domino Postgres service.
This method is not necessary for legacy migrations, which do not have Postgres data.
Key | Default | Description |
---|---|---|
backup_path | None (Required) | Points to a local .sql backup of Postgres data that you have mounted or pulled into the Data Importer container. |
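A sketch of a postgres migration pointing at a backup pulled into the scratch space (the filename is a placeholder):

```yaml
- method: postgres
  name: postgres
  service: postgres
  config:
    backup_path: /opt/scratch/restore/postgres/20200307-0000.sql
```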
k8s_secrets¶
Key | Default | Description |
---|---|---|
backup_path | None (Required) | Filesystem path to a backup of the credential-store-domino-platform secret from the source Domino. |
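A sketch of a k8s_secrets migration restoring a secret backup pulled into the scratch space (the filename is a placeholder):

```yaml
- method: k8s_secrets
  name: secrets
  service: k8s_secrets
  config:
    backup_path: /opt/scratch/restore/k8s_secrets/secrets.yaml
```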
Running the importer¶
After composing an importer.yaml, you can install the importer tool and load the configuration by running:
helm registry upgrade quay.io/domino/helm-domino-data-importer:beta --install -f ./importer.yaml \
--namespace domino-platform domino-data-importer
This will deploy the helm-domino-data-importer image into the cluster as a Kubernetes pod, and load the configuration into the running container at /opt/config/config.yaml.
Once the pod is running, you can attach to the pod by running:
kubectl -n stagename-platform attach -it domino-data-importer-0
Then from inside the container, run ./importer to execute the import.
The tool will use the config provided. There will be scratch space set up for you in /opt/scratch, and it will persist after the pod is deleted or if the helm chart is deleted or purged. Any custom configs should go here, as should any files that need to be manually copied into the deployment, such as .tar backups.
After the Data Importer finishes all migrations, you need to restart the frontend and dispatcher services in the target Domino to index the new data in relevant systems. This can be accomplished from the administration UI by clicking Advanced > Restart Services.
Example AWS migration configuration¶
domino:
install_configmap: "fleetcommand-agent-<new install stagename>"
install_secret: "credential-store-<new install stagename>-platform"
system_namespace: "<new install stagename>-system"
configData:
remote_ssh:
bastion_host: <bastion hostname>
bastion_user: <central instance username>
ssh_host: <central instance hostname>
ssh_key_path: /opt/sshkeys/domino
ssh_user: <central instance username>
migrations:
- method: mongo
name: mongo
service: mongo
config:
migrate_legacy_users: true # migrates users from mongo to keycloak
reset_keycloak: true # resets keycloak migrations
keycloak_migrations_image: quay.io/domino/keycloak-realm-migration:latest # Need to override until upgraded version gets out there
- config:
remote_path: /domino/git/projectrepos/
method: rsync
name: git
service: git
- method: rsync
name: datasets
service: datasets
- config:
source_bucket_name: <logs bucket name>
method: s3_to_s3
name: logjam
service: logjam
- config:
source_bucket_name: <blobs bucket name>
method: s3_to_s3
name: blobs
service: blobs
- method: disk_to_s3
name: registry
service: registry
sshKeys:
domino: |
-----BEGIN RSA PRIVATE KEY-----
<ssh key data here>
-----END RSA PRIVATE KEY-----
Example AWS restore configuration¶
domino:
install_configmap: "fleetcommand-agent-<new install stagename>"
install_secret: "credential-store-<new install stagename>-platform"
system_namespace: "<new install stagename>-system"
migrations:
- method: k8s_secrets
name: secrets
service: k8s_secrets
config:
backup_path: /opt/scratch/restore/k8s_secrets/secrets.yaml
- config:
migrate_legacy_users: false
reset_keycloak: false
backup_path: /opt/scratch/restore/mongo/20200307-0000.tar.gz
method: mongo
name: mongo
service: mongo
- config:
backup_path: /opt/scratch/restore/postgres/20200307-0000.sql
method: postgres
name: postgres
service: postgres
- config:
source_path: /opt/scratch/restore/git/20200307-0000.tar.gz
method: tar
name: git
service: git
- config:
source_bucket_name: stagename-log-snaps
method: s3_to_s3
name: logjam
service: logjam
- config:
source_bucket_name: stagename-blobs
method: s3_to_s3
name: blobs
service: blobs
- config:
source_bucket_name: stagename-docker-registry
method: s3_to_s3
name: registry
service: registry
- method: rsync
name: datasets
service: datasets
config:
bastion_host: 1.2.3.4 # optional bastion host
bastion_user: bastion_user
ssh_host: 5.6.7.8 # host with datasets nfs mounted
ssh_key_path: /opt/sshkeys/domino
ssh_user: host_user
remote_path: '/efs/datasets/mount/root' # /filecache should be inside this directory