
Git-based Projects with CodeSync

Git-based projects provide a full CodeSync experience for your code by using Git and a Git service provider of your choice. Integrated CodeSync technology makes the common Git workflows, such as committing and pushing changes, available natively within workspaces launched in Git-based projects. This lets you collaborate on version-controlled code with fellow project team members without leaving Domino. Git-based projects also organize your project's assets as Code, Data, or Artifacts, a structure intended to support common data science workflows.

If you want to use a private Git repository to store your code, you must add the corresponding Git credentials in your Domino account settings before you create your project. After adding credentials, you can create a Git-based project in Domino, which enables the CodeSync experience.

If you want to use a public Git repository to store your code, then you won’t need to add any Git credentials.

Create a Git-based project:
  1. Click Projects in the navigation pane and then click New Project.

  2. In the Create New Project window, enter a name for your project.

  3. Set your project’s visibility.

  4. Click Next to go to the wizard’s next page.

  5. Under Hosted By, click Git Service Provider. Additional fields appear beneath the Hosted By field.

  6. Under Git Service Provider, select the provider currently hosting the repository you wish to import (the "target repository").

  7. Under Git Credentials, select credentials authorized to access the target repository.

  8. Under Repository, either choose a repository from the list or enter a Git URL. If you are using a personal access token (PAT) credential with GitHub or GitLab, you can also create a new repository.

  9. Click Create.

    Important

    If the repository you’re using to store your code contains one or more files exceeding 2 GB in size, Domino will create your Git-based project, but workspace setup might fail. In that case, consider using the Domino File System instead.

    To check the total size of a Git repository, as well as the size of individual files within it, you can use git-sizer.
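
As a quick complement to the tool mentioned above, the size check can be sketched in Python against a local clone. This is a minimal sketch, not a replacement for a history-aware analysis: it only inspects files currently checked out in the working tree, the function name `find_large_files` is hypothetical, and the threshold assumes that "2 GB" means 2 × 1024³ bytes.

```python
import os

# Assumption: the 2 GB limit in the note is interpreted as 2 * 1024**3 bytes.
TWO_GB = 2 * 1024**3


def find_large_files(repo_root, limit_bytes=TWO_GB):
    """Return (relative_path, size_in_bytes) for working-tree files above limit_bytes."""
    large = []
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # Skip Git's internal storage; only checked-out files are inspected.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # ignore symlinks, which may point outside the repository
            size = os.path.getsize(path)
            if size > limit_bytes:
                large.append((os.path.relpath(path, repo_root), size))
    # Largest files first.
    return sorted(large, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for rel_path, size in find_large_files("."):
        print(f"{size / 1024**3:.2f} GiB  {rel_path}")
```

Run it from the root of a local clone before importing the repository. An empty result means no checked-out file exceeds the threshold; it says nothing about large files that exist only in the repository's history.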

  • During the project creation process, you can create a completely new repository on GitHub or GitLab. (These are the only Git providers for which Domino currently supports creating a repository.)

Create a new repository:
  1. Select the Create new repository option under Repository.

  2. Select the Owner/Organization associated with the repository, its visibility, and specify the name for the new repository.

Code, Data, & Artifacts

In Domino, the Domino File System (DFS) is the traditional way of storing a project’s assets. DFS-based projects organize all of your project’s assets as either Data or Files. Git-based projects, however, organize your project’s assets as Code, Data, or Artifacts, and apply the CodeSync experience to Code assets.

Code – This section of your Git-based project organizes and lists all of the Git-based repositories used to store your project’s code, as well as any additional imported repositories. For more information, see Git repositories in Domino. Files within any of these repositories can be accessed from within a Domino workspace via CodeSync technology.

The common Git workflows, like committing, pushing, pulling, and more, are available to you when interacting with your code from within a Domino workspace. For more information, see Using Git in your workspace.

Note

You can select any branch and one of its latest 10 commits using the drop-down lists, then browse the directories of linked Git repositories natively from the Code section on the project page.

Data

As in DFS-based projects, this section of your Git-based project organizes and lists all data sources used in your project, including Domino datasets, external data volumes, and dataset scratch spaces. For more information about how to use data with your project, see the Domino datasets documentation.

Artifacts

Git-based projects introduce “Artifacts”. Artifacts are typically results or products from your research and analysis, like plots, charts, serialized models, and more. You can organize these outputs in this section, as well as import artifacts from other projects.

Develop models in a workspace

A Domino workspace is an interactive session where you can conduct research, analyze data, train models, and more.

Directory structure

Git-based projects with CodeSync use a different directory structure in workspaces than DFS-based projects. The directory structure is shown below.

The default working directory for your code is /mnt/code.

/mnt
│
├── /code   # Git repository and default working directory
│
├── /data
│   │   # Project Datasets
│   ├── /{dataset-name}   # latest version of dataset
│
│    # Project Artifacts
├── /artifacts
│
│    # External mounted volumes
├── /{external-volume-name}
│
└── /imported
    │   # Imported Git Repos
    ├── /code
    │   └── /{imported-repo-name}
    │
    ├── /data
    │   │   # Mounted Shared Datasets
    │   └── /{shared-dataset-name} # contains contents of latest snapshot unless otherwise specified by yaml
    │
    │    # Imported Project Artifacts
    └── /artifacts
        └── /{imported-project-name}
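
The layout above can be inspected from inside a workspace with a short script. The following is a minimal sketch: the relative paths are copied from the tree, while the helper names `describe_layout` and `list_datasets` and the labels in `FIXED_LOCATIONS` are hypothetical. External volumes are left out because their mount names vary by project.

```python
from pathlib import Path

# Relative paths copied from the directory tree; the labels are illustrative only.
FIXED_LOCATIONS = {
    "code": "code",
    "datasets": "data",
    "artifacts": "artifacts",
    "imported code": "imported/code",
    "imported datasets": "imported/data",
    "imported artifacts": "imported/artifacts",
}


def describe_layout(root="/mnt"):
    """Map each label to (path, exists) so a script can see what is mounted."""
    base = Path(root)
    return {
        label: (base / rel, (base / rel).is_dir())
        for label, rel in FIXED_LOCATIONS.items()
    }


def list_datasets(root="/mnt"):
    """Return the names of the dataset directories found under <root>/data."""
    data_dir = Path(root) / "data"
    if not data_dir.is_dir():
        return []
    return sorted(p.name for p in data_dir.iterdir() if p.is_dir())


if __name__ == "__main__":
    for label, (path, exists) in describe_layout().items():
        print(f"{label:20s} {path}  {'present' if exists else 'missing'}")
    print("datasets:", list_datasets())
```

Passing a different `root` makes it possible to try the script outside a workspace, for example against a temporary directory that mimics the tree.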

Work with artifacts in your workspace

Important
All files in Artifacts are saved exclusively to the Domino File System (DFS). If you do not want to save a particular asset to the Domino File System, we recommend that you do not save it as an artifact. To learn more, see Syncing your work to Domino.

Artifacts are results from your research, like plots, charts, serialized models, and more. In Domino, you can save these results in the Artifacts section of your project.
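
As an illustration, a result can be written from code into the artifacts directory of the workspace. This is a minimal sketch that assumes files written under /mnt/artifacts (the Project Artifacts path in the directory structure) are what the workspace lists as artifact changes. The helper name `save_artifact` and the example file name are hypothetical.

```python
import json
from pathlib import Path


def save_artifact(name, payload, artifacts_dir="/mnt/artifacts"):
    """Write payload as JSON to <artifacts_dir>/<name> and return the file path."""
    target = Path(artifacts_dir) / name
    # Create intermediate folders such as "metrics/" when they do not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(payload, indent=2, sort_keys=True))
    return target


if __name__ == "__main__":
    # Hypothetical example: store evaluation metrics from a training run.
    path = save_artifact("metrics/run1.json", {"accuracy": 0.9})
    print(f"wrote {path}")
```

Writing the file does not by itself save it to the Domino File System; the sync steps described in this section still apply.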

Saving artifacts and pushing changes

  1. Click File Changes in the navigation pane of your workspace.

  2. Under Artifacts, view changes by expanding File Changes.

  3. Enter a commit message.

  4. Click Sync to Domino. Domino will save your artifacts to the Domino File System (DFS).

Pull changes

Pull the latest artifacts (from the Domino File System) into your workspace:
  1. Click the File Changes option in the sidebar menu of your workspace.

  2. Under the “Artifacts” section, click Pull. Domino will pull the latest changes into your workspace.

Run jobs

Warning
If you run a job in a Git-based project, only artifacts are automatically synced and saved to the Domino File System (DFS). Code is not automatically pushed to the Git repository used by the project. This behavior is intentional and supports the "Code", "Data", and "Artifacts" workflow. To learn more, see running jobs.
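
Because code changes made during a job are not pushed for you, a job script may want to report them before it exits. The sketch below is one possible approach, not a Domino feature: it assumes the `git` command is available in the job environment and that the repository lives at /mnt/code. The helper names `parse_porcelain` and `uncommitted_changes` are hypothetical.

```python
import subprocess


def parse_porcelain(output):
    """Extract file paths from `git status --porcelain` (version 1) output."""
    paths = []
    for line in output.splitlines():
        # Each entry is two status characters, a space, then the path.
        if len(line) > 3:
            paths.append(line[3:])
    return paths


def uncommitted_changes(repo_dir="/mnt/code"):
    """Return paths that have uncommitted changes in the repository at repo_dir."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_porcelain(result.stdout)


if __name__ == "__main__":
    changed = uncommitted_changes()
    if changed:
        print("Code changes that will not be pushed automatically:")
        for path in changed:
            print("  " + path)
```

The check only detects uncommitted files. Commits that exist locally but have not been pushed would need a separate comparison against the remote branch.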
Copyright © 2022 Domino Data Lab. All rights reserved.