Create a SAS Data Science Workspace Environment

In this guide, we will walk through building a SAS Data Science Docker container image that will be integrated with Domino Data Lab. Before we dive in, we will answer a few common questions and provide additional resources.

Preparation

Before getting started, you will need the items below. Setting them up is outside the scope of this guide.

Internet Access

Although you are not required to run the completed SAS Data Science container image on a Domino Data Lab environment that has Internet access, you will need Internet access to download the appropriate tools that are used to build the SAS Data Science container image.

Docker Client

We will be using a Docker CLI client to build the SAS Data Science container image. Although all of the commands shown can be copy/pasted, it is good to have some familiarity with the Docker CLI tools.

Docker Registry

A Docker Registry will be used to store the final SAS Data Science image before it can be consumed in Domino. There are many options for Docker Registry providers and software. If you are not comfortable setting up a Docker Registry to store the Docker images for your Domino Data Lab environment, contact your Domino Customer Success Manager (CSM) or Technical Account Manager (TAM).

Git Client

We will be checking out a SAS Container Recipes Git repository. Although there are other ways to download this repository from the Internet, Git CLI will be used in this guide.

SAS Data Science License

This installation requires a valid SAS Data Science license, which is provided to you by SAS Institute Inc. As part of the license, you should have a file called SAS_Viya_deployment_data.zip that contains all of your license information and is used to download the appropriate software.

Comfort with Linux Command-Line Utilities

All the instructions in this guide are written for Red Hat Enterprise Linux variants. The instructions are primarily for CentOS 7, but can easily be adapted to support Red Hat Enterprise Linux, SuSE Enterprise Linux, or Oracle Linux, which are all supported by the SAS Data Science platform.

See the following page for Linux 64-bit operating systems that SAS Data Science (Viya family) supports: SAS Supported Operating Systems.
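
Before starting the build, it can help to confirm that the prerequisites above are in place. The following is a minimal sketch and not part of the SAS or Domino tooling: `missing_tools` and `have_license_zip` are hypothetical helper names, and the default file location (the current directory) is an assumption.

```shell
# Hypothetical helper: print the name of every listed command
# that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Hypothetical helper: succeed only if the license archive exists
# and is not empty. Defaults to a file in the current directory.
have_license_zip() {
    [ -s "${1:-SAS_Viya_deployment_data.zip}" ]
}

# Example:
#   missing_tools docker git
#   have_license_zip PATHTO/SAS_Viya_deployment_data.zip || echo "license archive not found"
```

If `missing_tools docker git` prints nothing, both clients are available on this machine.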

Create a SAS Data Science Docker Image

The instructions for building the base SAS Data Science image are based on the SAS Container Recipes, which are available on GitHub. Consult that repository for exact instructions for your situation: SAS Container Recipes.

In this guide, we will be building a SAS Data Science image with a CentOS 7 base. This will be a single Viya container instead of the full-blown Viya platform across multiple containers.

The general build instructions are as follows:

  1. Clone the GitHub repository for SAS Container Recipes

    Shell Command

    git clone https://github.com/sassoftware/sas-container-recipes.git
    cd sas-container-recipes
    cp PATHTO/SAS_Viya_deployment_data.zip .

    Replace PATHTO above with the directory that contains your SAS_Viya_deployment_data.zip file.

  2. Build the SAS Viya image using the build.sh utility provided

    Shell Command

    ./build.sh --base-image centos --base-tag 7 --type single --zip ./SAS_Viya_deployment_data.zip

At the end of this process, you should have a SAS Data Science Docker image locally.
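
One way to confirm that the build left an image in the local Docker image store is to ask the Docker CLI to inspect it. This is a sketch under assumptions: `image_exists` is a hypothetical helper, and NAME:TAG stands for whatever image name the build reported, which this guide does not specify.

```shell
# Hypothetical helper: succeed if the named image is present locally.
# "docker image inspect" exits non-zero when the image does not exist.
image_exists() {
    docker image inspect "$1" >/dev/null 2>&1
}

# Example (NAME:TAG is a placeholder for the image that build.sh reported):
#   image_exists NAME:TAG && echo "image found" || echo "image missing"
```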

If you run into any issues, contact your SAS Institute Inc. representative for support in resolving the issues.

Add licensed SAS software

Installing additional components, such as SAS/ACCESS modules or database drivers, is outside the scope of this document. If you need them, consult your SAS representatives. These additional components can be layered on top of your base SAS Data Science Docker image.

Integrate the SAS Data Science Docker Image with Domino

We will now switch over to the Domino GitHub repository for the SAS Data Science image build. The Domino repository contains all of the files necessary to finalize the build of the SAS Data Science container image so that it integrates with Domino.

Follow the README instructions on the Domino repository for more information about the individual files.

These are the steps you will need to follow to complete the build process:

  1. Clone the Domino GitHub repository

    Shell Command

    git clone https://github.com/imarchenko/sas-data-science.git
    cd sas-data-science

  2. Modify the Dockerfile’s FROM instruction to use the SAS Data Science image you built in the prior steps

    Shell Command

    SASDS_DOCKER_TAG=NAME:TAG
    sed -Ei.bak "s#SASDS_DOCKER_TAG#$SASDS_DOCKER_TAG#g" Dockerfile

    Change NAME:TAG above to the Docker image tag that was created in the Create a SAS Data Science Docker Image step.

  3. Build the Docker image

    Shell Command

    DOMINO_SASDS_DOCKER_TAG=NAME:TAG
    docker build . -t $DOMINO_SASDS_DOCKER_TAG

Change NAME:TAG above to your final Docker Registry image name and tag. This is the Docker image that will be later used inside of a Domino Compute Environment.
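
To see what the sed substitution in step 2 does, the sketch below applies the same command to a throwaway file. The sample `FROM SASDS_DOCKER_TAG` line and the image name `registry.example.com/sas-viya-single:example` are assumptions for illustration only; the actual Dockerfile in the Domino repository may look different.

```shell
# Work in a scratch directory so the real Dockerfile is untouched.
workdir=$(mktemp -d)

# Assumed stand-in for the repository's Dockerfile: a FROM line
# that carries the SASDS_DOCKER_TAG placeholder.
printf 'FROM SASDS_DOCKER_TAG\n' > "$workdir/Dockerfile"

# Hypothetical image name. Any stray trailing character (such as a
# period) would become part of the value, so the assignment ends at the tag.
SASDS_DOCKER_TAG=registry.example.com/sas-viya-single:example

# Same substitution as in the guide. "#" is the delimiter, so the "/"
# characters in the image name need no escaping.
sed -Ei.bak "s#SASDS_DOCKER_TAG#$SASDS_DOCKER_TAG#g" "$workdir/Dockerfile"

cat "$workdir/Dockerfile"        # the rewritten FROM line
cat "$workdir/Dockerfile.bak"    # the original, kept as a backup by -i.bak
```

Because `-i.bak` keeps a copy of the original file, the substitution can be undone by restoring Dockerfile.bak.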

Test the Docker Image Locally

Before pushing the Docker image to your Docker Registry, it is a good idea to test it locally first. There are two modes to test:

Interactive (SAS Studio)

Shell Command

docker run -p 80:8888 -u domino:domino -w /mnt -v $PWD/tests:/mnt -it $DOMINO_SASDS_DOCKER_TAG /var/opt/workspaces/sasds/start

A couple of minutes after you launch the interactive SAS Studio container, you should see the message "SAS Studio is now running". At that point you can visit http://localhost/SASStudio/start.html in your web browser to test SAS Studio.
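
Instead of watching the container output, you could poll the SAS Studio URL from a second terminal until it answers. This is a minimal sketch: `wait_for_url` is a hypothetical helper, and it assumes that the start page returns a successful HTTP status once SAS Studio is ready.

```shell
# Hypothetical helper: poll a URL until it responds or the attempts run out.
#   $1 = URL, $2 = maximum number of attempts, $3 = seconds between attempts
wait_for_url() {
    wfu_try=0
    while [ "$wfu_try" -lt "$2" ]; do
        # -f makes curl exit non-zero on HTTP errors; -s hides progress output.
        if curl -fs -o /dev/null "$1"; then
            return 0
        fi
        wfu_try=$((wfu_try + 1))
        sleep "$3"
    done
    return 1
}

# Example: check every 10 seconds, up to 30 times (about 5 minutes).
#   wait_for_url http://localhost/SASStudio/start.html 30 10 && echo "SAS Studio is up"
```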

Batch

Shell Command

SAS_BATCH_PROGRAM=PROGRAM.SAS
docker run -u domino:domino -w /mnt -v $PWD/tests:/mnt -it $DOMINO_SASDS_DOCKER_TAG run_sas.sh $SAS_BATCH_PROGRAM

Replace PROGRAM.SAS above with the name of your test SAS program.
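
If you do not have a test program at hand, a trivial one is enough to exercise batch mode. The sketch below writes such a program into the tests directory that the docker run command mounts at /mnt. The file name hello.sas is hypothetical, and it is an assumption that run_sas.sh resolves the program name relative to the working directory /mnt.

```shell
# Create the directory that the docker run command mounts at /mnt.
mkdir -p tests

# Write a trivial SAS program. The DATA _NULL_ step creates no data set
# and writes one line to the SAS log.
cat > tests/hello.sas <<'EOF'
data _null_;
    put "Hello from SAS in Domino";
run;
EOF

# Name of the program to pass to the batch command (hypothetical file name).
SAS_BATCH_PROGRAM=hello.sas
```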

Push the Domino-Integrated SAS Data Science Docker Image to a Docker Registry

The final step is to push the Domino-integrated SAS Data Science Docker image to a Docker Registry. This Docker Registry will be later used to pull the Docker image into your Domino Data Lab environment.

Shell Command

docker push $DOMINO_SASDS_DOCKER_TAG

Here, $DOMINO_SASDS_DOCKER_TAG holds the Docker Registry image name and tag (NAME:TAG) that you set in the Integrate the SAS Data Science Docker Image with Domino step.

Work with your Domino Data Lab technical account team on the best method to pull the Docker image into your Domino Data Lab environment.
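
Because the image name lives in a shell variable, a push from a new terminal session can run with an empty value. A small guard can catch that before Docker is called. This is a sketch only; `push_image` is a hypothetical helper.

```shell
# Hypothetical helper: refuse to push when the image name is empty,
# for example in a new shell where DOMINO_SASDS_DOCKER_TAG was never set.
push_image() {
    if [ -z "${1:-}" ]; then
        echo "error: image name is empty; set DOMINO_SASDS_DOCKER_TAG first" >&2
        return 1
    fi
    docker push "$1"
}

# Example:
#   push_image "$DOMINO_SASDS_DOCKER_TAG"
```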

Configure the SAS Data Science Compute Environment in Domino

Congratulations, you are near the end of the installation process. The last step is to configure your Compute Environment in your Domino Data Lab environment.

  1. In your Domino Data Lab environment, navigate to the Domino Compute Environments page and create a new Compute Environment.


  2. Set the "Custom Image" location to your Docker Registry image. For the Custom Image URL, use the Docker Registry image URL that you created in the Push the Domino-Integrated SAS Data Science Docker Image to a Docker Registry step.


  3. Create a Pluggable Workspace for SAS Studio in your Compute Environment

    Pluggable Workspace

    sasds:
      title: "SAS Data Science"
      iconUrl: "https://upload.wikimedia.org/wikipedia/commons/1/10/SAS_logo_horiz.svg"
      start: [ "/var/opt/workspaces/sasds/start" ]
      httpProxy:
        internalPath: "/{{ownerUsername}}/{{projectName}}/{{sessionPathComponent}}/{{runId}}/start.html"
        port: 8888
        rewrite: false
        requireSubdomain: false

  4. When you are done defining the Pluggable Workspace, click the Build button at the bottom of the Compute Environment page to finalize your SAS Data Science configuration for Domino Data Lab

Maintenance and License Updates

The easiest way to keep your SAS Data Science image up to date is to repeat the steps in this guide whenever a new release of SAS Data Science is available. Repeat the same process when you need to update a license file during renewals.

Troubleshooting

SAS Studio Timeout

By default, SAS Studio logs the user out after 30 minutes. After that, no further development can be done in the session, and changes that were not written to the filesystem cannot be saved.

We recommend setting the timeout to a high value, for example 24 hours.

In the SAS Data Science Compute Environment in Domino, set the following in the Dockerfile:

File: /opt/sas/viya/config/etc/sysconfig/sasstudio.conf

Setting:

export java_global_option_server_servlet_session_timeout="-Dserver.servlet.session.timeout=1440m"

This is a Spring Boot 2.0 property rather than a SAS Studio property. Append 'm' to specify minutes; a number with no suffix is interpreted as seconds.

NB: this will be baked into the initial image build in future releases.
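
One possible way to apply the setting is a small shell function that appends the export line to the configuration file only if no session timeout is configured yet. This is a sketch under assumptions: `set_studio_timeout` is a hypothetical helper, and how you invoke it (for example from a RUN instruction in the Compute Environment's Dockerfile) depends on your setup.

```shell
# Hypothetical helper: append the session-timeout export to a SAS Studio
# config file, unless a session-timeout setting is already present.
#   $1 = path to sasstudio.conf, $2 = timeout in minutes
set_studio_timeout() {
    if grep -q 'server.servlet.session.timeout' "$1" 2>/dev/null; then
        return 0    # already configured; leave the file unchanged
    fi
    printf 'export java_global_option_server_servlet_session_timeout="-Dserver.servlet.session.timeout=%sm"\n' "$2" >> "$1"
}

# Example (1440 minutes = 24 hours):
#   set_studio_timeout /opt/sas/viya/config/etc/sysconfig/sasstudio.conf 1440
```

Checking for an existing entry first keeps the file from accumulating duplicate lines when the image is rebuilt on top of a layer that already contains the setting.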

SAS Studio Tabs Lost after Session Timeout

To prevent tabs being lost after losing connection, configure the following option in Preferences.

[Image: SAS Studio Preferences]

Configure ODBC connections

Ensure that the LD_LIBRARY_PATH is set first, before individual ODBC libraries, as per the example below:

export SASINSIDE=/sasinside/odbc

export ODBCINI=/sasinside/odbc.ini

export ODBCINST=/sasinside/odbcinst.ini

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${SASINSIDE}/lib:${SASINSIDE}/lib/snowflake_odbc/lib

export SIMBAINI=/sasinside/odbc/lib/snowflake_odbc/lib/simba.snowflake.ini

ERROR: Failed to load the Apache Parquet support extension

This error can occur when reading Parquet files if LD_LIBRARY_PATH has not been set correctly. See Configure ODBC connections.
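
When troubleshooting library loading problems such as these, a quick first check is whether every directory on LD_LIBRARY_PATH actually exists inside the container. The sketch below is one way to do that; `missing_lib_dirs` is a hypothetical helper and not part of SAS or Domino.

```shell
# Hypothetical helper: print each directory in a colon-separated path
# (LD_LIBRARY_PATH by default) that does not exist on disk.
missing_lib_dirs() {
    printf '%s\n' "${1:-${LD_LIBRARY_PATH:-}}" | tr ':' '\n' | while IFS= read -r dir; do
        [ -z "$dir" ] && continue          # skip empty entries such as a leading ":"
        [ -d "$dir" ] || printf '%s\n' "$dir"
    done
}

# Example: run inside the workspace or container after the exports above.
#   missing_lib_dirs
```

An empty result means every listed directory exists. It does not prove that the right libraries are inside them.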

Copyright © 2022 Domino Data Lab. All rights reserved.