Exporting Control Center data with the API
The Control Center interface in Domino provides many different views on deployment usage, broken down by hardware tier, project, or user. For a more detailed, custom analysis, Domino administrators can use the API to export Control Center data and examine it with Domino’s data science features or with external business intelligence applications.
The endpoint that serves this data is /v4/gateway/runs/getByBatchId.
Click through to read the REST documentation on this endpoint, or see below for a detailed description plus examples.
API keys
To make an API call, you’ll need the API key for your account. Accessing Control Center data for the full deployment requires an admin account. Once you’re logged in as an admin, click your username at the bottom left, then click Account Settings.
Click API Key in the settings menu to jump to the API Key panel. Copy the displayed key and keep it handy. You’ll need it to make requests to the API.
Note that anyone bearing this key could authenticate to the Domino API as you. Treat it like a sensitive password.
Using the data gateway endpoint
Here’s a basic call to the data export endpoint, executed with cURL:
curl --include \
-H "X-Domino-Api-Key: <your-api-key>" \
'https://<your-domino-url>/v4/gateway/runs/getByBatchId'
By default, the endpoint starts with the oldest available run data, beginning from January 1st, 2018. Older data is not available. The endpoint also has a default limit of 1000 runs’ worth of data per request. As written, the call above returns data on the oldest 1000 runs available.
To try out this example, fill in <your-api-key> and <your-domino-url> in the command above.
The standard JSON response object you receive will have the following schema:
{
  "runs": [
    {
      "batchId": "string",
      "runId": "string",
      "title": "string",
      "command": "string",
      "status": "string",
      "runType": "string",
      "userName": "string",
      "userId": "string",
      "projectOwnerName": "string",
      "projectOwnerId": "string",
      "projectName": "string",
      "projectId": "string",
      "runDurationSec": 0,
      "hardwareTier": "string",
      "hardwareTierCostCurrency": "string",
      "hardwareTierCostAmount": 0,
      "queuedTime": 0,
      "startTime": 0,
      "endTime": 0,
      "totalCostCurrency": "string",
      "totalCostAmount": 0
    }
  ],
  "nextBatchId": "string"
}
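Once a response has been parsed, the fields in the schema above can be aggregated directly. The following is a minimal sketch, not part of the Domino API: the helper name total_duration_by_tier is hypothetical, and the sample data is made up purely to mirror the schema.

```python
from collections import defaultdict


def total_duration_by_tier(response):
    """Sum runDurationSec per hardwareTier for one parsed response object."""
    totals = defaultdict(int)
    for run in response.get("runs", []):
        totals[run["hardwareTier"]] += run["runDurationSec"]
    return dict(totals)


# Made-up sample shaped like the schema above; the values are illustrative only.
sample = {
    "runs": [
        {"runId": "r1", "hardwareTier": "small", "runDurationSec": 120},
        {"runId": "r2", "hardwareTier": "large", "runDurationSec": 300},
        {"runId": "r3", "hardwareTier": "small", "runDurationSec": 60},
    ],
    "nextBatchId": "b4",
}

summary = total_duration_by_tier(sample)
print(summary)
```

The same pattern works for other numeric fields such as totalCostAmount, grouped by projectName or userName.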
Each run recorded by the Control Center gets a batchId, which is an incrementing field that can be used as a cursor to fetch data in multiple batches. In the response above, the runs array is followed by a nextBatchId field that points to the next run that would have been included.
You can use that ID as a query parameter in a subsequent request to get the next batch:
curl --include \
-H "X-Domino-Api-Key: <your-api-key>" \
'https://<your-domino-url>/v4/gateway/runs/getByBatchId?batchId=<your-batchId-here>'
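The cursor pattern described above can be sketched as a small Python generator. This is an illustration under stated assumptions rather than an official client: iter_runs and fetch_page are hypothetical names, and the stop condition (an empty batch, a missing nextBatchId, or a cursor that does not advance) is an assumption about how the end of the data shows up. The fake pages stand in for real HTTP responses.

```python
def iter_runs(fetch_page, max_pages=None):
    """Yield runs from successive batches by following nextBatchId.

    fetch_page(batch_id) must return one parsed response object;
    batch_id is None for the first request.
    """
    batch_id = None
    pages = 0
    while max_pages is None or pages < max_pages:
        page = fetch_page(batch_id)
        runs = page.get("runs", [])
        yield from runs
        pages += 1
        next_id = page.get("nextBatchId")
        # Assumed stop condition: nothing returned, no cursor, or no progress.
        if not runs or not next_id or next_id == batch_id:
            break
        batch_id = next_id


# Made-up pages keyed by the batchId that would request them.
FAKE_PAGES = {
    None: {"runs": [{"runId": "r1"}, {"runId": "r2"}], "nextBatchId": "b3"},
    "b3": {"runs": [{"runId": "r3"}], "nextBatchId": "b4"},
    "b4": {"runs": [], "nextBatchId": "b4"},
}


def fake_fetch(batch_id):
    return FAKE_PAGES[batch_id]


run_ids = [run["runId"] for run in iter_runs(fake_fetch)]
print(run_ids)
```

In real use, fetch_page would issue the GET request shown above, sending the X-Domino-Api-Key header and adding the batchId query parameter when batch_id is not None.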
You can also request the data as CSV by including an Accept: text/csv header. In a Unix shell, you can write the response to a file with the > operator. This is a quick way to get data suitable for import into analysis tools. Note that --include is omitted here, because it would also write the HTTP response headers into the file:
curl \
-H "X-Domino-Api-Key: <your-api-key>" \
-H 'Accept: text/csv' \
'https://<your-domino-url>/v4/gateway/runs/getByBatchId' > your_file.csv
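The same requests can be prepared from Python. The sketch below uses only the standard library and builds the request without sending it. The function name build_export_request and the example host and key are hypothetical. The endpoint path, the X-Domino-Api-Key header, the batchId parameter, and the Accept: text/csv header come from the cURL examples above.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "/v4/gateway/runs/getByBatchId"


def build_export_request(base_url, api_key, batch_id=None, as_csv=False):
    """Build (but do not send) a request to the data export endpoint."""
    url = base_url.rstrip("/") + ENDPOINT
    if batch_id is not None:
        # Continue from a cursor returned as nextBatchId by an earlier call.
        url += "?" + urlencode({"batchId": batch_id})
    headers = {"X-Domino-Api-Key": api_key}
    if as_csv:
        headers["Accept"] = "text/csv"
    return Request(url, headers=headers)


# Placeholder host and key, for illustration only.
req = build_export_request(
    "https://domino.example.com", "not-a-real-key", batch_id="abc123", as_csv=True
)
print(req.full_url)
```

Passing the resulting object to urllib.request.urlopen would perform the call. Keeping construction separate makes it easy to inspect the URL and headers first.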
Example: Getting all data
The code below is a simple Python script that fetches all Control Center data, from the earliest available up to a configurable end date, and writes it to a CSV file. To fetch all available historical data, set last_date to the date of the last known completed run.
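import os
from datetime import datetime, timedelta

import pandas as pd
import requests

URL = "https://<your-domino-url>/v4/gateway/runs/getByBatchId"
headers = {'X-Domino-Api-Key': '<your-api-key>'}
BATCH_SIZE = 1000  # default number of runs returned per request

# Date of the last run to include; one day is added so the whole day is covered.
last_date = 'YYYY-MM-DD'
last_date = datetime.strftime(datetime.strptime(last_date, '%Y-%m-%d') + timedelta(days=1), '%Y-%m-%d')

# Start from a clean output file.
try:
    os.remove('output.csv')
except FileNotFoundError:
    pass

batch_ID_param = ""
first_batch = True
while True:
    batch = requests.get(url=URL + batch_ID_param, headers=headers)
    batch.raise_for_status()
    parsed = batch.json()
    df = pd.DataFrame(parsed['runs'])
    if df.empty:
        break
    # NOTE: this comparison assumes endTime is returned as a sortable date string;
    # convert it first if your deployment returns numeric timestamps.
    in_range = df[df.endTime <= last_date]
    # Write the CSV header only once, with the first batch.
    in_range.to_csv('output.csv', mode="a", index=False, header=first_batch)
    first_batch = False
    # Stop on a short (final) batch or once runs past the end date appear.
    if len(df.index) < BATCH_SIZE or len(df.index) > len(in_range.index):
        break
    batch_ID_param = "?batchId=" + str(parsed['nextBatchId'])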
Running a script like this periodically lets you import fresh data into your tools for custom analysis. You can work with the data in a Domino project, or make it available to third-party tools like Tableau.