Export Compute and Spend data with the API

The Compute and Spend report provides several views about deployment usage, broken down by hardware tier, project, or user. However, if you want to do a more detailed, custom analysis, you can use the API to export Control Center data for examination with Domino’s data science features or external business intelligence applications.

The endpoint that serves this data is /v4/gateway/runs/getByBatchId.

See the REST API documentation for this endpoint, or continue reading for a detailed description and examples.

Prerequisites

  • The API key for your account.

  • An administrator account to access the full deployment’s Control Center data.

Get the API key

  1. Log in as an administrator, then click Account > Account Settings.

  2. Under Account Settings, click API Key. If you have not generated a key before, click Regenerate to create one.

  3. Copy the key and store it carefully as you will need it to make requests to the API.

    Caution
    Anyone with this key can authenticate to the Domino API as you, so treat it like a sensitive password.

Domino recommends regenerating your API key every 60 days to maintain security best practices.

Use the data gateway endpoint

The following is a basic call to the data export endpoint, executed with curl:

curl --include -H "X-Domino-Api-Key: <your-api-key>" \
'https://<your-domino-url>/v4/gateway/runs/getByBatchId'

By default, the endpoint starts with the oldest available run data, which dates back to January 1st, 2018; older data is not available. Each request also returns a default limit of 1,000 runs. As written, the preceding call returns data on the oldest 1,000 runs available.

To try this example, replace <your-api-key> and <your-domino-url> in the previous command with your own values.
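The same request can be made from Python with the requests library. The following is a minimal sketch, assuming only the endpoint described above; the export_request and fetch_batch helper names are illustrative:

```python
import requests


def export_request(domino_url, api_key, batch_id=None):
    """Build the URL, query parameters, and headers for one export call."""
    url = f"{domino_url}/v4/gateway/runs/getByBatchId"
    params = {"batchId": batch_id} if batch_id else {}
    headers = {"X-Domino-Api-Key": api_key}
    return url, params, headers


def fetch_batch(domino_url, api_key, batch_id=None):
    """Fetch one batch of up to 1,000 runs as a parsed dict."""
    url, params, headers = export_request(domino_url, api_key, batch_id)
    response = requests.get(url, params=params, headers=headers)
    response.raise_for_status()
    return response.json()  # {"runs": [...], "nextBatchId": "..."}
```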

The standard JSON response object you receive has the following schema:

{
  "runs": [
    {
      "batchId": string,
      "runId": string,
      "title": string,
      "command": string,
      "status": string,
      "runType": string,
      "userName": string,
      "userId": string,
      "projectOwnerName": string,
      "projectOwnerId": string,
      "projectName": string,
      "projectId": string,
      "runDurationSec": integer,
      "hardwareTier": string,
      "hardwareTierCostCurrency": string,
      "hardwareTierCostAmount": number,
      "queuedTime": date-time ,
      "startTime": date-time,
      "endTime": date-time,
      "totalCostCurrency": string,
      "totalCostAmount": number,
      "computeClusterDetails : {
        "computeClusterType": string,
        "masterHardwareTierId": string,
        "masterHardwareTierCostPerMinute": number,
        "workerCount": integer,
        "workerHardwareTierId": string,
        "workerHardwareTierCostPerMinute": number
      }
    }
  ],
  "nextBatchId": string
}
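Because each run entry carries cost fields alongside its identifiers, simple aggregations fall out directly once a response is parsed. The following is a minimal sketch, assuming a response shaped like the schema above; the spend_by_tier function and the sample values are illustrative:

```python
from collections import defaultdict


def spend_by_tier(response):
    """Sum totalCostAmount per hardwareTier across the runs in one batch."""
    totals = defaultdict(float)
    for run in response["runs"]:
        totals[run["hardwareTier"]] += run["totalCostAmount"]
    return dict(totals)


# Illustrative response fragment with only the fields this sketch uses.
sample = {
    "runs": [
        {"hardwareTier": "small-k8s", "totalCostAmount": 0.25},
        {"hardwareTier": "small-k8s", "totalCostAmount": 0.75},
        {"hardwareTier": "gpu-k8s", "totalCostAmount": 3.10},
    ],
    "nextBatchId": "1003",
}
print(spend_by_tier(sample))  # {'small-k8s': 1.0, 'gpu-k8s': 3.1}
```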

The Control Center assigns each recorded run a batchId, an incrementing field that can be used as a cursor to fetch data in multiple batches. In the previous response, after the array of run objects, the nextBatchId parameter points to the next run to include. Use that ID as a query parameter in a subsequent request to get the next batch:

curl --include \
-H "X-Domino-Api-Key: <your-api-key>" \
'https://<your-domino-url>/v4/gateway/runs/getByBatchId?batchId=<your-batchId-here>'
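This cursor loop can be sketched as a generator that follows nextBatchId until a short batch signals the end of the data. The iter_batches name and its fetch callable are illustrative; fetch(batch_id) is assumed to return a parsed response shaped like the JSON above:

```python
def iter_batches(fetch, batch_size=1000):
    """Yield lists of runs, following nextBatchId until a short batch.

    `fetch(batch_id)` is assumed to return a dict shaped like the
    endpoint's JSON response: {"runs": [...], "nextBatchId": "..."}.
    """
    batch_id = None
    while True:
        response = fetch(batch_id)
        yield response["runs"]
        if len(response["runs"]) < batch_size:
            break  # fewer than batch_size runs means no more data
        batch_id = response["nextBatchId"]
```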

You can also include a header with Accept: text/csv to request the data as CSV. On the Unix shell, you can write the response to a file with the > operator. This is a quick way to get data suitable for import into analysis tools:

curl --include \
-H "X-Domino-Api-Key: <your-api-key>" \
-H 'Accept: text/csv' \
'https://<your-domino-url>/v4/gateway/runs/getByBatchId' > your_file.csv

Get all Workspaces

Use the following API call to retrieve all workspaces.

GET /controlCenter/utilization/runsPerDay

Note
The com.cerebro.domino.controlCenter.cacheTimeToLiveInMinutes configuration key sets the cache refresh interval to 30 minutes by default. This might cause a delay in retrieving some workspaces.

curl -X GET "https://<your-domino-url>/v4/controlCenter/utilization/runs?startDate=20220503&endDate=20220503" \
-H "accept: application/json"
The endpoint accepts the following query parameters:

  • projectId

  • startingUserId

  • organizationId

  • hardwareTierId

  • startDate: must be in YYYYMMDD format

  • endDate: must be in YYYYMMDD format
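The same utilization call can be sketched from Python, passing the date range as query parameters. The valid_yyyymmdd helper is illustrative, and the URL assumes the same deployment placeholder used in the earlier examples:

```python
import re

import requests


def valid_yyyymmdd(value):
    """Return True if value looks like a YYYYMMDD date string."""
    return bool(re.fullmatch(r"\d{8}", value))


def fetch_utilization(domino_url, api_key, start_date, end_date):
    """Call the utilization endpoint for a YYYYMMDD date range."""
    if not (valid_yyyymmdd(start_date) and valid_yyyymmdd(end_date)):
        raise ValueError("startDate and endDate must be in YYYYMMDD format")
    response = requests.get(
        f"{domino_url}/v4/controlCenter/utilization/runs",
        params={"startDate": start_date, "endDate": end_date},
        headers={"X-Domino-Api-Key": api_key, "accept": "application/json"},
    )
    response.raise_for_status()
    return response.json()
```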

Example: Get all data

The following Python script fetches all Control Center data from the earliest available run through a configurable end date, and writes it to a CSV file. Set last_date to the date of the last known completed run to fetch all available historical data.

import os
from datetime import datetime, timedelta

import pandas as pd
import requests

URL = "https://<your-domino-url>/v4/gateway/runs/getByBatchId"
headers = {'X-Domino-Api-Key': '<your-api-key>'}
last_date = 'YYYY-MM-DD'  # date of the last known completed run

# Shift the cutoff forward one day so runs that ended on last_date are included.
last_date = datetime.strftime(
    datetime.strptime(last_date, '%Y-%m-%d') + timedelta(days=1),
    '%Y-%m-%d',
)

# Start from a clean output file.
try:
    os.remove('output.csv')
except FileNotFoundError:
    pass

batch_id_param = ""
while True:
    batch = requests.get(url=URL + batch_id_param, headers=headers)
    parsed = batch.json()
    batch_id_param = "?batchId=" + parsed['nextBatchId']
    df = pd.DataFrame(parsed['runs'])
    # Append this batch, writing the CSV header only for the first batch.
    df[df.endTime <= last_date].to_csv(
        'output.csv',
        mode='a',
        index=False,
        header=not os.path.exists('output.csv'),
    )
    # Stop on a short batch, or when runs past the cutoff date appear.
    if len(df.index) < 1000 or len(df.index) > len(df[df.endTime <= last_date].index):
        break

You can run a script like this periodically to import fresh data into your tools for custom analysis. Work with the data in a Domino project, or make it available to third-party tools like Tableau.
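As a starting point for that analysis, the following is a minimal sketch that totals spend per project from an exported CSV. The project_spend name and the sample rows are illustrative; the column names follow the response schema above:

```python
import io

import pandas as pd


def project_spend(csv_source):
    """Total totalCostAmount per projectName from an exported CSV."""
    df = pd.read_csv(csv_source)
    totals = df.groupby("projectName")["totalCostAmount"].sum()
    return totals.sort_values(ascending=False)


# Illustrative two-project export fragment.
sample_csv = io.StringIO(
    "projectName,totalCostAmount\n"
    "churn-model,1.50\n"
    "churn-model,0.50\n"
    "forecasting,0.25\n"
)
print(project_spend(sample_csv))
```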