Set up Cohort Analysis for Model Monitor

Cohort Analysis is a job that runs on a regression model’s prediction data and ground truth data. It provides information about your model, including feedback about where it might be underperforming. The result is a PDF report and raw data in JSON format.

To use Cohort Analysis, you must register ground truth data for each regression model that you want to analyze. Registering the data for a model starts a series of ingest and analysis jobs.

Note

This feature is only available for regression models with numerical features.

Set up Cohort Analysis for this Model Monitor

Prerequisites

Set up Cohort Analysis

  1. From the navigation pane, click Model Monitor.

  2. Click the name of the model for which you want to set up Cohort Analysis.

  3. Click Model Quality.

  4. Click Register Ground Truth Data > Upload Ground Truth Config.

  5. In the Register Ground Truth window, click Register with Cohort Analysis.

Check the status of the data ingestion

After you set up the data analysis for a regression model, you might want to check its status.

  1. From the navigation pane, click Deploy > Model Monitor.

  2. Click the name of the model for which you set up Cohort Analysis.

  3. Click Ingest History and check whether the status is Done.

After the data is ingested, Domino finds underperforming cohorts and the features that make those cohorts distinct from the rest of the data.
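
To make the idea of an underperforming cohort concrete, the following is a minimal sketch and not Domino's implementation. It assumes each row already carries a cohort label, uses mean absolute error as the quality measure, and flags cohorts whose error is above the overall error. The function names, the choice of metric, and the sample data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def cohort_errors(cohort_ids, predictions, ground_truth):
    """Return the mean absolute error for each cohort label."""
    errors = defaultdict(list)
    for cohort, pred, truth in zip(cohort_ids, predictions, ground_truth):
        errors[cohort].append(abs(pred - truth))
    return {cohort: mean(vals) for cohort, vals in errors.items()}


def underperforming(cohort_mae, overall_mae):
    """Return cohorts whose error exceeds the overall error, worst first."""
    worse = [c for c, mae in cohort_mae.items() if mae > overall_mae]
    return sorted(worse, key=lambda c: cohort_mae[c], reverse=True)


# Hypothetical data: cohort labels, model predictions, and ground truth values.
cohort_ids = ["a", "a", "b", "b", "c", "c"]
predictions = [10.0, 12.0, 20.0, 25.0, 30.0, 31.0]
ground_truth = [11.0, 12.0, 28.0, 35.0, 30.0, 32.0]

per_cohort = cohort_errors(cohort_ids, predictions, ground_truth)
overall = mean(abs(p - t) for p, t in zip(predictions, ground_truth))
flagged = underperforming(per_cohort, overall)
print(per_cohort, flagged)
```

In this toy data only cohort "b" has an error above the overall average, so it is the only cohort flagged. How Domino forms cohorts and scores the features that distinguish them is not described here.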

Configure the Cohort Analysis

You can configure the cohort analysis for a model to customize the report.

Prerequisites

Configure the Cohort Analysis

  1. In the navigation pane, click Projects.

  2. Click the DominoCohortAnalysis project. Domino creates this project automatically as a private project under the account that owns the model’s project.

  3. In the navigation pane, click Code.

  4. Click config.yaml and then click Edit.

    You can configure the following parameters:

    min_k

    The minimum number of cohorts.

    max_k

    The maximum number of cohorts.

    max_samples_for_clustering

    The maximum number of samples to use to find cohorts.

    num_bins

    The number of bins to use to compute the feature histograms and the contrast score.

    max_num_top_cohorts_for_report

    The maximum number of cohorts to show in the Cohort Summary and Detailed Cohort Analysis sections of the Cohort Analysis report.

    max_num_top_cohorts_for_report

    The maximum number of features to show per cohort in the Detailed Cohort Analysis section of the Cohort Analysis report.

  5. Click Save.

  6. In the navigation pane, click Jobs.

  7. Select the cohort_analysis job and click Run to generate new results.
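
Before you save config.yaml, it can help to sanity-check the values you plan to use. The following is a minimal sketch, assuming the parameters have already been read into a Python dictionary (parsing the YAML file itself is left out). The function name `validate_cohort_config`, the rule that each value is a positive integer, and the sample values are assumptions for illustration; they are not Domino defaults or documented constraints.

```python
def validate_cohort_config(config):
    """Return a list of problems found in a cohort analysis config dict."""
    problems = []
    # Parameter names taken from the config.yaml description above.
    expected = [
        "min_k",
        "max_k",
        "max_samples_for_clustering",
        "num_bins",
        "max_num_top_cohorts_for_report",
    ]
    for key in expected:
        value = config.get(key)
        if value is None:
            continue  # Parameter not set; nothing to check.
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            problems.append(f"{key} must be a positive integer, got {value!r}")
    min_k, max_k = config.get("min_k"), config.get("max_k")
    # Assumed rule: the minimum number of cohorts cannot exceed the maximum.
    if isinstance(min_k, int) and isinstance(max_k, int) and min_k > max_k:
        problems.append("min_k must not exceed max_k")
    return problems


# Hypothetical values for illustration only; they are not Domino defaults.
example = {
    "min_k": 3,
    "max_k": 10,
    "max_samples_for_clustering": 5000,
    "num_bins": 20,
    "max_num_top_cohorts_for_report": 5,
}
print(validate_cohort_config(example))
```

An empty list means the sketch found no problems with the values it checks.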

For additional Cohort Analysis options, see the Administration Guide.