Domino endpoint drift detection

To monitor a model’s data drift you must ingest training data. Training data is the data used while constructing a model that leverages machine learning techniques. Training data records must include both model input values and a previously-known corresponding output for each input value combination from which the model can learn.

Your model’s training data is used to analyze data drift. This topic describes how to register a training set for monitoring purposes.

Prerequisites

Before you proceed, you must create a training set that was used to build a model (.pkl file), and then that model was deployed into a Domino endpoint.

Select the training set

  1. In the top navigation pane, click Deploy > Endpoint.

  2. Select the endpoint you would like to configure and navigate to the Grafana Monitoring tab.

  3. Click Configure Monitoring > Data.

  4. From the Configure Data window, select the Training Data and Version.

  5. Select Classification or Regression, depending on your model type. If your Training Set code includes prediction data defined in target_columns, select the model type that matches your Training Set:

    • If target_columns is a categorical column, select Classification.

    • If target_columns is a numerical column, select Regression.

  6. Click Save.

Note
When using the Configure Data window to register a model for monitoring, the default binning strategy will be used to create bins for both numerical and categorical variables. To change the binning strategy, including the number of bins to use for numerical or categorical variables, reach out to your Domino representative. See Supported binning methods for more details on binning.

Validate your Setup

Domino captures prediction data in a Domino Dataset in hourly batches.

Validate your setup without waiting for the Scheduled Check to run:

  1. On the Model API’s Monitoring tab, go to Configure Monitoring > Target Ranges.

  2. If the Configure Tests and Thresholds page populates with data, then your setup is complete. See Analyze Data Drift to learn how to configure tests and thresholds.