Prediction data is a combination of the inputs to the model and the predictions that the model outputs. Inputs are the feature values sent to the model API in API requests. When you incorporate a Domino-provided data capture library in your model API code, Domino automatically captures the prediction data.
The data ingestion client ships with the Domino Standard Environment (DSE), which includes the latest version of the client library. The client library records prediction data for deployed models.
To add the DominoDataCapture library to your model, add the following lines to your prediction code so that the capture logic runs when the model is deployed. See call a model for details.
In the navigation pane, click Projects.
In the navigation pane, click Workspaces.
Start the appropriate workspace.
Edit your prediction code to add the DataCaptureClient. See the following examples for Python and R:
    import datetime
    import uuid

    from domino_data_capture.data_capture_client import DataCaptureClient

    # Feature names and the values for a single prediction request.
    feature_names = ["dropperc", "mins", "consecmonths", "income", "age"]
    # Placeholder values; in a real model these are the inputs from the request.
    feature_values = ["dropperc", "mins", "consecmonths", "income", "age"]
    features_dict = dict(zip(feature_names, feature_values))

    predict_names = ["y"]
    # Placeholder value (assumption); replace with your model's prediction output.
    predict_values = [0]
    predict_dict = dict(zip(predict_names, predict_values))

    # Record eventID and current time
    event_id = uuid.uuid4()
    event_time = datetime.datetime.now(datetime.timezone.utc).isoformat()

    # Custom metadata I want to track for this event
    metadata_names = ["cohort"]
    metadata_values = ["cohort_id"]

    prediction_probability = [0.1, 0.9]
    sample_weight = 0.3

    # Create the client with the names, then capture the values for this event.
    data_capture_client = DataCaptureClient(feature_names, predict_names, metadata_names)
    data_capture_client.capturePrediction(
        feature_values,
        predict_values,
        metadata_values=metadata_values,
        event_id=event_id,
        timestamp=event_time,
        prediction_probability=prediction_probability,
        sample_weight=sample_weight,
    )
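The following table explains the parameters of the DataCaptureClient constructor used in the example above:

| Parameter | Description |
| --- | --- |
| feature_names | The feature names against which the user will calculate the prediction. |
| predict_names | The prediction names collection. This value must be an array. |
| metadata_names | Collection of any metadata keys to pass. |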
The following table explains the parameters of the capturePrediction method used in the example above:

| Parameter | Description |
| --- | --- |
| feature_values | The feature values against which the user will calculate the prediction. |
| predict_values | The prediction values collection. This value must be an array. |
| metadata_values | Collection of any metadata values to pass. |
| event_id | A unique record ID for each prediction. If not provided, the client library generates one. |
| timestamp | The event timestamp. If not provided, the client library generates it. |
| prediction_probability | The collection of prediction probabilities. This value must be an array. |
| sample_weight | The collection of associated sample weights. This value must be an array. |
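Because names and values are paired by position and the prediction values must be an array, it can help to check a record before handing it to the client. The following is a minimal sketch of such a check. `build_capture_args` is a hypothetical helper written for this illustration and is not part of the Domino library. It also generates the event ID and timestamp itself, which is optional, since the client library generates them when they are not provided.

```python
import datetime
import uuid


def build_capture_args(feature_names, feature_values, predict_names,
                       predict_values, event_id=None, timestamp=None):
    """Validate one prediction record and fill in optional fields.

    Hypothetical helper for illustration; not part of the Domino library.
    """
    # Names and values are paired by position, so their lengths must match.
    if len(feature_names) != len(feature_values):
        raise ValueError("feature_names and feature_values differ in length")

    # Prediction values must be an array; wrap a bare scalar in a list.
    if not isinstance(predict_values, (list, tuple)):
        predict_values = [predict_values]
    if len(predict_names) != len(predict_values):
        raise ValueError("predict_names and predict_values differ in length")

    return {
        "feature_values": list(feature_values),
        "predict_values": list(predict_values),
        "event_id": event_id or str(uuid.uuid4()),
        "timestamp": timestamp
        or datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }


args = build_capture_args(["age", "income"], [42, 55000], ["y"], 1)
print(args["predict_values"])  # [1]
```

Generating the event ID in your own code means you can also return it to the caller alongside the prediction.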
You can use the DominoDataCapture library to capture prediction data in the model API or in developer mode. Use developer mode to test the library calls and verify that the data capture will work, without actually capturing data. After verifying that the data capture will work, you must invoke the model API code in a workspace (for example, an IPython notebook), where you can review the output of the library calls, validate, and debug the code.
Step 1: Run DataCaptureClient in developer mode
Open a Python Prediction Client workspace.
Go to New > Python3.
Add the following lines and update them for your model:
Import the predict function:
from python_model_with_logging import *
Invoke the predict method with parameters:
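As a rough illustration of this step, the sketch below defines a small `predict` function and calls it the way you would from a notebook cell. `RecordingCaptureClient` is a hypothetical in-memory stand-in, not the Domino client; it only mirrors the `capturePrediction` call shape used in the examples on this page so that the snippet runs without the library. The feature subset and the threshold rule are also assumptions. In a real workspace you would import `predict` from your own model file instead.

```python
import datetime
import uuid


class RecordingCaptureClient:
    """Hypothetical stand-in for DataCaptureClient; keeps records in memory."""

    def __init__(self, feature_names, predict_names, metadata_names=None):
        self.feature_names = feature_names
        self.predict_names = predict_names
        self.metadata_names = metadata_names or []
        self.records = []

    def capturePrediction(self, feature_values, predict_values,
                          event_id=None, timestamp=None, **extra):
        # Pair names with values by position, as in the examples on this page.
        record = dict(zip(self.feature_names, feature_values))
        record.update(zip(self.predict_names, predict_values))
        record["event_id"] = str(event_id or uuid.uuid4())
        record["timestamp"] = timestamp or datetime.datetime.now(
            datetime.timezone.utc
        ).isoformat()
        self.records.append(record)


client = RecordingCaptureClient(["dropperc", "mins"], ["y"])


def predict(dropperc, mins):
    """Toy threshold rule (assumption), not a trained model."""
    feature_values = [dropperc, mins]
    predict_values = [1 if dropperc > 0.05 else 0]
    client.capturePrediction(feature_values, predict_values)
    return {"y": predict_values[0]}


# Invoke the predict function with parameters and inspect what was recorded.
result = predict(dropperc=0.1, mins=350)
print(result)
print(client.records[-1])
```

Printing the last record lets you confirm that each value lands under the intended name before you deploy the model.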
Step 2: Run DataCaptureClient in model API
The following are examples of models that use Domino data capture:
    import datetime
    import pickle
    import uuid

    import pandas as pd
    from sklearn import metrics

    from domino_data_capture.data_capture_client import DataCaptureClient

    feature_names = ['sepal.length', 'sepal.width', 'petal.length', 'petal.width']
    predict_names = ['variety']

    # Create the capture client once, when the model code is loaded.
    pred_client = DataCaptureClient(feature_names, predict_names)

    # Load the trained model from disk.
    with open("model.pkl", 'rb') as model_file:
        loaded_model = pickle.load(model_file)


    def predict_iris_variety(sepal_length, sepal_width, petal_length, petal_width, event_id):
        feature_values = [sepal_length, sepal_width, petal_length, petal_width]
        predict_values = loaded_model.predict([feature_values])

        # Record the prediction with the caller-supplied event ID and the current UTC time.
        event_time = datetime.datetime.now(datetime.timezone.utc).isoformat()
        pred_client.capturePrediction(feature_values, predict_values,
                                      event_id=event_id, timestamp=event_time)
        return dict(predict_value=predict_values)
See more examples of MLflow-supported models that use Domino data capture:
If you want to use a specific version of the client library, or enable client libraries in another environment:
In your Environment, click Edit Definition.
In the Dockerfile Instructions, add the following lines to enable the library:
    USER root
    RUN pip install domino-data-capture
    USER ubuntu
Select Full rebuild without cache and click Build.
From the navigation bar, click Model APIs.
Click New Model and create the model from the newly built image.
See Validate Your Setup to confirm that your prediction data is being captured.
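After the environment is rebuilt, you may want to confirm which version of the client library it contains. The following is a minimal sketch that uses only the Python standard library; `installed_version` is a hypothetical helper name, and the package name is taken from the `pip install` line above.

```python
from importlib import metadata


def installed_version(package="domino-data-capture"):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version:
    print(f"domino-data-capture {version} is installed")
else:
    print("domino-data-capture is not installed in this environment")
```

You can run this in a workspace that uses the environment before you publish a model from it.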
After you publish your model API and it is running, call the model API endpoint to capture prediction data.
Go to the model API that you want to test.
From the Tester tab, enter the values from your model’s schema.
Click Send. The Response field shows a prediction in the form of a key-value pair.
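The same call can also be made from code instead of the Tester tab. The following sketch builds the request with the Python standard library. The URL, the access token, the `{"data": ...}` body shape, and the use of basic authentication with the token are all assumptions for illustration; copy the exact values and request format from your model API's own calling instructions. `build_request` and `call_model` are hypothetical helper names.

```python
import base64
import json
import urllib.request


def build_request(url, token, inputs):
    """Build a POST request for a model API endpoint.

    The body shape and the auth scheme are assumptions; check them
    against your model API's calling instructions.
    """
    body = json.dumps({"data": inputs}).encode("utf-8")
    credentials = base64.b64encode(f"{token}:{token}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {credentials}",
        },
        method="POST",
    )


def call_model(url, token, inputs, timeout=30):
    """Send the request and return the decoded JSON response."""
    request = build_request(url, token, inputs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder URL and token; nothing is sent until call_model is invoked.
request = build_request(
    "https://domino.example.com/models/MODEL_ID/latest/model",
    "ACCESS_TOKEN",
    {"dropperc": 0.1, "mins": 350},
)
print(request.get_method(), request.full_url)
```

Each successful call to the endpoint runs your prediction code, so the capture client records one prediction per request.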
After the logged predictions are captured and processed by Domino, you can see a preview of the drift results. See Validate Your Setup for more information.
See the Administration Guide for configuration keys that tune the prediction data capture feature.