Model monitoring ingests and processes a model’s training, prediction, and ground truth data. Before the monitor can read this data, you must set up the data source from which it is read. After you set up a data source, it is available to all users and can be used with any model.
You can link multiple data sources to the model monitor, and create multiple instances of each data source type.
Note: Enable read and list access for each data source.
The Model Monitor supports data from the following sources:

- Amazon S3
- Azure Blob
- Azure Data Lake Gen 1
- Azure Data Lake Gen 2
- Google Cloud Storage
- HDFS
- Snowflake
- If you haven’t set up a data source, go to the Data page (from the navigation pane, go to Model Monitor > Monitoring Data Sources).
- Click Add Data Source.
- Complete the details as needed. The following describes source-specific configurations:
Amazon S3

Required fields:

- Data Source Name
- S3 Bucket Name
- S3 Region

Authentication: If the S3 bucket can be authenticated to using IAM roles, select the Load Credentials from IAM Role attached to the instance checkbox. Otherwise, enter an AWS Access Key and an AWS Secret Key.
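To confirm that the monitor’s credentials have the read and list access the note above calls for, a minimal boto3 sketch might look like the following; the bucket name and region are placeholders:

```python
import boto3

# Placeholder bucket and region; substitute the values you enter in the form.
# On an instance with an attached IAM role, boto3 picks up credentials
# automatically; otherwise it falls back to the configured access/secret keys.
s3 = boto3.client("s3", region_name="us-east-1")

# List access: enumerate a single object in the bucket.
resp = s3.list_objects_v2(Bucket="my-monitoring-bucket", MaxKeys=1)
contents = resp.get("Contents", [])
print("list access ok;", len(contents), "sample object(s)")

# Read access: fetch the sampled object, if the bucket is non-empty.
if contents:
    obj = s3.get_object(Bucket="my-monitoring-bucket", Key=contents[0]["Key"])
    obj["Body"].read(16)
    print("read access ok")
```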
Azure Blob Store

Required fields:

- Data Source Name
- Account Name
- Container

Authentication: Enter an Access Key or a SAS Token.
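A minimal azure-storage-blob sketch, assuming placeholder account and container names, showing how the Access Key or SAS Token string authenticates a list call:

```python
from azure.storage.blob import BlobServiceClient

# Placeholder account and container; use the values from the form.
# The credential can be either the storage Access Key or a SAS token string.
service = BlobServiceClient(
    account_url="https://mystorageaccount.blob.core.windows.net",
    credential="<access-key-or-sas-token>",
)

container = service.get_container_client("monitoring-data")
for blob in container.list_blobs():  # exercises list access
    print("first blob:", blob.name)
    break
```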
Azure Data Lake Gen 1

Required fields:

- Data Source Name
- Container

Authentication: Select the Authentication Type:

- Client Credentials. Then, enter:
  - Token Endpoint
  - Application ID
  - Secret Key
- Managed Identity. Then, you can enter an optional Port Number. This method applies when the Model Monitor is deployed on Azure VMs configured with service identities that can access Azure Data Lake.
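A minimal sketch of the Client Credentials option using the azure-datalake-store package; all values are placeholders, and the tenant ID below corresponds to the tenant segment of the Token Endpoint URL:

```python
from azure.datalake.store import core, lib

# Placeholder values mirroring the Client Credentials fields in the form.
token = lib.auth(
    tenant_id="<tenant-id>",            # from the Token Endpoint URL
    client_id="<application-id>",       # the Application ID field
    client_secret="<secret-key>",       # the Secret Key field
)

adls = core.AzureDLFileSystem(token, store_name="mydatalakestore")
print(adls.ls("/"))  # exercises list access on the store root
```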
Azure Data Lake Gen 2

Required fields:

- Data Source Name
- Account Name
- Container

Authentication: Select the Authentication Type:

- Shared Key.
- Client Credentials. Then, enter:
  - Endpoint
  - Client ID
  - Client Secret
- Username Password. Then, enter:
  - Endpoint
  - Username
  - Password
- Refresh Token. Then, enter:
  - Refresh Token
  - Client ID
- Managed Identity. This method applies when the Model Monitor has been deployed on Azure VMs configured with service identities that can access Azure Data Lake Gen2. Then, you can enter the following optional information:
  - Tenant ID
  - Client ID
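A minimal sketch of the Client Credentials flow against Gen 2, using azure-identity and azure-storage-filedatalake with placeholder account, container, and credential values:

```python
from azure.identity import ClientSecretCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Placeholder values mirroring the Client Credentials fields in the form.
credential = ClientSecretCredential(
    tenant_id="<tenant-id>",
    client_id="<client-id>",
    client_secret="<client-secret>",
)

service = DataLakeServiceClient(
    account_url="https://mystorageaccount.dfs.core.windows.net",
    credential=credential,
)

fs = service.get_file_system_client("monitoring-data")  # the Container field
for path in fs.get_paths():  # exercises list access
    print("first path:", path.name)
    break
```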
Google Cloud Storage

Required fields:

- Data Source Name
- Bucket

Authentication: JSON Key File.
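A minimal google-cloud-storage sketch that authenticates with the JSON key file and exercises list access; the key file path and bucket name are placeholders:

```python
from google.cloud import storage

# Placeholder key file and bucket; substitute your own values.
client = storage.Client.from_service_account_json("service-account-key.json")

bucket = client.bucket("my-monitoring-bucket")
for blob in client.list_blobs(bucket, max_results=1):  # exercises list access
    print("first blob:", blob.name)
```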
HDFS

Required fields:

- Data Source Name
- Host
- Port (optional)
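A minimal pyarrow sketch of connecting with the Host and Port values; the NameNode host is a placeholder, 8020 is a common default port, and a local Hadoop client (libhdfs) is required:

```python
from pyarrow import fs

# Placeholder NameNode host; 8020 is a common default port.
hdfs = fs.HadoopFileSystem(host="namenode.example.com", port=8020)

# Exercises list access on a placeholder directory.
for info in hdfs.get_file_info(fs.FileSelector("/data", recursive=False)):
    print(info.path)
```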
Snowflake

Required fields: All fields.

Authentication: Enter the following:

- Account URL, which uniquely identifies the Snowflake account in your organization.
- Your Username and Password.
- In Database, enter the name of the Snowflake database that contains the data.
- In Schema, enter the name of the active schema for the session.
- In Warehouse, enter the name of the compute cluster that provides processing resources in Snowflake.
- Select a Role.

Note: All fields must match Snowflake’s requirements for object identifiers.
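A minimal snowflake-connector-python sketch, with placeholder values for each form field, that verifies the session resolves the database, schema, warehouse, and role you entered; the account identifier is the portion of the Account URL before ".snowflakecomputing.com":

```python
import snowflake.connector

# Placeholder values mirroring the Snowflake form fields.
conn = snowflake.connector.connect(
    account="myorg-myaccount",
    user="<username>",
    password="<password>",
    database="MONITORING_DB",
    schema="PUBLIC",
    warehouse="MONITORING_WH",
    role="ANALYST",
)

cur = conn.cursor()
cur.execute(
    "SELECT CURRENT_DATABASE(), CURRENT_SCHEMA(), "
    "CURRENT_WAREHOUSE(), CURRENT_ROLE()"
)
print(cur.fetchone())  # confirms the session context matches the form
cur.close()
conn.close()
```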
- Click Add.