AutoML

Train, evaluate, and deploy machine learning models through a unified no-code interface.

Overview

AutoML is a Domino extension that enables data scientists and domain experts to build, evaluate, and deploy machine learning models through a streamlined, no-code interface.

Powered by AutoGluon, AutoML automates the end-to-end model training pipeline, from data profiling and feature engineering through model selection, hyperparameter tuning, and ensembling, so you can go from raw data to a production-ready model in minutes.

AutoML provides two primary workflows:

Data Exploration: Upload a dataset, review column statistics and distributions, examine correlations, assess data quality, and apply transformations before training.
Model Training: Configure and run an AutoGluon training job that automatically trains multiple model types, ranks them on a leaderboard, and produces deployment-ready artifacts.

You can access AutoML from the left navigation sidebar of any Domino project under the Extensions section.

Get started

Prerequisites

A Domino project with the AutoML extension enabled.
A dataset in CSV or Parquet format (maximum file size: 550 MB).
Alternatively, a file stored in a mounted Domino Dataset.

Access AutoML

Navigate to your Domino project.
In the left sidebar, scroll down to the Extensions section.
Click AutoML.

The AutoML landing page displays all existing training jobs with their name, type, status, best model, and creation date. From here you can start a new training job or explore your data.

Tip: Use the Explore Data button in the top-right corner of the AutoML landing page to profile and transform your data before creating a training job.

Data Exploration

The Data Exploration tool lets you analyze data quality, distributions, and correlations, and prepare transformations before training a model.

To open it, click Explore Data from the AutoML landing page.

Upload data

You can provide data in one of two ways:

Upload File: Drag and drop (or click to browse) a CSV or Parquet file. The maximum file size is 550 MB.
Domino Dataset: Select a file from a mounted Domino Dataset already connected to your project.

Once a file is loaded, Domino automatically profiles the data and takes you to the Data Exploration interface. The file name, column count, and row count are displayed at the top of the page. You can switch to a different file at any time by clicking Change File.

Data Preview

The Data Preview tab shows a sample of your raw data in table format. You can browse columns and rows to get a quick sense of the dataset’s structure and values. Use the Rows per page control to adjust how many rows are displayed at a time.

Column Analysis

The Column Analysis tab provides detailed profiling information for each column in your dataset. Select a column from the list on the left to view its profile on the right.

AutoML automatically detects each column’s semantic type, displayed as a colored tag beneath the column name. Detected types include:

Type	Description	Color
numeric	A column containing continuous numeric values (integers or floats).	Blue
monetary	A numeric column identified by name patterns related to financial values (e.g., `price`, `cost`, `amount`, `revenue`, `salary`).	Blue
binary	A numeric column with exactly two distinct values, commonly used as a classification target.	Purple
category	A column with a moderate number of distinct text or object values, or identified by name patterns (e.g., `type`, `class`, `status`, `group`).	Purple
categorical_numeric	A numeric column with fewer than 20 distinct values representing less than 5% of the total rows, treated as a categorical feature.	Purple
datetime	A column containing date or timestamp values, detected by data type or name patterns (e.g., `date`, `time`, `timestamp`, `created`, `updated`).	Green
text	A column containing long text strings (average length > 100 characters), or identified by name patterns (e.g., `description`, `comment`, `note`).	Orange
identifier	A column whose values are likely unique row identifiers, detected by name patterns (e.g., `id`, `uuid`, `guid`) or high cardinality. Identifiers are flagged with a warning and should typically be excluded from training.	Gray
boolean	A column containing true/false values, detected by data type or name patterns (e.g., `is_`, `has_`, `flag`).	Gray
unknown	A column whose type could not be confidently determined. Review these columns manually before training.	Gray
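
The value-based rules in the table above can be illustrated with a minimal sketch. It is not AutoML's actual detector: the function name `detect_semantic_type`, the order in which rules are checked, and the omission of the name-pattern rules are all assumptions made for illustration.

```python
def detect_semantic_type(values):
    """Classify one column from its values (simplified, value-based rules only)."""
    present = [v for v in values if v is not None]
    if not present:
        return "unknown"
    distinct = len(set(present))
    # bool is a subclass of int in Python, so exclude it from the numeric check.
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if is_numeric:
        if distinct == 2:
            return "binary"
        # Fewer than 20 distinct values that make up less than 5% of total rows.
        if distinct < 20 and distinct / len(values) < 0.05:
            return "categorical_numeric"
        return "numeric"
    if all(isinstance(v, bool) for v in present):
        return "boolean"
    if all(isinstance(v, str) for v in present):
        average_length = sum(len(v) for v in present) / len(present)
        return "text" if average_length > 100 else "category"
    return "unknown"


print(detect_semantic_type([0, 1, 1, 0]))
print(detect_semantic_type([1, 2, 3] * 100))
```

The real tool also uses column names (for example `id`, `price`, or `date`), so its result for a given column may differ from this sketch.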

Additional types are detected but displayed with default styling:

email: Columns with email-related names
phone: Columns with phone-related names
url: Columns with URL-related names
latitude/longitude: Geographic coordinate columns
percentage: Columns with percentage-related names (e.g., percent, pct, ratio, rate)
name: Columns with name-related patterns (e.g., name, title, label)
count: Columns with count-related names (e.g., count, num, qty, quantity)

For each selected column, the following information is displayed:

Potential Issues: Warnings such as High cardinality – may be an identifier, High missing rate, or Constant or near-constant column.
Unique Values: The count and percentage of distinct values.
Missing: The count and percentage of null or missing entries.
Data Type: The underlying Pandas data type (e.g., int64, object, float64).
Statistics: For numeric columns: min, max, mean, median, standard deviation, skewness, and kurtosis.
Distribution: A histogram for numeric columns, or a bar chart of top values for categorical columns showing the count and percentage for each category.
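
To make the numeric statistics concrete, here is a small sketch that computes the same fields with the Python standard library. The function name `profile_numeric` is hypothetical, and the sketch assumes population moments for standard deviation, skewness, and excess kurtosis; the tool may use sample-corrected estimators, so its numbers can differ slightly.

```python
import statistics


def profile_numeric(values):
    """Summarize a numeric column; None marks a missing entry."""
    present = [v for v in values if v is not None]
    n = len(present)
    mean = statistics.fmean(present)
    std = statistics.pstdev(present)
    # Third and fourth central moments for skewness and kurtosis.
    m3 = sum((v - mean) ** 3 for v in present) / n
    m4 = sum((v - mean) ** 4 for v in present) / n
    missing = len(values) - n
    return {
        "unique": len(set(present)),
        "missing": missing,
        "missing_pct": 100 * missing / len(values),
        "min": min(present),
        "max": max(present),
        "mean": mean,
        "median": statistics.median(present),
        "std": std,
        "skewness": m3 / std**3 if std else 0.0,
        "kurtosis": m4 / std**4 - 3 if std else 0.0,  # excess kurtosis
    }


profile = profile_numeric([1, 2, 3, 4, 5, None])
print(profile["mean"], profile["median"], profile["missing"])
```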

Column Analysis page showing PumpId info

Column Analysis page showing PumpSize info

Correlations

The Correlations tab displays a correlation matrix heatmap for all numeric columns in the dataset. Correlation values range from −1 (strong negative correlation) to +1 (strong positive correlation), with color coding to highlight the strength and direction of each relationship.

Use the Min correlation slider to filter out weak correlations and focus on the strongest relationships. This view is helpful for identifying multicollinearity between features and understanding which variables are most associated with your target column.
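
As a rough illustration of what the heatmap and slider convey, the sketch below computes Pearson correlations between numeric columns and keeps only pairs at or above a minimum strength. It assumes the slider filters on the absolute value of the coefficient; the function names and sample data are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # constant column: correlation is undefined, report 0
    return cov / (spread_x * spread_y)


def strong_pairs(columns, min_correlation):
    """Return (name_a, name_b, r) for pairs with |r| >= min_correlation."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= min_correlation:
            pairs.append((name_a, name_b, r))
    return pairs


columns = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [1, -1, 1, -1]}
print(strong_pairs(columns, 0.8))
```

Two features that appear together in a strong pair are candidates for a multicollinearity review.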

Data Quality

The Data Quality tab gives you an at-a-glance assessment of your dataset’s readiness for model training. It includes the following sections:

Missing Values by Column: A horizontal bar chart showing the number and percentage of missing values for each column that has any. Columns are color-coded by severity: green for less than 5% missing, orange for 5–20%, yellow for 20–50%, and red for more than 50%.
Missing Value Pattern: A compact visual representation of where missing values occur across columns, helping you identify whether missingness is concentrated or scattered.
Warnings: Issues that could affect training quality. For example, Small dataset – consider using best_quality preset for better results.
Recommendations: Actionable suggestions to improve model performance. These are tagged by priority (high or medium) and category (Target, Preprocessing). Examples include:
- Potential binary classification targets (e.g., Failed, PumpSize).
- Potential multiclass classification targets (e.g., CriticalityLevel, ConnectedUnits, BackupSystems).
- Consider dropping or imputing columns with more than 30% missing values (e.g., WellSector).
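
The severity bands and the 30% recommendation threshold described above can be expressed as a short sketch. The function names are hypothetical, and how the tool treats values exactly on a boundary (5%, 20%, 50%) is an assumption here.

```python
def missing_severity(missing_pct):
    """Map a column's missing percentage to the color band described above."""
    if missing_pct < 5:
        return "green"
    if missing_pct <= 20:  # boundary handling is an assumption
        return "orange"
    if missing_pct <= 50:
        return "yellow"
    return "red"


def needs_drop_or_impute(missing_pct):
    """Mirror the recommendation for columns with more than 30% missing values."""
    return missing_pct > 30


# Percentages taken from the examples in this guide.
print(missing_severity(19.9), needs_drop_or_impute(19.9))  # OperatingYears
print(missing_severity(77), needs_drop_or_impute(77))      # WellSector
```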

Transformations

The Transformations tab lets you define preprocessing steps that will be applied to the dataset before model training. AutoML analyzes your data and suggests recommended transformations based on detected issues.

Recommended transformations

AutoML may recommend transformations for columns with the following issues:

Identifier columns: Columns where all values are unique and are likely row identifiers (e.g., PumpId). Recommended action: drop the column.
High cardinality: Columns that may be identifiers due to a very large number of unique values (e.g., ManufacturerModel). Recommended action: review and potentially drop.
Missing values: Columns with a significant percentage of null values (e.g., OperatingYears at 19.9% missing). Recommended action: fill missing values.
Extreme outliers: Numeric columns with extreme values detected (e.g., FlowRatePSI with 6.9% outliers). Recommended action: clip or remove outliers.
High missing rate: Columns with a very high percentage of missing data (e.g., WellSector at 77% missing). Recommended action: drop the column.

Click Add next to any recommended transformation to include it. You can also create custom transformations by selecting a column and a transformation type (e.g., Fill Missing Values) from the dropdown controls at the bottom of the page.

All selected transformations appear in the Selected Transformations list. Transformations are included in the exported notebook and can be reviewed and modified in code.
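
For readers who want to see what such transformations amount to in code, here is a minimal standard-library sketch of three of them: dropping a column, filling missing values, and clipping outliers. It is not the content of the exported notebook. The helper names, the median fill strategy, the clip bounds, and the sample values are assumptions; only the column names come from the examples above.

```python
import statistics


def drop_columns(table, names):
    """Return a copy of the table without the named columns."""
    return {col: vals for col, vals in table.items() if col not in names}


def fill_missing(values):
    """Replace None entries with the median of the present values."""
    present = [v for v in values if v is not None]
    fill = statistics.median(present)
    return [fill if v is None else v for v in values]


def clip(values, lower, upper):
    """Limit every value to the closed interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


# Hypothetical sample rows stored column-wise.
table = {
    "PumpId": ["P1", "P2", "P3", "P4"],
    "OperatingYears": [2, None, 6, 4],
    "FlowRatePSI": [110, 120, 950, 115],
}
table = drop_columns(table, {"PumpId"})
table["OperatingYears"] = fill_missing(table["OperatingYears"])
table["FlowRatePSI"] = clip(table["FlowRatePSI"], 100, 200)
print(table)
```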

Export a Notebook

At any point during data exploration, you can click the Export Notebook button in the top-right corner to download a Jupyter notebook that contains all of your data profiling results and any transformations you have selected.

This notebook can be opened in a Domino Workspace for further analysis or custom preprocessing.

AutoML model training

AutoML model training uses AutoGluon to automatically train, tune, and rank multiple machine learning models. The training job wizard guides you through four steps: selecting your data source, choosing a model type, configuring training parameters, and reviewing your settings before launch.

To begin, click New training job from the AutoML landing page.

Step 1: Select a data source

Select the dataset you want to use for training. As with Data Exploration, you can either upload a CSV or Parquet file (up to 550 MB) or select a file from a mounted Domino Dataset.

Once your file is loaded, click Continue to proceed.

Select a data source for a new training job

Step 2: Select a model type

Choose the AutoGluon predictor that best fits your use case. Three model types are available:

Model type	Description	Example use cases
Tabular	For structured data with rows and columns. Supports classification and regression tasks.	Classification and regression, equipment failure prediction, well log analysis.
Time Series	For forecasting sequential data where observations are ordered over time.	Production forecasting, demand prediction, anomaly detection.
Multimodal	For mixed data types that combine images, text, and tabular data.	Seismic interpretation, document analysis, image + metadata.

Problem Type (Optional)

AutoGluon can auto-detect the problem type from your target column, but you can also specify it explicitly:

Problem type	Description
Binary Classification	Predict one of two classes (e.g., `Failed: 0 or 1`).
Multiclass Classification	Predict one of multiple classes (e.g., `CriticalityLevel: 1, 2, or 3`).
Regression	Predict a continuous value (e.g., `FlowRatePSI`).
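
To give a feel for how a problem type can be inferred from a target column, here is a deliberately simplified heuristic. It is not AutoGluon's detection logic, which considers more signals; the function name and the `max_classes` cutoff are assumptions made for illustration.

```python
def infer_problem_type(target_values, max_classes=20):
    """Guess a problem type from target values (simplified heuristic)."""
    present = [v for v in target_values if v is not None]
    distinct = set(present)
    if len(distinct) == 2:
        return "binary"
    all_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if not all_numeric:
        return "multiclass"
    # A small set of whole-number labels is treated as class IDs.
    all_integral = all(float(v).is_integer() for v in present)
    if all_integral and len(distinct) <= max_classes:
        return "multiclass"
    return "regression"


print(infer_problem_type([0, 1, 1, 0]))           # e.g., Failed
print(infer_problem_type([1, 2, 3, 1, 2]))        # e.g., CriticalityLevel
print(infer_problem_type([101.5, 98.2, 120.7]))   # e.g., FlowRatePSI
```

When a heuristic like this could misread your target (for example, a regression target with only a few integer values), specifying the problem type explicitly avoids the ambiguity.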

Click Continue to proceed to configuration.

Select a model type for a new training job

Step 3: Configure Training

The configuration step lets you define your training job’s name, target column, and AutoGluon-specific settings.

Basic configuration

Setting	Description
Job Name	A descriptive name for this training run (e.g., `My training job`).
Description (optional)	A free-text description to help you identify this job later.
Target Column	The column in your dataset that the model should predict. Select it from the dropdown list of available columns.

AutoGluon settings

Setting	Description
Preset	Controls the trade-off between model quality and training speed. Options include `Medium Quality (Faster)`, `Best Quality`, and others. `Medium Quality (Faster)` is the default.
Time Limit (seconds)	Maximum wall-clock time (in seconds) for the entire training run. Default: `3600 (1 hour)`. AutoGluon will train as many models as possible within this limit.
Evaluation Metric (optional)	The metric used to rank models on the leaderboard. Set to `Auto-detect` by default, which chooses an appropriate metric based on the problem type (e.g., `accuracy` for classification, `RMSE` for regression).
Experiment Name (optional)	An optional Domino Experiment name for tracking this run. If left blank, a name is auto-generated.
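
If you later reproduce a run in code, these settings correspond roughly to arguments of AutoGluon's `TabularPredictor`. The sketch below only builds the argument dictionaries so that it runs without AutoGluon installed. The helper name `build_training_kwargs` and the mapping from the UI preset labels to AutoGluon preset names are assumptions, not documented behavior of the extension.

```python
# Assumed mapping from UI labels to AutoGluon preset names.
PRESETS = {
    "Medium Quality (Faster)": "medium_quality",
    "Best Quality": "best_quality",
}


def build_training_kwargs(
    target,
    preset="Medium Quality (Faster)",
    time_limit=3600,
    eval_metric=None,
    problem_type=None,
):
    """Translate basic settings into TabularPredictor and fit() arguments."""
    if time_limit <= 0:
        raise ValueError("time_limit must be a positive number of seconds")
    predictor_kwargs = {"label": target}
    if eval_metric is not None:      # None leaves metric selection to AutoGluon
        predictor_kwargs["eval_metric"] = eval_metric
    if problem_type is not None:     # None leaves problem type to auto-detection
        predictor_kwargs["problem_type"] = problem_type
    fit_kwargs = {"presets": PRESETS[preset], "time_limit": time_limit}
    return predictor_kwargs, fit_kwargs


predictor_kwargs, fit_kwargs = build_training_kwargs("Failed", problem_type="binary")
print(predictor_kwargs, fit_kwargs)

# With AutoGluon installed and a pandas DataFrame `train_df` loaded:
# from autogluon.tabular import TabularPredictor
# predictor = TabularPredictor(**predictor_kwargs).fit(train_df, **fit_kwargs)
# print(predictor.leaderboard())
```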

Configure the training of a new training job

Advanced Configuration

Click Advanced Configuration to access fine-grained controls organized across multiple tabs. Available tabs vary based on the selected model type.

Tabs available for all model types

Resources

Configure compute resources allocated to the training job.

Setting	Description
Number of GPUs	Set to `0` for CPU-only training. Increase for GPU-accelerated models.
Number of CPUs	Leave empty for automatic detection based on available hardware.
Verbosity Level	Controls logging detail: `0` (Silent), `1` (Errors only), `2` (Normal), `3` (Detailed), `4` (Debug). Default: `2`.
Cache Data	Cache data in memory for faster training. Enabled by default.

Tabs available for tabular models only

Models

Select or exclude specific model families from the training run.

Setting	Description
Excluded Model Types	Click to exclude model types from training (highlighted in red). Available models: LightGBM, CatBoost, XGBoost, Random Forest, Extra Trees, K-Nearest Neighbors, Linear Regression, Neural Network (PyTorch), Neural Network (FastAI).
Bagging Folds	Number of folds for bagging (2–10). Set to `Auto` by default.
Stack Levels	Number of stacking levels (0–3). Higher values increase ensemble complexity.
Auto Stack	Automatically determine optimal stacking configuration.

Training

Fine-tune the training process.

Setting	Description
Holdout Fraction	Fraction of data reserved for validation (0.01–0.5). Typically 0.1–0.2. Set to `Auto` by default.
Inference Time Limit (s/row)	Maximum inference time per row in seconds. Leave blank for no limit.
Calibrate Probabilities	Calibrate predicted probabilities for better reliability. Useful when probability outputs will be used for decision-making.
Refit on Full Data	After training, refit the best models on the full dataset (training + validation) for maximum performance.
Use Bag Holdout	Use a separate holdout for bagged models, which can improve ensemble quality.
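
Several of the Models and Training settings resemble keyword arguments of AutoGluon's `TabularPredictor.fit()`. The sketch below validates the ranges listed in this guide and collects the values into a dictionary. The helper name and the one-to-one mapping from UI fields to these arguments are assumptions; leaving a value as `None` stands in for the UI's `Auto` setting.

```python
def build_advanced_fit_kwargs(
    excluded_model_types=(),
    bagging_folds=None,
    stack_levels=None,
    holdout_fraction=None,
    refit_full=False,
    calibrate=False,
):
    """Collect optional fit() arguments, enforcing the ranges shown in the UI."""
    kwargs = {}
    if excluded_model_types:
        kwargs["excluded_model_types"] = list(excluded_model_types)
    if bagging_folds is not None:
        if not 2 <= bagging_folds <= 10:
            raise ValueError("bagging_folds must be between 2 and 10")
        kwargs["num_bag_folds"] = bagging_folds
    if stack_levels is not None:
        if not 0 <= stack_levels <= 3:
            raise ValueError("stack_levels must be between 0 and 3")
        kwargs["num_stack_levels"] = stack_levels
    if holdout_fraction is not None:
        if not 0.01 <= holdout_fraction <= 0.5:
            raise ValueError("holdout_fraction must be between 0.01 and 0.5")
        kwargs["holdout_frac"] = holdout_fraction
    if refit_full:
        kwargs["refit_full"] = True
    if calibrate:
        kwargs["calibrate"] = True
    return kwargs


# "KNN" is AutoGluon's key for K-Nearest Neighbors.
print(build_advanced_fit_kwargs(excluded_model_types=["KNN"], bagging_folds=5, stack_levels=1))
```

The resulting dictionary could be passed as `predictor.fit(train_df, **kwargs)` alongside the preset and time limit.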

Hyperparameter Optimization (HPO)

Enable and configure hyperparameter tuning. When enabled, AutoGluon searches over hyperparameter combinations for each model family.

Setting	Description
Enable HPO	Toggle hyperparameter optimization on/off.
HPO Scheduler	Local (single machine) or Ray (distributed across multiple workers).
Search Algorithm	Auto (recommended), Random Search, Bayesian Optimization, or Grid Search.
Number of Trials	Number of hyperparameter configurations to try (1–100). More trials generally improve results but lengthen training. Default: `10`.
Max Iterations per Trial	Maximum training iterations for each trial. Leave blank for auto.
Grace Period	Minimum iterations before early stopping can occur (ASHA scheduler).
Reduction Factor	Factor by which to reduce the number of trials at each rung. Typically `3`.
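
The interaction between Grace Period, Reduction Factor, and Max Iterations per Trial is easier to see with numbers. The sketch below models a synchronous successive-halving schedule, which is a simplification of ASHA (ASHA promotes trials asynchronously). The function name and the example values are assumptions.

```python
def rung_schedule(num_trials, grace_period, reduction_factor, max_iterations):
    """Return (iterations, surviving_trials) for each rung of successive halving."""
    if grace_period < 1 or reduction_factor < 2:
        raise ValueError("grace_period must be >= 1 and reduction_factor >= 2")
    schedule = []
    iterations, survivors = grace_period, num_trials
    while iterations <= max_iterations and survivors >= 1:
        schedule.append((iterations, survivors))
        iterations *= reduction_factor   # survivors train longer at the next rung
        survivors //= reduction_factor   # only the top 1/reduction_factor advance
    return schedule


for iterations, survivors in rung_schedule(27, 1, 3, 27):
    print(f"after {iterations} iterations: {survivors} trials still running")
```

With 27 trials, a grace period of 1, and a reduction factor of 3, every trial runs for at least 1 iteration, and only a single trial reaches the full 27 iterations.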

Per-Model Hyperparameters: Override default hyperparameters for specific model types (LightGBM, XGBoost, CatBoost, Neural Network) using JSON format.
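
As an illustration of what a per-model override might look like, the sketch below parses a JSON string and converts it into the `hyperparameters` dictionary that AutoGluon's `TabularPredictor.fit()` accepts, together with a `hyperparameter_tune_kwargs` dictionary. The JSON shape shown here, the helper names, and the mapping from UI model names to AutoGluon model keys are assumptions; check the field's placeholder text in the UI for the exact format it expects.

```python
import json

# Assumed mapping from UI model names to AutoGluon model keys.
MODEL_KEYS = {
    "LightGBM": "GBM",
    "XGBoost": "XGB",
    "CatBoost": "CAT",
    "Neural Network": "NN_TORCH",
}


def parse_overrides(overrides_json):
    """Convert a JSON override string into an AutoGluon-style hyperparameters dict."""
    raw = json.loads(overrides_json)
    unknown = set(raw) - set(MODEL_KEYS)
    if unknown:
        raise ValueError(f"unsupported model types: {sorted(unknown)}")
    return {MODEL_KEYS[name]: params for name, params in raw.items()}


def build_hpo_kwargs(num_trials=10, scheduler="local", searcher="auto"):
    """Build hyperparameter_tune_kwargs, enforcing the 1-100 trial range."""
    if not 1 <= num_trials <= 100:
        raise ValueError("num_trials must be between 1 and 100")
    return {"num_trials": num_trials, "scheduler": scheduler, "searcher": searcher}


hyperparameters = parse_overrides('{"LightGBM": {"learning_rate": 0.05, "num_leaves": 31}}')
print(hyperparameters)
print(build_hpo_kwargs())
```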

Threshold

For binary classification tasks, optimize the decision threshold.

Setting	Description
Enable Threshold Calibration	Find optimal decision threshold instead of using the default `0.5`.
Optimization Metric	Metric to optimize: `Balanced Accuracy` (default), `F1 Score`, `Precision`, `Recall`, or `Matthews Correlation Coefficient`.
Thresholds to Try	Number of threshold values to evaluate (10–1000). Default: `100`.
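
Conceptually, threshold calibration evaluates a set of candidate thresholds on validation predictions and keeps the one that maximizes the chosen metric. The sketch below does this for balanced accuracy. It is an illustration, not AutoGluon's implementation: the function names, the evenly spaced candidate grid, and the sample data are assumptions, and it expects both classes to be present in `y_true`.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of the true positive rate and the true negative rate."""
    positives = sum(1 for t in y_true if t == 1)
    negatives = len(y_true) - positives
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    true_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return (true_pos / positives + true_neg / negatives) / 2


def calibrate_threshold(y_true, probabilities, num_thresholds=100):
    """Return (threshold, score), starting from the default threshold of 0.5."""
    best_threshold = 0.5
    best_score = balanced_accuracy(y_true, [int(p >= 0.5) for p in probabilities])
    for k in range(1, num_thresholds + 1):
        threshold = k / (num_thresholds + 1)  # evenly spaced in (0, 1)
        predictions = [int(p >= threshold) for p in probabilities]
        score = balanced_accuracy(y_true, predictions)
        if score > best_score:  # keep the first threshold that improves the score
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical validation labels and predicted probabilities of class 1.
threshold, score = calibrate_threshold([0, 0, 1, 1], [0.1, 0.2, 0.3, 0.9])
print(round(threshold, 3), score)
```

In this example the default threshold of 0.5 misclassifies the positive row scored at 0.3, while a lower threshold separates the classes.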

Imbalance

Configure how AutoGluon handles class imbalance.

Setting	Description
Imbalance Strategy	None (default), Oversample (duplicate minority class), Undersample (reduce majority class), SMOTE (synthetic minority oversampling), or Focal Loss (down-weight easy examples).
Sample Weight Column	Name of a column containing sample weights for weighted training.
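
To show what the Oversample strategy means in practice, the sketch below duplicates randomly chosen minority-class rows until every class matches the size of the largest one. The function name, the row format, and the sample data are assumptions; how the extension implements oversampling internally is not documented here.

```python
import random
from collections import Counter


def oversample(rows, label_key, seed=0):
    """Duplicate minority-class rows until all classes have equal counts."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Sample with replacement to make up the shortfall.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Hypothetical imbalanced data: 8 healthy pumps, 2 failed pumps.
rows = [{"Failed": 0}] * 8 + [{"Failed": 1}] * 2
balanced = oversample(rows, "Failed")
print(Counter(row["Failed"] for row in balanced))
```

Oversampling should be applied to training data only; duplicating rows before a train/validation split leaks copies into the validation set.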

Foundation

Options for foundation model-based approaches.

Setting	Description
Use Tabular Foundation Models	Include `TabPFN` and other foundation models in training.
Foundation Model Preset	None (use with other models), Zero-shot (instant predictions without training), or Zero-shot + HPO (optimize foundation model parameters).
Dynamic Stacking	Use dynamic stacking for adaptive ensemble configurations.
Pseudo-Labeling	Enable semi-supervised learning with unlabeled data. Requires specifying an unlabeled data path.
Drop Unique Features	Automatically drop high-cardinality unique features (like IDs).

Advanced

Additional low-level AutoGluon parameters for experienced users.

Setting	Description
Enable Distillation	Transfer knowledge from the ensemble to a single faster model for deployment.
Distillation Time Limit	Time allocated for distillation (seconds). Leave blank for auto.
Include Only Specific Models	Whitelist specific model types to include (overrides excluded models). Selected models appear in green.
Bagging Sets	Number of complete bagging sets (increases diversity).
Use Refit as Best	Use the refitted model as the final predictor.

Tabs available for time series models only

Time Series

Configure time series-specific settings.

Setting	Description
Frequency	Data frequency: `Auto-detect`, `Daily`, `Weekly`, `Monthly`, `Hourly`, `Minutely`, `Quarterly`, or `Yearly`.
Target Scaler	Scaling method for target values: `Default`, `Mean Absolute`, `Standard`, `Min-Max`, or `No Scaling`.
Use Chronos	Enable Amazon’s Chronos foundation model for time series forecasting.
Chronos Model Size	Model size when Chronos is enabled: `Tiny` (8M params), `Mini` (20M), `Small` (46M), `Base` (200M), or `Large` (710M). Larger models are more accurate but require more memory.
Enable Ensemble	Combine multiple time series models into an ensemble. Enabled by default.

Tabs available for multimodal models only

Multimodal

Configure multimodal-specific settings for combined image, text, and tabular data.

Setting	Description
Text Backbone	Pre-trained text model (e.g., `google/electra-base-discriminator`).
Image Backbone	Pre-trained image model (e.g., `swin_base_patch4_window7_224`).
Max Text Length	Maximum token length for text inputs (32–2048). Default: `512`.
Image Size	Input image size in pixels (32–512). Default: `224`.
Batch Size	Training batch size. Leave blank for auto.
Max Epochs	Maximum training epochs. Leave blank for auto.
Learning Rate	Model learning rate. Leave blank for auto.
Fusion Method	How to combine modalities: `Late Fusion` (combine at prediction) or `Early Fusion` (combine at feature level).

Summary of tab availability by model type:

Tab	Tabular	Time Series	Multimodal
Resources	✓	✓	✓
Models	✓	-	-
Training	✓	-	-
HPO	✓	-	-
Threshold	✓	-	-
Imbalance	✓	-	-
Foundation	✓	-	-
Advanced	✓	-	-
Time Series	-	✓	-
Multimodal	-	-	✓

Advanced configuration of the model training

Step 4: Review and launch

Review all your selected settings on the summary screen. If everything looks correct, click Start Training to launch the job.

You will be taken to the training run’s overview page where you can monitor progress in real time.

Training results

Once a training job is launched, its results page provides comprehensive information about the run’s progress and outputs.

The results page is organized into several tabs: Overview, Leaderboard, Diagnostics, Metrics, Outputs, and Logs.

The Overview tab displays the training job’s metadata and real-time progress. While the job is running, a progress bar shows the estimated completion percentage and a Cancel button is available to stop the run.

The metadata table includes:

Field	Description
Run ID	A unique identifier for this training run.
Model Type	The predictor type (e.g., `Tabular`).
Problem Type	The detected or specified problem type (e.g., `Binary`).
Target Column	The column being predicted (e.g., `Failed`).
Preset	The quality preset used (e.g., `Medium Quality Faster Train`).
Time Limit	The configured time limit in seconds (e.g., `3600s`).
Created	The date and time the job was launched.
Duration	Total elapsed time for the training run.
Status	Current status: `Running`, `Completed`, `Failed`, or `Cancelled`.

Deploy a model

After training is complete, you can deploy the best model directly from the AutoML interface.

Export a Deployment Package

Navigate to the Outputs tab of your completed training run.
Verify the output directory path under Deployment Package.
(Optional) Check Optimize for inference to reduce model size and improve prediction latency.
Click Create Package.

The generated package is saved to the specified directory in your Domino project’s dataset storage. It includes the model artifacts, inference script, requirements file, and Dockerfile.

Register a Model

To register your model in Domino’s Model Registry for versioning, governance, and deployment:

Click the Register button in the top-right corner of the training results page.
In the Deploy to Model Registry dialog, enter a Model Name.
Select the Model Type (e.g., Tabular).
Optionally add a description to document the model’s purpose and training context.
Click Register to save the model to the registry.

Once registered, the model appears in the Models section of your Domino project and can be deployed as an API endpoint, used in batch scoring, or shared across teams.

Best practices

Review data quality before training. Use the Data Exploration tool to inspect missing values, outliers, and column types. Address high-priority recommendations (especially dropping or imputing columns with more than 30% missing data) before starting a training job.
Choose the right preset. The Medium Quality (Faster) preset is a good starting point for rapid iteration. Once you have identified a promising dataset and target, switch to Best Quality for production models, especially on smaller datasets where the additional training time yields meaningful improvements.
Set an appropriate time limit. A longer time limit allows AutoGluon to train more models and explore more hyperparameter configurations. For initial exploration, 600–1800 seconds is often sufficient. For production runs, consider 3600 seconds or more.
Exclude identifier columns. Columns with all unique values (flagged as identifiers) do not contribute meaningful signal and should be dropped before training. AutoML will flag these in both Column Analysis and Recommended Transformations.
Examine feature importance. After training, review the Feature Importance chart on the Diagnostics tab. If many features show little or no importance, consider removing them and retraining to reduce noise and improve generalization.
Consider the prediction time trade-off. Ensemble models (e.g., WeightedEnsemble_L2) typically achieve the best validation scores but may have higher inference latency. For real-time applications, compare leaderboard scores with prediction times and consider selecting a simpler model that meets your latency requirements.
Use the exported notebook for reproducibility. Download the Training Notebook from the Outputs tab to preserve a complete record of your training configuration. This notebook can be re-run in a Domino Workspace and serves as the starting point for any custom modifications.
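
The prediction-time trade-off described above can be sketched as a simple selection rule: among the models that meet a latency budget, pick the one with the best validation score. The column names `model`, `score_val`, and `pred_time_val` follow AutoGluon's leaderboard output; the function name, the scores, and the timings below are made up for illustration.

```python
def pick_model(leaderboard, max_pred_time):
    """Return the best-scoring model whose prediction time fits the budget."""
    eligible = [row for row in leaderboard if row["pred_time_val"] <= max_pred_time]
    if not eligible:
        return None  # no model meets the latency requirement
    return max(eligible, key=lambda row: row["score_val"])["model"]


# Hypothetical leaderboard rows (higher score_val is better, times in seconds).
leaderboard = [
    {"model": "WeightedEnsemble_L2", "score_val": 0.93, "pred_time_val": 0.80},
    {"model": "LightGBM", "score_val": 0.91, "pred_time_val": 0.05},
    {"model": "KNeighbors", "score_val": 0.84, "pred_time_val": 0.02},
]
print(pick_model(leaderboard, max_pred_time=0.1))
print(pick_model(leaderboard, max_pred_time=1.0))
```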
