Your agentic system needs access to one or more LLM endpoints. Before you start developing, decide how your agent will reach the models it depends on.
Domino supports two approaches:
- Connecting to an external LLM provider.
- Hosting a model inside Domino.
Most teams already have access to a managed LLM service. Your agent code calls the provider’s API directly. Domino runs the agent, and the LLM inference happens outside Domino.
Common providers include:
| Provider | SDK / package | Typical environment variable |
|---|---|---|
| OpenAI | `openai` | `OPENAI_API_KEY` |
| Anthropic | `anthropic` | `ANTHROPIC_API_KEY` |
| AWS Bedrock | `boto3` | `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` |
| Azure OpenAI | `openai` | `AZURE_OPENAI_API_KEY` |
| Google Vertex AI | `google-cloud-aiplatform` | Service account credentials (`GOOGLE_APPLICATION_CREDENTIALS`) |
How to configure access in Domino
- Install the provider's SDK in your Domino environment. Add the package to your environment's Dockerfile instructions (for example, `RUN pip install openai`) or install it in your workspace.
- Store API keys as Domino environment variables. In your project settings, add environment variables for your credentials. These are injected into workspaces and Jobs automatically. Your code reads them at runtime via `os.environ`. This keeps secrets out of your code and version control.
- Use the provider's SDK in your agent code the same way you would locally. The provider handles model hosting, scaling, and versioning.
```python
import os

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

# API key is read from a Domino environment variable
model = OpenAIChatModel(
    "gpt-5.4-mini",
    provider=OpenAIProvider(api_key=os.environ["OPENAI_API_KEY"]),
)
agent = Agent(model)

result = agent.run_sync("Hello")
print(result.output)
```

If you need to run your own model for data residency, cost control, or to use a fine-tuned model, Domino can host it as a Model API. Your agent code calls the Domino-hosted endpoint the same way it would call any other API.
Register and deploy LLMs has the full setup guide, including:
- Registering a model from Hugging Face or a custom checkpoint.
- Deploying it as a Domino Model API endpoint.
- Configuring GPU hardware and scaling.
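As a sketch, calling a Domino-hosted endpoint is plain HTTP. The endpoint URL, environment variable names, payload fields, and auth scheme below are illustrative assumptions; copy the real URL and access token from your Model API's overview page in Domino.

```python
import json
import os
import urllib.request

# Placeholders -- substitute the values shown on your Model API's overview page.
ENDPOINT = os.environ.get(
    "DOMINO_MODEL_ENDPOINT",
    "https://domino.example.com/models/abc123/latest/model",
)
API_TOKEN = os.environ.get("DOMINO_MODEL_API_TOKEN", "")


def build_payload(prompt: str, max_tokens: int = 256) -> dict:
    # Assumed request shape: inputs wrapped under a top-level "data" key.
    return {"data": {"prompt": prompt, "max_tokens": max_tokens}}


def generate(prompt: str) -> str:
    """POST a prompt to the hosted model and return its text output."""
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            # Auth scheme shown is illustrative; check your endpoint's
            # overview page for the exact header it expects.
            "Authorization": f"Bearer {API_TOKEN}",
        },
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["result"]
```

Because the call is ordinary HTTP, the same agent code works whether the endpoint serves a Hugging Face model or a custom fine-tuned checkpoint.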
Many production agents use both an external frontier model for primary reasoning and a Domino-hosted model for specialized tasks such as a fine-tuned classifier or embedding model.
The @add_tracing instrumentation in the develop step captures traces from all LLM calls regardless of where the model is hosted.
> **Tip**
> You can swap models during experimentation. In Domino, model switching is a config change, not an infrastructure change. You can update a YAML file to move between external and Domino-hosted models, making it easy to evaluate cost, reliability, and performance as you iterate.
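One way to realize this: keep the model choice in a small YAML file and resolve it at startup. The config keys, environment variable names, and helper functions below are illustrative assumptions, not a Domino-defined schema.

```python
# Illustrative model_config.yaml (keys are an assumption -- adapt to your project):
#
#   provider: domino              # "openai" or "domino"
#   model: my-finetuned-llama
#   endpoint: https://domino.example.com/models/abc123/latest/model


def load_config(path: str = "model_config.yaml") -> dict:
    import yaml  # PyYAML; install it in your environment alongside the provider SDK

    with open(path) as f:
        return yaml.safe_load(f)


def select_model(config: dict) -> dict:
    """Map the config dict to client settings, so swapping models is a file edit."""
    if config["provider"] == "openai":
        return {
            "model": config["model"],
            "api_key_env": "OPENAI_API_KEY",
            "base_url": None,  # use the SDK's default endpoint
        }
    if config["provider"] == "domino":
        return {
            "model": config["model"],
            "api_key_env": "DOMINO_MODEL_API_TOKEN",
            "base_url": config["endpoint"],  # Domino-hosted Model API URL
        }
    raise ValueError(f"unknown provider: {config['provider']!r}")
```

Editing the YAML file then switches the agent between an external frontier model and a Domino-hosted one without touching agent code or infrastructure.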
- Develop agentic systems: Instrument your agent code with tracing and evaluation.
- Register and deploy LLMs: Host your own model in Domino.
