Your agentic system needs access to one or more LLM endpoints. Before you start developing, decide how your agent will reach the models it depends on.
Domino supports two approaches:
- Connecting to an external LLM provider.
- Hosting a model inside Domino.
Most teams already have access to a managed LLM service. Your agent code calls the provider’s API directly. Domino runs the agent, and the LLM inference happens outside Domino.
Common providers include:
| Provider | SDK / package | Typical environment variable |
|---|---|---|
| OpenAI | `openai` | `OPENAI_API_KEY` |
| Anthropic | `anthropic` | `ANTHROPIC_API_KEY` |
| AWS Bedrock | `boto3` | `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` |
| Azure OpenAI | `openai` | `AZURE_OPENAI_API_KEY` |
| Google Vertex AI | `google-cloud-aiplatform` | Service account credentials (`GOOGLE_APPLICATION_CREDENTIALS`) |
How to configure access in Domino
- Install the provider's SDK in your Domino environment. Add the package to your environment's Dockerfile instructions (for example, `RUN pip install openai`) or install it in your workspace.
- Store API keys as Domino environment variables. In your project settings, add environment variables for your credentials. These are injected into workspaces and Jobs automatically. Your code reads them at runtime via `os.environ`. This keeps secrets out of your code and version control.
- Use the provider's SDK in your agent code the same way you would locally. The provider handles model hosting, scaling, and versioning.
```python
import os

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

# API key is read from a Domino environment variable
model = OpenAIChatModel(
    "gpt-5.4-mini",
    provider=OpenAIProvider(api_key=os.environ["OPENAI_API_KEY"]),
)
agent = Agent(model)

result = agent.run_sync("Hello")
print(result.output)
```

If you need to run your own model for data residency, cost control, or to use a fine-tuned model, Domino can host it as a Model API. Your agent code calls the Domino-hosted endpoint the same way it would call any other API.
Register and deploy LLMs has the full setup guide, including:
- Registering a model from Hugging Face or a custom checkpoint.
- Deploying it as a Domino Model API endpoint.
- Configuring GPU hardware and scaling.
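As a sketch, calling a Domino-hosted endpoint is plain HTTP. The endpoint URL, environment variable names, payload fields, and auth scheme below are illustrative assumptions; copy the real URL and access token from your Model API's overview page in Domino.

```python
import json
import os
import urllib.request

# Placeholders -- substitute the values shown on your Model API's overview page.
ENDPOINT = os.environ.get(
    "DOMINO_MODEL_ENDPOINT",
    "https://domino.example.com/models/abc123/latest/model",
)
API_TOKEN = os.environ.get("DOMINO_MODEL_API_TOKEN", "")


def build_payload(prompt: str, max_tokens: int = 256) -> dict:
    # Assumed request shape: inputs wrapped under a top-level "data" key.
    return {"data": {"prompt": prompt, "max_tokens": max_tokens}}


def generate(prompt: str) -> str:
    """POST a prompt to the hosted model and return its text output."""
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            # Auth scheme shown is illustrative; check your endpoint's
            # overview page for the exact header it expects.
            "Authorization": f"Bearer {API_TOKEN}",
        },
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["result"]
```

Because the call is ordinary HTTP, the same agent code works whether the endpoint serves a Hugging Face model or a custom fine-tuned checkpoint.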
Many production agents use both an external frontier model for primary reasoning and a Domino-hosted model for specialized tasks such as a fine-tuned classifier or embedding model.
The @add_tracing instrumentation in the develop step captures traces from all LLM calls regardless of where the model is hosted.
> **Tip**
> You can swap models during experimentation. In Domino, model switching is a config change, not an infrastructure change. You can update a YAML file to move between external and Domino-hosted models, making it easy to evaluate cost, reliability, and performance as you iterate.
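One way to realize this: keep the model choice in a small YAML file and resolve it at startup. The config keys, environment variable names, and helper functions below are illustrative assumptions, not a Domino-defined schema.

```python
# Illustrative model_config.yaml (keys are an assumption -- adapt to your project):
#
#   provider: domino              # "openai" or "domino"
#   model: my-finetuned-llama
#   endpoint: https://domino.example.com/models/abc123/latest/model


def load_config(path: str = "model_config.yaml") -> dict:
    import yaml  # PyYAML; install it in your environment alongside the provider SDK

    with open(path) as f:
        return yaml.safe_load(f)


def select_model(config: dict) -> dict:
    """Map the config dict to client settings, so swapping models is a file edit."""
    if config["provider"] == "openai":
        return {
            "model": config["model"],
            "api_key_env": "OPENAI_API_KEY",
            "base_url": None,  # use the SDK's default endpoint
        }
    if config["provider"] == "domino":
        return {
            "model": config["model"],
            "api_key_env": "DOMINO_MODEL_API_TOKEN",
            "base_url": config["endpoint"],  # Domino-hosted Model API URL
        }
    raise ValueError(f"unknown provider: {config['provider']!r}")
```

Editing the YAML file then switches the agent between an external frontier model and a Domino-hosted one without touching agent code or infrastructure.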
- Develop agentic systems: Instrument your agent code with tracing and evaluation.
- Register and deploy LLMs: Host your own model in Domino.
