This guide walks you through building and deploying a chatbot powered by large language models (LLMs) using the Streamlit framework in Domino.
The app combines a couple of models hosted on Hugging Face with a model served via a Domino endpoint through a single, interactive UI.
In this guide, you will:

- Set up API access and compute environment prerequisites
- Use Domino’s example code to build a Streamlit chatbot with selectable model backends
- Integrate Hugging Face and Domino endpoints
- Configure app startup and publish it in Domino
- Share the app with others in your organization
Before you begin, make sure you have the following in place:
Hugging Face API key

Get a Hugging Face API key to access models hosted on Hugging Face:

1. Create your account on Hugging Face.
2. Get your access token from Hugging Face.
3. In Domino, go to Account Settings > Environment Variables and add a user-level variable:
   - Name: HUGGING_FACE_API_TOKEN
   - Value: your key
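Once saved, the token is injected into your Domino runs as an environment variable. As a minimal sketch, you can confirm your code can see it (the variable name matches the one you created above):

```python
import os

# HUGGING_FACE_API_TOKEN is the user-level variable created in Account Settings
hf_token = os.environ.get("HUGGING_FACE_API_TOKEN")
if hf_token is None:
    print("HUGGING_FACE_API_TOKEN is not set; add it under Account Settings > Environment Variables")
```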
Domino compute environment

Your environment must include:

- Jupyter or JupyterLab
- The jupyter-server-proxy package
- Streamlit and necessary Python libraries

If you’re using a Domino Standard Environment (DSE), these are already included.
Domino endpoint

To demonstrate multi-model access, this guide uses:

- Hugging Face-hosted models, such as Llama 3.2 3B and Mistral 7B.
- A model deployed as a Domino-hosted prediction endpoint. This guide assumes you already have a Domino endpoint configured and available.
To run and publish your Streamlit chatbot, first configure a Domino environment with the required packages. Streamlit apps are launched with the streamlit run command and listen on port 8501 by default. You’ll configure this to use Domino’s required port 8888 later.
1. Go to Environments > Create Environment.
2. Enter a name and description.
3. Select a base image, such as the Domino Standard Environment.
4. In the Dockerfile, add the following if not already present:

   USER root
   RUN pip install --no-cache-dir \
       streamlit \
       jupyter-server-proxy \
       requests \
       pandas
   USER ubuntu
   RUN pip install transformers torch streamlit

5. Click Build and wait for the environment to finish building.
You now have a compute environment with Streamlit and proxy support, ready to run your chatbot app.
Next, you’ll create a new Domino project and begin building your chatbot app using Streamlit. You’ll start with a basic layout: a title, a sidebar with model selection, and a password field for your Hugging Face API key.
You’ll develop the app interactively inside a Jupyter or JupyterLab workspace and view it live in a browser tab using a URL generated by Domino.
Create the Domino Project
First, create the Domino Project you’ll use for your app:
1. Go to your Domino home page, or your Projects page, and choose Develop > Projects > Create Project.
2. Name your project something like streamlit-llm-chatbot and click Create.
3. In the project sidebar, go to Settings > Compute environment and select the environment you just created.
Create the app title and sidebar
Next, create the app title and sidebar for the app:
1. Launch a new Jupyter or JupyterLab workspace.
2. In the /mnt directory, create a new file named chatbot.py.
3. Add the starter code found in Domino’s example Streamlit LLM app repo.
4. Update the following values in your file:
   - DOMINO_ENDPOINT_URL: set to your Domino-hosted model endpoint
   - DOMINO_MODEL_ACCESS_TOKEN: your model’s access token

More information about Domino endpoints and tokens can be found in the Select Domino endpoint authorization mode documentation.
Get your app URL
To preview your app, construct the URL based on your Domino run context.
1. Replace your-domino-url with your actual Domino domain and run:

   echo -e "import os\nprint('https://your-domino-url/{}/{}/notebookSession/{}/proxy/8501/'.format(os.environ['DOMINO_PROJECT_OWNER'], os.environ['DOMINO_PROJECT_NAME'], os.environ['DOMINO_RUN_ID']))" | python3

   Reminder: Port 8501 is Streamlit’s default. You’ll remap this to 8888 when publishing.

2. Run the app in your workspace:

   streamlit run chatbot.py

3. Open the app in a browser tab using the generated URL.
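The echo pipeline above is equivalent to this small Python snippet, which you can run in a notebook cell inside your workspace (replace your-domino-url with your deployment’s domain):

```python
import os

def preview_url(domain="your-domino-url"):
    """Build the workspace preview URL from Domino's standard run environment variables."""
    return "https://{}/{}/{}/notebookSession/{}/proxy/8501/".format(
        domain,
        os.environ["DOMINO_PROJECT_OWNER"],
        os.environ["DOMINO_PROJECT_NAME"],
        os.environ["DOMINO_RUN_ID"],
    )
```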
You now have a live app preview with a sidebar and model selector ready for user interaction and backend logic.
Now that the UI is in place, you’ll add logic to handle chat input, store conversation history, and route model calls based on user selection.
The app will support both a Hugging Face-hosted model and a Domino-hosted endpoint. This step uses Streamlit session state to maintain chat history across user inputs.
Define model query functions
This app already includes logic to:
- Call Hugging Face-hosted models via InferenceClient.
- Fall back to direct HTTP calls if needed.
- Call your Domino-hosted model endpoint.

All function definitions are in the file, including:

- chat_with_hf(…)
- chat_with_api_direct(…)
- query_domino_endpoint(…)
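As a rough sketch of what two of these functions can look like; the request payloads and the basic-auth scheme for the Domino endpoint are assumptions here, so check chatbot.py in the example repo for the real implementations:

```python
import requests

def chat_with_api_direct(prompt, model_id, hf_token):
    """Fallback: call the Hugging Face Inference API over plain HTTP."""
    url = f"https://api-inference.huggingface.co/models/{model_id}"
    resp = requests.post(
        url,
        headers={"Authorization": f"Bearer {hf_token}"},
        json={"inputs": prompt},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()

def query_domino_endpoint(prompt, endpoint_url, access_token):
    """Call a Domino-hosted model endpoint; the payload shape depends on your model."""
    resp = requests.post(
        endpoint_url,
        auth=(access_token, access_token),  # token passed as basic auth (assumption)
        json={"data": {"prompt": prompt}},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()
```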
Capture user input and generate a response
User prompts are captured using st.chat_input, and responses are routed depending on the selected model. The logic makes sure that:

- Messages are appended to the session state
- Responses are streamed with a spinner
- Fallbacks are handled gracefully

The chatbot.py file has the full implementation details.
Sync all changes
Once your file is complete and tested interactively in your workspace, you’ll need to sync all changes.
1. Save all files, including chatbot.py and any updates to tokens or environment values.
2. If you’re ready to move on from development, stop your workspace. This makes sure Domino syncs all file changes back to your project.
3. Confirm that both files - chatbot.py and app.sh - exist in your project root.
Your app is now fully interactive, capturing input, maintaining history, and calling different LLMs based on user selection.
To publish your Streamlit app in Domino, you need a startup script that tells Domino how to launch the app and configure it to use the correct host and port.
- Problem: Streamlit defaults to running on port 8501, but Domino requires apps to listen on 0.0.0.0:8888.
- Solution: Override the default settings using a config.toml file generated by your app.sh startup script.
Create the app file
In the root of your project, create a file named app.sh with the following content:
#!/bin/bash
mkdir -p ~/.streamlit
cat << EOF > ~/.streamlit/config.toml
[browser]
gatherUsageStats = true
[server]
port = 8888
enableCORS = false
enableXsrfProtection = false
address = "0.0.0.0"
EOF
streamlit run chatbot.py
Verify: If needed, make the file executable by running chmod +x app.sh in a terminal inside your workspace.
Why this works
- The config.toml file tells Streamlit to use the correct port (8888) and host (0.0.0.0), which is required for apps published in Domino.
- Domino runs app.sh automatically when launching your app, with no manual intervention needed.
You’re now ready to publish the chatbot as a live Domino app.
Now that your app and startup script are in place, it’s time to publish your Streamlit chatbot using Domino’s built-in app publishing feature.
Domino apps run inside containers based on your project’s environment and are launched by executing the app.sh script you configured in the previous step.
Set access permissions
Choose how others can interact with your app:
- Set the permissions to Anyone can access to allow other users on your Domino deployment to view it.
- Leave Make globally discoverable selected to let others browse and find it from the Deploy > Apps view.
You can change these settings later if needed.
View the app
To see your fully interactive chatbot with model selection, live input, and responses from either the Hugging Face-hosted or Domino-hosted model:
1. Click Publish Domino App to deploy the app.
2. Once the app status changes to Running, click View App to open it in a new browser tab.
Your app is now live, discoverable, and running in Domino, accessible to others by URL or from the Launchpad.
Once your chatbot is running, you can easily share it with colleagues or test public access to ensure everything works as expected.
Access depends on the permissions you set during publishing. The Anyone can access permission allows any authenticated user on your Domino deployment to view the app through its URL.
For more information, see:

- Apps in Domino gives an overview of how apps work within the Domino ecosystem.
- Create and publish an app has instructions on creating and publishing your apps, customizing the app’s URL, and sharing apps with authorized users.
- Learn more about how apps run in Domino and what identity and permissions are used.