Publish a Streamlit LLM app in Domino

This guide walks you through building and deploying a chatbot powered by large language models (LLMs) using the Streamlit framework in Domino.

The app combines two models hosted on Hugging Face with a model served via a Domino endpoint, all behind a single, interactive UI.

What you’ll do

  • Set up API access and compute environment prerequisites

  • Use Domino’s example code to build a Streamlit chatbot with selectable model backends

  • Integrate Hugging Face and Domino endpoints

  • Configure app startup and publish it in Domino

  • Share the app with others in your organization

Prerequisites

Before you begin, make sure you have the following in place:

  • Hugging Face API key
    Get a Hugging Face API key to access models hosted on Hugging Face:

    1. Create your account on Hugging Face.

    2. Get your Access token from Hugging Face.

    3. In Domino, go to Account Settings > Environment Variables and add a user-level variable:

      1. Name: HUGGING_FACE_API_TOKEN

      2. Value: your Hugging Face access token (you can verify the variable is visible to your code with the snippet at the end of this section)

  • Domino compute environment
    Your environment must include the following. If you’re using a Domino Standard Environment (DSE), these are already included:

    1. Jupyter or JupyterLab

    2. The jupyter-server-proxy package

    3. Streamlit and necessary Python libraries

  • Domino endpoint
    To demonstrate multi-model access, this guide uses:

    1. Hugging Face-hosted models, such as Llama 3.2 3B and Mistral 7B.

    2. A model deployed as a Domino-hosted prediction endpoint. It is assumed you already have a Domino endpoint configured and available.
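
Once saved, the environment variable is injected into your workspaces and apps. A quick way to confirm the Hugging Face token from the first prerequisite is visible to your code (a minimal check, assuming the variable name above):

import os

# Read the Hugging Face token set in Account Settings > Environment Variables.
# A KeyError here means the variable isn't set for this workspace or app.
hf_token = os.environ["HUGGING_FACE_API_TOKEN"]
print(f"Token loaded ({len(hf_token)} characters)")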

Step 1: Set up the environment

To run and publish your Streamlit chatbot, first configure a Domino environment with the required packages. Streamlit apps are launched with the streamlit run command and listen on port 8501 by default. You’ll configure this to use Domino’s required port 8888 later.

  1. Go to Environments > Create Environment.

  2. Enter a name and description.

  3. Select a base image, such as the Domino Standard Environment.

  4. In the Dockerfile, add the following if not already present:

    USER root
    RUN pip install --no-cache-dir \
        streamlit \
        jupyter-server-proxy \
        requests \
        pandas \
        transformers \
        torch
    USER ubuntu
  5. Click Build and wait for the environment to finish building.

You now have a compute environment with Streamlit and proxy support, ready to run your chatbot app.

Step 2: Create the project and build the app UI

Next, you’ll create a new Domino project and begin building your chatbot app using Streamlit. You’ll start with a basic layout: a title, a sidebar with model selection, and a password field for your Hugging Face API key.

You’ll develop the app interactively inside a Jupyter or JupyterLab workspace and view it live in a browser tab using a URL generated by Domino.

Create the Domino Project

First, create the Domino Project you’ll use for your app:

  1. Go to your Domino home page, or your Projects page, and choose Develop > Projects > Create Project.

  2. Name your project something like streamlit-llm-chatbot and click Create.

  3. In the project sidebar, go to Settings > Compute environment and select the environment you just created.

Create the app title and sidebar

Next, create the title and sidebar for the app:

  1. Launch a new Jupyter or JupyterLab workspace.

  2. In the /mnt directory, create a new file named chatbot.py.

  3. Add the starter code from Domino’s example Streamlit LLM app repo.

  4. Update the following values in your file:

    1. DOMINO_ENDPOINT_URL: the URL of your Domino-hosted model endpoint

    2. DOMINO_MODEL_ACCESS_TOKEN: your model’s access token

More information about Domino endpoints and tokens can be found in the Select Domino endpoint authorization mode documentation.
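
For orientation, the starter layout looks roughly like the sketch below. This is a condensed illustration, not the repo’s exact code: the widget labels and model IDs are assumptions, and chatbot.py in the example repo is authoritative.

import streamlit as st

st.title("LLM Chatbot")

# Sidebar: pick a model backend and enter a Hugging Face API key.
with st.sidebar:
    model_choice = st.selectbox(
        "Model backend",
        [
            "meta-llama/Llama-3.2-3B-Instruct",    # Hugging Face-hosted
            "mistralai/Mistral-7B-Instruct-v0.3",  # Hugging Face-hosted
            "Domino endpoint",                     # Domino-hosted model
        ],
    )
    # type="password" masks the key as it's typed
    hf_token = st.text_input("Hugging Face API key", type="password")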

Get your app URL

To preview your app, construct the URL based on your Domino run context.

  1. Replace your-domino-url with your actual Domino domain and run the following (a Python version of the same snippet appears after this list):

    echo -e "import os\nprint('https://your-domino-url/{}/{}/notebookSession/{}/proxy/8501/'.format(os.environ['DOMINO_PROJECT_OWNER'], os.environ['DOMINO_PROJECT_NAME'], os.environ['DOMINO_RUN_ID']))" | python3

    Reminder: Port 8501 is Streamlit’s default. You’ll remap this to 8888 when publishing.

  2. Run the app in your workspace:

    streamlit run chatbot.py
  3. Open the app in a browser tab using the generated URL.
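
If you prefer to build the URL from Python directly, this is equivalent to the shell one-liner in step 1:

import os

# Python equivalent of the shell one-liner above.
# Replace your-domino-url with your deployment's domain.
url = "https://your-domino-url/{}/{}/notebookSession/{}/proxy/8501/".format(
    os.environ["DOMINO_PROJECT_OWNER"],
    os.environ["DOMINO_PROJECT_NAME"],
    os.environ["DOMINO_RUN_ID"],
)
print(url)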

You now have a live app preview with a sidebar and model selector ready for user interaction and backend logic.

Step 3: Add model logic and user interaction

Now that the UI is in place, you’ll add logic to handle chat input, store conversation history, and route model calls based on user selection.

The app will support both a Hugging Face-hosted model and a Domino-hosted endpoint. This step uses Streamlit session state to maintain chat history across user inputs.

Store chat messages in session state

This logic is already included in the chatbot.py code. It initializes and displays the conversation history.
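
If you’re reading along in the file, the pattern looks roughly like this (a condensed sketch; variable names in the actual code may differ):

import streamlit as st

# Initialize the chat history once per browser session.
if "messages" not in st.session_state:
    st.session_state.messages = []

# Streamlit reruns the script on every interaction, so replay the
# stored conversation each time to keep it on screen.
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])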

Define model query functions

This app already includes logic to:

  • Call Hugging Face-hosted models via InferenceClient.

  • Fall back to direct HTTP calls if needed.

  • Call your Domino-hosted model endpoint.

All function definitions are in the file, including:

  • chat_with_hf(...)

  • chat_with_api_direct(...)

  • query_domino_endpoint(...)
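
The sketches below show the general shape of these three functions. The function names come from the example file, but the signatures, payload shapes, and endpoint details are illustrative assumptions; chatbot.py has the real implementations.

import requests
from huggingface_hub import InferenceClient


def chat_with_hf(model_id, messages, hf_token):
    """Call a Hugging Face-hosted model through InferenceClient."""
    client = InferenceClient(model=model_id, token=hf_token)
    response = client.chat_completion(messages=messages, max_tokens=512)
    return response.choices[0].message.content


def chat_with_api_direct(model_id, prompt, hf_token):
    """Fallback: call the Hugging Face Inference API over plain HTTP."""
    url = f"https://api-inference.huggingface.co/models/{model_id}"
    headers = {"Authorization": f"Bearer {hf_token}"}
    resp = requests.post(url, headers=headers, json={"inputs": prompt}, timeout=60)
    resp.raise_for_status()
    return resp.json()[0]["generated_text"]


def query_domino_endpoint(endpoint_url, access_token, prompt):
    """Call a model deployed as a Domino endpoint."""
    resp = requests.post(
        endpoint_url,
        headers={"Authorization": f"Bearer {access_token}"},
        # The payload shape depends on how your model's predict function
        # is defined; {"data": {...}} is a common Domino convention.
        json={"data": {"prompt": prompt}},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json().get("result", resp.text)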

Capture user input and generate a response

User prompts are captured using st.chat_input, and responses are routed depending on the selected model. The logic makes sure that:

  • Messages are appended to the session state

  • Responses are streamed with a spinner

  • Fallbacks are handled gracefully

The chatbot.py file has the full implementation details.
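
Condensed, and continuing the illustrative names from the sketches above, the routing looks roughly like this:

# Capture the prompt; st.chat_input returns None until the user submits.
if prompt := st.chat_input("Ask me anything"):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    with st.chat_message("assistant"):
        with st.spinner("Thinking..."):
            try:
                if model_choice == "Domino endpoint":
                    reply = query_domino_endpoint(
                        DOMINO_ENDPOINT_URL, DOMINO_MODEL_ACCESS_TOKEN, prompt
                    )
                else:
                    reply = chat_with_hf(
                        model_choice, st.session_state.messages, hf_token
                    )
            except Exception:
                # Fall back to the direct HTTP call if the client call fails.
                reply = chat_with_api_direct(model_choice, prompt, hf_token)
        st.markdown(reply)

    st.session_state.messages.append({"role": "assistant", "content": reply})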

Sync all changes

Once your file is complete and tested interactively in your workspace, you’ll need to sync all changes.

  1. Save all files, including chatbot.py and any updates to tokens or environment values.

  2. If you’re ready to move on from development, stop your workspace. This makes sure Domino syncs all file changes back to your project.

  3. Confirm that chatbot.py exists in your project root. (You’ll add the app.sh startup script in the next step.)

Your app is now fully interactive, capturing input, maintaining history, and calling different LLMs based on user selection.

Step 4: Configure the app file and prepare for publishing

To publish your Streamlit app in Domino, you need a startup script that tells Domino how to launch the app and configure it to use the correct host and port.

  • Problem: Streamlit defaults to running on port 8501, but Domino requires apps to listen on 0.0.0.0:8888.

  • Solution: Override the default settings with a config.toml file written by your app.sh startup script.

Create the app file

In the root of your project, create a file named app.sh with the following content:

#!/bin/bash

mkdir -p ~/.streamlit

cat << EOF > ~/.streamlit/config.toml
[browser]
gatherUsageStats = true

[server]
port = 8888
enableCORS = false
enableXsrfProtection = false
address = "0.0.0.0"
EOF

streamlit run chatbot.py

Verify: If needed, make the file executable by running chmod +x app.sh in a terminal inside your workspace.

Why this works

  • The config.toml file tells Streamlit to use the correct port (8888) and host (0.0.0.0), which is required for apps published in Domino.

  • Domino runs app.sh automatically when launching your app; no manual intervention is needed.

You’re now ready to publish the chatbot as a live Domino app.

Step 5: Publish the app

Now that your app and startup script are in place, it’s time to publish your Streamlit chatbot using Domino’s built-in app publishing feature.

Domino apps run inside containers based on your project’s environment and are launched by executing the app.sh script you configured in the previous step.

Open the publishing interface

  • In your project sidebar, go to Deployments > App.

  • Fill in an informative title and description, for example, LLM Chatbot with Hugging Face and Domino Endpoints.

Set access permissions

Choose how others can interact with your app:

  • Set the permissions to Anyone can access to allow other users on your Domino deployment to view it.

  • Leave Make globally discoverable selected to let others browse and find it from the Deploy > Apps view.

You can change these settings later if needed.

View the app

To see your fully interactive chatbot with model selection, live input, and responses from either the Hugging Face-hosted or Domino-hosted model:

  • Click Publish Domino App to deploy the app.

  • Once the app status changes to Running, click View App to open it in a new browser tab.

Your app is now live, discoverable, and running in Domino, accessible to others by URL or from the Launchpad.

Step 6: Share the app

Once your chatbot is running, you can easily share it with colleagues or test public access to ensure everything works as expected.

Access depends on the permissions you set during publishing. The Anyone can access permission allows any authenticated user on your Domino deployment to view the app through its URL.

Copy the app link

  • From the Deployments > App screen, click Copy App Link.

  • Share the link with others who have access to your Domino instance.

Test access

Open the app URL in an incognito or logged-out browser window to confirm access settings.

If you’re prompted for a Hugging Face token, enter it in the sidebar to test API access as a new user.

Browse or discover more apps

  • From the top nav, go to Deploy > Apps to explore other published tools within your organization.

You’ve now successfully published and shared an LLM-powered Streamlit chatbot with interactive inputs and backend model integration.

Next steps