Domino Apps can automatically scale based on resource usage to balance cost with performance. Autoscaling prevents performance loss during periods of high demand while avoiding unnecessary compute costs during low demand.
Domino uses the Kubernetes Horizontal Pod Autoscaler (HPA) v2 to manage scaling.
You decide whether to enable autoscaling when you publish or launch an App. Autoscaling is disabled by default.
When enabled, HPA dynamically and seamlessly adjusts the number of Kubernetes pods for the App while it is running. Scaling occurs without requiring you to restart or republish the App.
Scaling parameters
Autoscaling decisions are based on the following scaling parameters:
| Scaling parameter | Description |
|---|---|
| Maximum Pods | Maximum number of pods the App can scale up to. |
| CPU % target | Target CPU utilization across pods. |
| Memory % target | Target memory utilization across pods. |
| Scale up delay or Scale down delay | Delays scaling changes to prevent rapid, repeated adjustments. |
| Enable session affinity | Routes user requests to the same pod. |
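
To make the interaction between the targets and Maximum Pods concrete, the following is a minimal sketch of the replica calculation described in the Kubernetes HPA documentation (desired pods = ceil(current pods × current utilization / target utilization), taking the largest result when several metrics are configured). The function name `desired_pods` and its arguments are hypothetical, and the sketch omits HPA details such as tolerance and pod readiness; it is not Domino's implementation.

```python
import math


def desired_pods(current_pods, utilization, targets, max_pods, min_pods=1):
    """Estimate the pod count an HPA-style controller would request.

    utilization and targets map a metric name (for example "cpu") to a
    percentage. All names here are illustrative assumptions.
    """
    proposals = []
    for metric, target in targets.items():
        # Core HPA rule: scale the current pod count by observed / target.
        ratio = utilization[metric] / target
        proposals.append(math.ceil(current_pods * ratio))
    # With several metrics, the largest proposal wins.
    wanted = max(proposals)
    # Clamp between the one-pod floor and the Maximum Pods setting.
    return max(min_pods, min(max_pods, wanted))


if __name__ == "__main__":
    print(desired_pods(2, {"cpu": 90, "memory": 40},
                       {"cpu": 60, "memory": 70}, max_pods=5))
```

In this sketch, two pods running at 90% CPU against a 60% target produce a request for three pods, while the memory metric alone would have asked for only two.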
Session affinity
Session affinity routes all of a viewer’s requests to the same pod for the duration of their session. Without it, traffic may be distributed across multiple pods, which can disrupt apps that maintain in-memory state or require continuity across requests.
- Enable session affinity for frameworks like R/Shiny that rely on stateful sessions.
- Leave session affinity disabled for most other frameworks to improve responsiveness and scale-down efficiency.
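
The difference between the two modes can be illustrated with a toy router. This is a sketch only: the `Router` class, the pod names, and the round-robin policy are assumptions made for illustration, and the actual routing mechanism Domino uses may differ.

```python
import itertools


class Router:
    """Toy request router over a fixed list of pod names (hypothetical)."""

    def __init__(self, pods, session_affinity):
        self.pods = list(pods)
        self.session_affinity = session_affinity
        self._round_robin = itertools.cycle(self.pods)
        self._pinned = {}  # session id -> pod name

    def route(self, session_id):
        if not self.session_affinity:
            # No affinity: every request goes to the next pod in turn.
            return next(self._round_robin)
        if session_id not in self._pinned:
            # First request of a session: pick a pod and remember it.
            self._pinned[session_id] = next(self._round_robin)
        return self._pinned[session_id]


if __name__ == "__main__":
    sticky = Router(["pod-a", "pod-b"], session_affinity=True)
    print([sticky.route("alice") for _ in range(3)])
    spread = Router(["pod-a", "pod-b"], session_affinity=False)
    print([spread.route("alice") for _ in range(3)])
```

With affinity enabled, every request from the same session lands on the pod that holds its in-memory state. With affinity disabled, consecutive requests alternate between pods, which is harmless for stateless apps but can break stateful ones.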
Different frameworks benefit from different autoscaling settings. Use these guidelines to balance performance and cost:
- Rely on HPA to avoid thrashing, or rapid oscillation between scale events.
- Remember that autoscaling Apps always run at least one pod.
- Scale-up typically finishes within about 20 seconds, so you can safely select leaner, lower-cost hardware tiers than you would for non-autoscaling Apps.
- Select smaller hardware tiers and let the autoscaler scale up or down as needed.
- Keep the default autoscaling settings for most frameworks.
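
As a rough illustration of how a scale-down delay prevents thrashing, the sketch below assumes the delay behaves like the HPA v2 scale-down stabilization window, where the controller acts on the highest recommendation seen during the window. The `ScaleDownDelay` class and its method names are hypothetical, and the mapping to Domino's Scale down delay setting is an assumption.

```python
from collections import deque


class ScaleDownDelay:
    """Hold the highest recent recommendation before scaling down."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._history = deque()  # (timestamp, recommended_pods)

    def apply(self, now, recommended_pods):
        self._history.append((now, recommended_pods))
        # Drop recommendations that are older than the window.
        while self._history and self._history[0][0] < now - self.window_seconds:
            self._history.popleft()
        # Acting on the maximum delays scale-down but never delays scale-up.
        return max(pods for _, pods in self._history)


if __name__ == "__main__":
    delay = ScaleDownDelay(window_seconds=60)
    for t, rec in [(0, 4), (30, 1), (61, 1), (70, 3)]:
        print(t, rec, delay.apply(t, rec))
```

A brief dip in load inside the window leaves the pod count unchanged, so a short lull followed by renewed traffic does not trigger a scale-down and an immediate scale-up.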
R/Shiny
R/Shiny apps have unique scaling needs because they are single-threaded. Use these settings to maximize stability and support high user counts:
- Treat horizontal autoscaling as essential, because R/Shiny apps are single-threaded.
- Enable session affinity to avoid transient connectivity issues for viewers.
- Use smaller hardware tiers with a higher number of maximum pods.
- Choose a low-CPU instance type that can be aggressively scaled horizontally.
- These settings support arbitrarily high user counts.
Next steps
- Persist Data: Save and manage app data across sessions.
- Usage and Resource Monitoring: Monitor app workloads.
