Use Spark on Domino

Apache Spark is a fast, general-purpose cluster computing system. It provides a unified analytics engine for large-scale data processing and machine learning.

Domino gives you two ways to use Spark. You can dynamically provision an on-demand Spark cluster orchestrated by Domino, or you can connect to an existing Spark cluster outside of Domino.
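As a rough illustration of how the same code can target either kind of cluster, the sketch below chooses a Spark master URL at run time and only then builds a session. The environment variable name `SPARK_MASTER_URL`, the helper names, and the `local[*]` fallback are assumptions made for this example, not Domino settings. On an on-demand cluster, Domino may configure the session for you, in which case a plain `SparkSession.builder.getOrCreate()` could be all that is needed.

```python
import os


def spark_master_url(env=os.environ, default="local[*]"):
    """Return the Spark master URL to connect to.

    SPARK_MASTER_URL is a hypothetical variable used for illustration;
    when it is unset, fall back to a local in-process Spark master.
    """
    return env.get("SPARK_MASTER_URL", default)


def build_session(app_name="domino-spark-example"):
    """Create (or reuse) a SparkSession pointed at the chosen master."""
    # Imported lazily so the helper above works without PySpark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .master(spark_master_url())
        .getOrCreate()
    )
```

Keeping the master URL outside the code means a notebook or job does not need to change when it moves between an on-demand cluster and an external one.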

Hadoop and Spark

Domino projects can use their compute environment to work with Hadoop applications.

On-demand Spark

Use Domino to dynamically provision and orchestrate a Spark cluster directly on the infrastructure that backs the Domino instance.

Use spot instances

Spot instances reduce compute costs by using discounted cloud capacity. They’re ideal for fault-tolerant, stateless, or short-lived workloads like batch jobs, distributed training runs, and parallel analytics tasks.
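A workload is a good fit for spot capacity when it can resume after an interruption without redoing finished work. The sketch below shows one common way to get that property: record progress in a checkpoint file after each item. The function name, the checkpoint format, and the `fail_at` parameter (which simulates an interruption) are all hypothetical and are not part of Domino.

```python
import json
import os


def run_batch(items, checkpoint_path, process, fail_at=None):
    """Process items in order, resuming from the last checkpoint if present.

    Returns the results produced during this run only.
    """
    done = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)["done"]

    results = []
    for i in range(done, len(items)):
        if fail_at is not None and i == fail_at:
            # Stand-in for a spot instance being reclaimed mid-run.
            raise RuntimeError("simulated spot interruption")
        results.append(process(items[i]))

        # Write to a temporary file, then rename, so the checkpoint
        # is never left half-written.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"done": i + 1}, f)
        os.replace(tmp_path, checkpoint_path)
    return results
```

If a run is interrupted after two of four items, calling `run_batch` again with the same checkpoint path processes only the remaining two. In practice the checkpoint would need to live on storage that outlives the instance.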