In non-production environments such as development, testing, and experimentation, many teams default to on-demand nodes out of habit or caution. However, Databricks has built-in support for running spot instances safely. When a cluster is configured for spot with fallback, its cluster manager detects spot evictions and automatically replaces the evicted nodes, using on-demand capacity when spot capacity is unavailable. This makes spot compute relatively low-risk for these workloads.
Failing to enable spot for non-critical or short-lived workloads leads to unnecessary overspend. The inefficiency often arises because spot usage is not enabled by default and must be explicitly selected in cluster settings. In teams that don’t revisit infrastructure defaults, or where FinOps guardrails are missing, the result is a persistent gap between actual spend and what the same workloads would cost if they were safely moved to spot.
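As an illustration of what "explicitly selected in cluster settings" can look like, here is a minimal sketch of a cluster specification for an AWS-backed workspace. It assumes the `aws_attributes` fields of the Databricks Clusters API (`first_on_demand`, `availability`, `spot_bid_price_percent`); the helper name `non_prod_cluster_spec`, the cluster name, the node type, and the runtime version are placeholders chosen for the example, not recommendations. Other cloud providers use different attribute blocks.

```python
import json


def non_prod_cluster_spec(name, num_workers,
                          node_type_id="i3.xlarge",            # placeholder node type
                          spark_version="15.4.x-scala2.12"):   # placeholder runtime
    """Build a cluster spec with an on-demand driver and spot workers.

    Hypothetical helper: it only assembles a dictionary and does not
    call any Databricks API.
    """
    return {
        "cluster_name": name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        # Shut idle non-production clusters down automatically.
        "autotermination_minutes": 30,
        "aws_attributes": {
            # Keep the first node (the driver) on on-demand capacity.
            "first_on_demand": 1,
            # Use spot for the remaining nodes, falling back to on-demand
            # when spot capacity cannot be acquired.
            "availability": "SPOT_WITH_FALLBACK",
            # Bid up to 100% of the on-demand price.
            "spot_bid_price_percent": 100,
        },
    }


spec = non_prod_cluster_spec("dev-exploration", num_workers=4)
print(json.dumps(spec, indent=2))
```

Keeping the driver on-demand is a deliberate choice in this sketch: workers can be replaced after an eviction, while losing the driver typically ends the running job.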
Databricks clusters are billed based on the underlying virtual machines used for driver and worker nodes, with Databricks’ own DBU charges applied on top. When on-demand instances are selected, VM charges follow standard cloud provider rates. If spot instances are enabled (where available), the VM portion of the cost can be significantly lower, often 60–90% cheaper. Databricks includes native failover that automatically replaces preempted spot nodes with on-demand nodes to maintain job continuity, minimizing the impact of evictions.
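To make the size of the gap concrete, the following sketch estimates the hourly VM cost of a cluster with and without spot workers. The rate ($0.40 per node-hour), the cluster size (8 workers), and the 70% discount are illustrative assumptions rather than quoted prices, and the function `hourly_vm_cost` is hypothetical. It covers only the virtual machine charges, not Databricks DBU charges, and assumes the driver stays on-demand.

```python
def hourly_vm_cost(num_workers, on_demand_rate, spot_discount=0.0):
    """Estimate hourly VM cost for one on-demand driver plus workers.

    spot_discount is the fractional discount applied to worker nodes
    (0.0 means all workers are on-demand). DBU charges are not included.
    """
    if not 0.0 <= spot_discount <= 1.0:
        raise ValueError("spot_discount must be between 0 and 1")
    driver_cost = on_demand_rate  # driver is assumed to stay on-demand
    worker_cost = num_workers * on_demand_rate * (1.0 - spot_discount)
    return driver_cost + worker_cost


# Illustrative inputs only; substitute real rates for your instance type.
baseline = hourly_vm_cost(num_workers=8, on_demand_rate=0.40)
with_spot = hourly_vm_cost(num_workers=8, on_demand_rate=0.40, spot_discount=0.70)
savings_pct = 100.0 * (baseline - with_spot) / baseline

print(f"on-demand: ${baseline:.2f}/h, spot workers: ${with_spot:.2f}/h, "
      f"VM savings: {savings_pct:.1f}%")
```

With these inputs the overall VM saving comes out at roughly 62%, a little below the 70% worker discount, because the on-demand driver is not discounted.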