Suboptimal Use of On-Demand Instances in Non-Production Clusters
Matt Weingarten
Service Category: Compute
Cloud Provider: Databricks
Service Name: Databricks Clusters
Inefficiency Type: Suboptimal Pricing Model
Explanation

In Databricks, on-demand instances provide reliable capacity but come at a premium cost. For non-production workloads, such as development, testing, or exploratory analysis, high availability is often unnecessary. Spot instances run on the same instance types at a lower price, with the tradeoff that the cloud provider can reclaim them at short notice. If teams default to on-demand usage in lower environments, they may be incurring unnecessary compute costs. Using compute policies to limit on-demand usage keeps cluster configurations consistent and cost-efficient across environments.

Relevant Billing Model

Databricks charges for compute based on:

  • Databricks Units (DBUs): Vary by cluster configuration, such as node type and compute type; the on-demand vs. Spot choice generally does not change the DBU charge
  • Cloud Infrastructure Charges: Passed through from the cloud provider and dependent on the instance pricing model (on-demand vs. Spot)

On-demand nodes incur the highest infrastructure cost. Spot instances offer significant discounts but can be interrupted when the provider reclaims capacity, so they are best suited to interruption-tolerant workloads such as dev/test.
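
The two billing components can be combined into a rough hourly estimate. The sketch below is illustrative only: the function name, parameters, and every rate in `RATES` are hypothetical placeholders, not published prices, and it assumes the Spot discount applies to the infrastructure charge while the DBU charge stays the same.

```python
def hourly_cluster_cost(nodes, on_demand_nodes, dbu_per_node_hour,
                        dbu_rate, on_demand_price, spot_discount):
    """Estimate hourly cost as DBU charge plus cloud infrastructure charge."""
    if not 0 <= on_demand_nodes <= nodes:
        raise ValueError("on_demand_nodes must be between 0 and nodes")
    spot_nodes = nodes - on_demand_nodes

    # Assumption: DBUs are charged per node regardless of Spot vs. on-demand.
    dbu_charge = nodes * dbu_per_node_hour * dbu_rate

    # Assumption: Spot price is a flat fractional discount off on-demand.
    spot_price = on_demand_price * (1 - spot_discount)
    infra_charge = on_demand_nodes * on_demand_price + spot_nodes * spot_price
    return dbu_charge + infra_charge


# Hypothetical placeholder rates chosen for easy arithmetic, not real pricing.
RATES = dict(dbu_per_node_hour=1.0, dbu_rate=0.5,
             on_demand_price=1.0, spot_discount=0.5)

all_on_demand = hourly_cluster_cost(nodes=10, on_demand_nodes=10, **RATES)
mostly_spot = hourly_cluster_cost(nodes=10, on_demand_nodes=1, **RATES)
print(all_on_demand, mostly_spot)  # 15.0 10.5
```

With these placeholder inputs, moving nine of ten nodes to Spot lowers only the infrastructure portion of the bill; the DBU portion is unchanged.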

Detection
  • Query system tables (such as system.compute.clusters) to identify non-production clusters with high or full on-demand usage
  • Review workspace and cluster policies to determine if Spot usage is being enforced
  • Confirm whether the workloads running on these clusters are tolerant of interruptions
  • Evaluate whether the use of on-demand instances is justified for each environment
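
The first detection step could be sketched as follows, assuming cluster configurations have already been exported as dictionaries shaped like AWS cluster specs (`num_workers`, `aws_attributes.availability`, `aws_attributes.first_on_demand`). The `env` tag, the values in `NON_PROD_ENVS`, the 50% threshold, and the sample clusters are all hypothetical conventions invented for this example. Autoscaling clusters and other cloud providers are not handled.

```python
NON_PROD_ENVS = {"dev", "test", "sandbox"}  # hypothetical tag values


def on_demand_share(cluster):
    """Fraction of a fixed-size cluster's nodes placed on on-demand instances."""
    attrs = cluster.get("aws_attributes", {})
    total_nodes = cluster.get("num_workers", 0) + 1  # workers plus the driver

    # Assumption of this sketch: a missing availability is treated as on-demand.
    if attrs.get("availability", "ON_DEMAND") == "ON_DEMAND":
        return 1.0

    # The first `first_on_demand` nodes (driver included) are on-demand.
    first_on_demand = attrs.get("first_on_demand", 0)
    return min(first_on_demand, total_nodes) / total_nodes


def flag_clusters(clusters, threshold=0.5):
    """Return (name, share) for non-production clusters above the threshold."""
    flagged = []
    for cluster in clusters:
        env = cluster.get("custom_tags", {}).get("env")
        share = on_demand_share(cluster)
        if env in NON_PROD_ENVS and share > threshold:
            flagged.append((cluster["cluster_name"], share))
    return flagged


# Made-up sample data for illustration.
SAMPLE = [
    {"cluster_name": "dev-etl", "num_workers": 4,
     "aws_attributes": {"availability": "ON_DEMAND"},
     "custom_tags": {"env": "dev"}},
    {"cluster_name": "dev-notebook", "num_workers": 9,
     "aws_attributes": {"availability": "SPOT_WITH_FALLBACK",
                        "first_on_demand": 1},
     "custom_tags": {"env": "dev"}},
    {"cluster_name": "prod-api", "num_workers": 4,
     "aws_attributes": {"availability": "ON_DEMAND"},
     "custom_tags": {"env": "prod"}},
]

print(flag_clusters(SAMPLE))  # [('dev-etl', 1.0)]
```

Flagged clusters are candidates for review only; the remaining detection steps (policy review, interruption tolerance, justification) still need human judgment.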
Remediation
  • Implement compute policies that cap the percentage of on-demand nodes in relevant workloads
  • Update existing cluster configurations to prioritize Spot usage for dev/test workloads
  • Allow exceptions only when reliability or performance constraints are well documented
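
A compute policy along these lines might look like the sketch below, which builds a policy definition for AWS workspaces and serializes it to JSON. The function name and the default limit of one on-demand node are assumptions made for illustration. The sketch caps the number of on-demand nodes through `aws_attributes.first_on_demand` rather than expressing a percentage, and it should be checked against the compute policy reference before use.

```python
import json


def non_prod_spot_policy(max_on_demand_nodes=1):
    """Build a policy definition that steers clusters toward Spot capacity."""
    return {
        # Nodes beyond the first_on_demand ones use Spot, falling back to
        # on-demand only when Spot capacity is unavailable.
        "aws_attributes.availability": {
            "type": "fixed",
            "value": "SPOT_WITH_FALLBACK",
        },
        # Limit how many nodes (starting with the driver) may be on-demand.
        "aws_attributes.first_on_demand": {
            "type": "range",
            "maxValue": max_on_demand_nodes,
            "defaultValue": max_on_demand_nodes,
        },
    }


policy_json = json.dumps(non_prod_spot_policy(), indent=2)
print(policy_json)
```

Keeping the driver on-demand (`first_on_demand` of 1) is one common compromise: losing a Spot worker is usually recoverable, while losing the driver ends the cluster's work. Documented exceptions could be handled with a separate, less restrictive policy.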
Relevant Documentation
  • Spot Instances in Databricks
  • Compute Policies Overview