Autoscaling is a core mechanism for aligning compute supply with workload demand, yet it's often underutilized or misconfigured. In older clusters or ad-hoc environments, autoscaling may be disabled by default or set with tight min/max worker limits that prevent scaling. This can lead to persistent overprovisioning (and wasted cost during idle periods) or underperformance due to insufficient parallelism and job queuing. Poor autoscaling settings are especially common in manually created all-purpose clusters, where idle resources often go unnoticed.
Overly wide autoscaling ranges can also introduce instability: Databricks may rapidly scale up to the upper limit when demand briefly spikes, producing sudden cost increases or degraded performance as nodes churn. Understanding workload characteristics is key to tuning autoscaling appropriately.
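As a concrete illustration, the sketch below builds a cluster spec using the `autoscale` block from the Databricks Clusters API, with a small validation check for the misconfigurations discussed here. The cluster name, runtime version, node type, and min/max bounds are illustrative values, not recommendations:

```python
# Sketch of a Databricks Clusters API payload with autoscaling enabled.
# All concrete values (name, runtime, node type, bounds) are hypothetical;
# tune min/max to the parallelism your workload actually needs.
cluster_spec = {
    "cluster_name": "etl-autoscaling-example",   # hypothetical name
    "spark_version": "13.3.x-scala2.12",         # example runtime
    "node_type_id": "i3.xlarge",                 # example node type
    "autoscale": {
        "min_workers": 2,   # floor: covers baseline load without idling many nodes
        "max_workers": 8,   # ceiling: bounds cost if demand briefly spikes
    },
    "autotermination_minutes": 30,  # shut down idle all-purpose clusters
}

def validate_autoscale(spec: dict) -> None:
    """Reject the misconfigurations described above: disabled or inverted ranges."""
    autoscale = spec.get("autoscale")
    if autoscale is None:
        raise ValueError("autoscaling disabled: cluster is fixed-size")
    if autoscale["min_workers"] >= autoscale["max_workers"]:
        raise ValueError("min_workers >= max_workers leaves no room to scale")

validate_autoscale(cluster_spec)  # passes for the spec above
```

A check like `validate_autoscale` can run in CI against cluster definitions to catch fixed-size or inverted ranges before they reach production.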
Databricks bills by Databricks Units (DBUs), consumed per node for each hour it runs. When autoscaling is misconfigured — such as by fixing the worker count (Min = Max), setting narrow ranges, or disabling it entirely — clusters may remain overprovisioned during idle periods or underprovisioned during peak demand, resulting in inefficient compute spend or job failures.
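To make the cost impact tangible, the toy calculation below compares a cluster fixed at peak size against one that scales with demand. The DBU consumption rate, $/DBU price, and hourly demand profile are all made-up numbers for illustration only:

```python
# Illustrative cost comparison: fixed-size vs autoscaled cluster over 24 hours.
# DBU rate and price are hypothetical; substitute your SKU's actual figures.
DBU_PER_NODE_HOUR = 2.0   # assumed DBU consumption per node per hour
PRICE_PER_DBU = 0.55      # assumed $/DBU

# Hypothetical hourly worker demand: mostly 1 worker, one 4-hour peak needing 8.
demand = [1] * 10 + [8] * 4 + [1] * 10

def daily_cost(workers_per_hour):
    """Total $ for a day, given the worker count in each hour."""
    return sum(workers_per_hour) * DBU_PER_NODE_HOUR * PRICE_PER_DBU

fixed_cost = daily_cost([8] * 24)        # provisioned for peak all day
autoscaled_cost = daily_cost(demand)     # tracks demand hour by hour

print(f"fixed at peak: ${fixed_cost:.2f}")
print(f"autoscaled:    ${autoscaled_cost:.2f}")
```

Under these assumed numbers the fixed-at-peak cluster costs several times more than the autoscaled one, which is the overprovisioning-during-idle pattern described above; real savings depend entirely on how spiky the workload is.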