Lack of Workload-Specific Cluster Segmentation
Compute | Cloud Provider: Databricks | Service Name: Databricks Compute | Inefficiency Type: Inefficient Configuration

Running varied workload types (e.g., ETL pipelines, ML training, SQL dashboards) on the same cluster introduces inefficiencies. Each workload has different runtime characteristics, scaling needs, and performance sensitivities. When mixed together, resource contention can degrade job performance, increase cost, and obscure cost attribution.

ETL jobs may overprovision memory, while lightweight SQL queries may trigger unnecessary cluster scale-ups. Job failures or retries may increase due to contention, and queued jobs can further inflate runtime costs. Without clear segmentation, teams lose the ability to tune environments for specific use cases or monitor workload-specific efficiency.
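As a minimal sketch of workload segmentation, the example below uses the Databricks Jobs 2.1 REST API (via Python's requests library) to give an ETL task and a lightweight SQL refresh task their own job clusters instead of sharing one all-purpose cluster. The workspace URL, runtime version, node types, and worker counts are placeholder assumptions to be tuned per workload.

```python
import os
import requests

# Placeholder workspace host and token -- replace with your own values.
HOST = os.environ.get("DATABRICKS_HOST", "https://example.cloud.databricks.com")
TOKEN = os.environ["DATABRICKS_TOKEN"]

# Each task gets a dedicated job cluster sized for its workload profile,
# rather than contending for resources on a shared cluster.
job_spec = {
    "name": "segmented-workloads-example",
    "tasks": [
        {
            "task_key": "nightly_etl",
            "notebook_task": {"notebook_path": "/Jobs/etl_pipeline"},
            "new_cluster": {
                "spark_version": "14.3.x-scala2.12",   # assumed runtime
                "node_type_id": "i3.2xlarge",          # memory-heavy nodes for ETL (assumption)
                "num_workers": 8,
            },
        },
        {
            "task_key": "dashboard_refresh",
            "notebook_task": {"notebook_path": "/Jobs/sql_refresh"},
            "new_cluster": {
                "spark_version": "14.3.x-scala2.12",
                "node_type_id": "m5.xlarge",           # small nodes for light SQL (assumption)
                "num_workers": 1,
            },
        },
    ],
}

resp = requests.post(
    f"{HOST}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=job_spec,
)
resp.raise_for_status()
print("Created job:", resp.json().get("job_id"))
```

Keeping each workload on its own ephemeral job cluster also makes cost attribution straightforward, since every cluster's spend maps to exactly one pipeline.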

Poorly Configured Autoscaling on Databricks Clusters
Compute | Cloud Provider: Databricks | Service Name: Databricks Compute | Inefficiency Type: Inefficient Configuration

Autoscaling is a core mechanism for aligning compute supply with workload demand, yet it's often underutilized or misconfigured. In older clusters or ad-hoc environments, autoscaling may be disabled by default or set with tight min/max worker limits that prevent scaling. This can lead to persistent overprovisioning (and wasted cost during idle periods) or underperformance due to insufficient parallelism and job queuing. Poor autoscaling settings are especially common in manually created all-purpose clusters, where idle resources often go unnoticed.

Overly wide autoscaling ranges can also introduce instability: Databricks may rapidly scale up to the upper limit when demand briefly spikes, leading to sudden cost increases or degraded performance. Understanding workload characteristics is key to tuning autoscaling appropriately.
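As a rough sketch of a bounded autoscaling configuration, the example below creates a cluster through the Clusters 2.0 REST API with an explicit min/max worker range and an auto-termination window. The node type, worker limits, and termination timeout are assumptions and should be set from the workload's observed parallelism and idle patterns.

```python
import os
import requests

HOST = os.environ.get("DATABRICKS_HOST", "https://example.cloud.databricks.com")
TOKEN = os.environ["DATABRICKS_TOKEN"]

# Bounded autoscaling: min_workers covers baseline demand, max_workers caps
# spend during brief spikes; autotermination reclaims idle all-purpose clusters.
cluster_spec = {
    "cluster_name": "etl-autoscaling-example",
    "spark_version": "14.3.x-scala2.12",   # assumed runtime
    "node_type_id": "i3.xlarge",            # assumed node type
    "autoscale": {"min_workers": 2, "max_workers": 8},
    "autotermination_minutes": 30,
}

resp = requests.post(
    f"{HOST}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=cluster_spec,
)
resp.raise_for_status()
print("Created cluster:", resp.json().get("cluster_id"))
```

A narrow, workload-informed range like this avoids both persistent overprovisioning and runaway scale-ups to an overly generous maximum.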

Overuse of Photon in Non-Production Workloads
Compute | Cloud Provider: Databricks | Service Name: Databricks Compute | Inefficiency Type: Inefficient Configuration

Photon is frequently enabled by default across Databricks workspaces, including for development, testing, and low-concurrency workloads. In these non-production contexts, job runtimes are typically shorter, SLAs are relaxed or nonexistent, and performance gains offer little business value.

Enabling Photon in these environments can inflate DBU costs substantially without meaningful runtime improvements. By not differentiating cluster configurations between production and non-production, organizations may pay a premium for workloads that could run just as efficiently on standard compute.

Cluster policies can be used to restrict Photon usage to explicitly tagged production workloads, helping enforce cost-conscious defaults and reduce unnecessary spend.
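As one possible way to enforce this, the sketch below creates a cluster policy through the Cluster Policies 2.0 REST API that pins the runtime engine to standard compute (disabling Photon) and fixes an environment tag for cost attribution. The policy name and tag value are illustrative assumptions.

```python
import json
import os
import requests

HOST = os.environ.get("DATABRICKS_HOST", "https://example.cloud.databricks.com")
TOKEN = os.environ["DATABRICKS_TOKEN"]

# Policy definition: fix runtime_engine to STANDARD so clusters created under
# this policy cannot enable Photon, and stamp a non-production environment tag.
policy_definition = {
    "runtime_engine": {"type": "fixed", "value": "STANDARD"},
    "custom_tags.environment": {"type": "fixed", "value": "non-production"},
}

resp = requests.post(
    f"{HOST}/api/2.0/policies/clusters/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "name": "non-production-no-photon",          # assumed policy name
        "definition": json.dumps(policy_definition),  # definition is a JSON string
    },
)
resp.raise_for_status()
print("Created policy:", resp.json().get("policy_id"))
```

Assigning this policy to development and test workspaces (and reserving a Photon-enabled policy for tagged production workloads) keeps the cost-conscious configuration the default rather than relying on individual users to opt out.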
