Many organizations default to Intel-based EC2 instances due to familiarity or assumptions about workload compatibility. However, AWS offers AMD and Graviton-based alternatives that often deliver significantly better price-performance for general-purpose and compute-optimized workloads.
By not testing workloads across available architectures, teams may continue paying a premium for Intel instances even when no specific performance or compatibility benefit exists. Over time, this results in unnecessary compute spend across development, staging, and even production environments.
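As a first step, comparable sizes across the three architectures can be price-checked programmatically before any benchmarking. The sketch below, using boto3's Price List API, assumes us-east-1 pricing, Linux, shared tenancy, and that m5 / m6a / m6g are the relevant families for the workload; price alone proves nothing about performance, so candidates still need to be load-tested.

```python
"""Sketch: compare on-demand prices for comparable Intel (m5), AMD (m6a),
and Graviton (m6g) sizes via the AWS Price List API.

Assumptions: us-east-1, Linux, shared tenancy, no pre-installed software.
"""
import json
import boto3

# The Price List API itself is only served from a few regions, e.g. us-east-1.
pricing = boto3.client("pricing", region_name="us-east-1")

CANDIDATES = ["m5.xlarge", "m6a.xlarge", "m6g.xlarge"]  # Intel / AMD / Graviton


def on_demand_usd_per_hour(instance_type: str) -> float:
    """Return the on-demand USD/hour price for the assumed region and OS."""
    resp = pricing.get_products(
        ServiceCode="AmazonEC2",
        Filters=[
            {"Type": "TERM_MATCH", "Field": "instanceType", "Value": instance_type},
            {"Type": "TERM_MATCH", "Field": "location", "Value": "US East (N. Virginia)"},
            {"Type": "TERM_MATCH", "Field": "operatingSystem", "Value": "Linux"},
            {"Type": "TERM_MATCH", "Field": "tenancy", "Value": "Shared"},
            {"Type": "TERM_MATCH", "Field": "preInstalledSw", "Value": "NA"},
            {"Type": "TERM_MATCH", "Field": "capacitystatus", "Value": "Used"},
        ],
    )
    product = json.loads(resp["PriceList"][0])
    term = next(iter(product["terms"]["OnDemand"].values()))
    dimension = next(iter(term["priceDimensions"].values()))
    return float(dimension["pricePerUnit"]["USD"])


for itype in CANDIDATES:
    print(f"{itype}: ${on_demand_usd_per_hour(itype):.4f}/hr on-demand")
```

From there, representative workloads can be benchmarked on the cheaper families before any fleet-wide change.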
Underutilized Snowflake warehouses occur when a workload is assigned a larger warehouse size than necessary. For example, a workload that could efficiently execute on a Medium (M) warehouse may be running on a Large (L) or Extra Large (XL) warehouse. This leads to unnecessary credit consumption without a proportional performance benefit. Underutilization is often driven by early provisioning decisions that were never reassessed, or by a desire for marginal speed improvements that do not justify the increased operational cost.
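One way to surface downsizing candidates is to look for warehouses whose recent queries finished quickly with no queuing and no spilling. The sketch below reads SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY; the 14-day window and the thresholds in the final loop are illustrative assumptions, not sizing rules.

```python
"""Sketch: flag warehouses whose recent queries ran fast with no queuing or
spilling -- likely candidates for a smaller size."""
import os
import snowflake.connector

SQL = """
SELECT warehouse_name,
       ANY_VALUE(warehouse_size)                          AS size,
       COUNT(*)                                           AS queries,
       AVG(total_elapsed_time) / 1000                     AS avg_elapsed_s,
       AVG(queued_overload_time) / 1000                   AS avg_queued_s,
       SUM(IFF(bytes_spilled_to_local_storage > 0, 1, 0)) AS spilled_queries
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('day', -14, CURRENT_TIMESTAMP())
  AND warehouse_size IS NOT NULL
GROUP BY warehouse_name
"""

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
)
for name, size, queries, elapsed_s, queued_s, spilled in conn.cursor().execute(SQL):
    # Fast queries with no queuing and no spilling suggest the size could drop a notch.
    if elapsed_s < 10 and queued_s < 1 and spilled == 0:
        print(f"{name} ({size}): {queries} queries, avg {elapsed_s:.1f}s -- try one size down")
```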
Inefficient execution of repeated queries occurs when common query patterns are frequently executed without optimization. Even if individual executions are successful, repeated inefficiencies compound overall compute consumption and credit costs.
By analyzing Snowflake's parameterized query metrics, organizations can identify top repeated queries and optimize them for better performance, resource usage, and cost-efficiency.
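A minimal sketch of that analysis, assuming the QUERY_PARAMETERIZED_HASH column is available in your account's SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY view, groups executions by parameterized hash and ranks the patterns by total elapsed time:

```python
"""Sketch: rank repeated query patterns by total elapsed time using the
parameterized query hash in ACCOUNT_USAGE.QUERY_HISTORY."""
import os
import snowflake.connector

SQL = """
SELECT query_parameterized_hash,
       ANY_VALUE(query_text)          AS sample_query,
       COUNT(*)                       AS executions,
       SUM(total_elapsed_time) / 1000 AS total_elapsed_s,
       AVG(bytes_scanned) / 1e9       AS avg_gb_scanned
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
  AND query_parameterized_hash IS NOT NULL
GROUP BY query_parameterized_hash
ORDER BY total_elapsed_s DESC
LIMIT 20
"""

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
)
for row in conn.cursor().execute(SQL):
    print(row)  # candidates for rewriting, result caching, or pre-aggregation
```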
Inefficient pipeline refresh scheduling occurs when data refresh operations are executed more frequently, or with more compute resources, than the actual downstream business usage requires.
Without aligning refresh frequency and resource allocation to true data consumption patterns (e.g., report access rates in Tableau or Sigma), organizations can waste substantial Snowflake credits maintaining underutilized or rarely accessed data assets.
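One rough proxy for this comparison is to count write activity versus read activity per table in SNOWFLAKE.ACCOUNT_USAGE.ACCESS_HISTORY (available on Enterprise Edition and above). The sketch below uses an assumed 30-day window and a 10:1 write-to-read ratio as the flag threshold; both are arbitrary starting points.

```python
"""Sketch: find tables that are refreshed far more often than they are read."""
import os
import snowflake.connector

SQL = """
WITH reads AS (
    SELECT f.value:"objectName"::string AS table_name, COUNT(*) AS read_queries
    FROM snowflake.account_usage.access_history,
         LATERAL FLATTEN(input => base_objects_accessed) f
    WHERE query_start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
    GROUP BY 1
), writes AS (
    SELECT f.value:"objectName"::string AS table_name, COUNT(*) AS write_queries
    FROM snowflake.account_usage.access_history,
         LATERAL FLATTEN(input => objects_modified) f
    WHERE query_start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
    GROUP BY 1
)
SELECT w.table_name, w.write_queries, COALESCE(r.read_queries, 0) AS read_queries
FROM writes w
LEFT JOIN reads r ON w.table_name = r.table_name
WHERE w.write_queries > 10 * COALESCE(r.read_queries, 0)
ORDER BY w.write_queries DESC
"""

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
)
for table_name, writes, reads in conn.cursor().execute(SQL):
    print(f"{table_name}: {writes} refresh queries vs {reads} reads in 30 days")
```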
Inefficiency arises when materialized views (MVs) are either underused (the view is maintained but rarely queried, so background refresh costs outweigh any compute savings) or misused (defined over highly volatile base tables, where constant refreshes erode the benefit).
Proper evaluation of workload patterns and strategic use of MVs are critical to achieving a net cost benefit.
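A practical first pass is to rank MVs by what they cost to maintain and weigh that against how often they actually serve queries. The sketch below reads SNOWFLAKE.ACCOUNT_USAGE.MATERIALIZED_VIEW_REFRESH_HISTORY (the standard serverless-history column names are assumed); it shows only the maintenance side, so usage, including automatic query rewrites, still has to be assessed separately.

```python
"""Sketch: rank materialized views by refresh (maintenance) credits over 30 days."""
import os
import snowflake.connector

SQL = """
SELECT database_name, schema_name, table_name,
       SUM(credits_used) AS refresh_credits_30d
FROM snowflake.account_usage.materialized_view_refresh_history
WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY 1, 2, 3
ORDER BY refresh_credits_30d DESC
"""

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
)
for db, schema, mv, credits in conn.cursor().execute(SQL):
    print(f"{db}.{schema}.{mv}: {credits:.2f} credits spent on refreshes")
```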
Organizations may experience unnecessary Snowflake spend due to inefficient query-to-warehouse routing, lack of dynamic warehouse scaling, or failure to consolidate workloads during low-usage periods. Third-party platforms offer solutions to address these inefficiencies:
Choosing between these solutions depends heavily on the organization's internal capabilities and desired balance between control and automation.
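As a rough illustration of the kind of decision these platforms automate, the sketch below resizes a single warehouse around a fixed business-hours window. The warehouse name, hours, and sizes are placeholders; commercial routing and scaling engines make this call continuously from live load metrics rather than from a clock.

```python
"""Sketch: resize a warehouse based on a fixed business-hours window.
All names and hours are placeholders, not a recommended policy."""
import os
from datetime import datetime, timezone
import snowflake.connector

WAREHOUSE = "REPORTING_WH"          # hypothetical warehouse name
BUSINESS_HOURS_UTC = range(13, 23)  # assumption: roughly 9am-7pm US Eastern

target_size = "LARGE" if datetime.now(timezone.utc).hour in BUSINESS_HOURS_UTC else "XSMALL"

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
)
# Run this on a schedule (cron, Lambda, etc.) to shrink capacity off-hours.
conn.cursor().execute(f"ALTER WAREHOUSE {WAREHOUSE} SET WAREHOUSE_SIZE = '{target_size}'")
```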
Excessive Auto-Clustering costs occur when tables experience frequent and large-scale modifications ("high churn"), causing Snowflake to constantly recluster data. This leads to significant and often hidden compute consumption for maintenance tasks, especially when table structures or loading patterns are not optimized. Poor clustering key choices, unordered data loads, or frequent full-table replacements are common drivers of unnecessary Auto-Clustering activity.
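Auto-Clustering spend is visible in SNOWFLAKE.ACCOUNT_USAGE.AUTOMATIC_CLUSTERING_HISTORY, so the highest-churn tables can be identified before deciding whether to change clustering keys, reorder loads, or suspend clustering. A minimal sketch:

```python
"""Sketch: surface the tables driving Auto-Clustering credits."""
import os
import snowflake.connector

SQL = """
SELECT database_name, schema_name, table_name,
       SUM(credits_used)         AS clustering_credits_30d,
       SUM(num_rows_reclustered) AS rows_reclustered_30d
FROM snowflake.account_usage.automatic_clustering_history
WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY 1, 2, 3
ORDER BY clustering_credits_30d DESC
LIMIT 20
"""

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
)
for row in conn.cursor().execute(SQL):
    print(row)  # review clustering keys and load patterns for the top offenders
```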
Retention of stale data occurs when old records that are no longer needed remain in active Snowflake tables. Without lifecycle policies or regular purging, tables steadily accumulate outdated data.
Because Snowflake’s compute charges are tied to how much data is scanned, retaining large volumes of inactive or irrelevant data can drive up both storage and query execution costs unnecessarily.
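Where retention requirements are known, purging can be automated with a scheduled task. The sketch below is illustrative only: the task, warehouse, and table names and the 400-day cutoff are hypothetical and must be set from your own lifecycle and compliance rules.

```python
"""Sketch: a daily purge task for records past their useful life.
All object names and the 400-day cutoff are hypothetical."""
import os
import snowflake.connector

CREATE_TASK = """
CREATE TASK IF NOT EXISTS analytics.maintenance.purge_old_web_events
  WAREHOUSE = maintenance_wh
  SCHEDULE = 'USING CRON 0 3 * * * UTC'
AS
  DELETE FROM analytics.events.web_events
  WHERE event_date < DATEADD('day', -400, CURRENT_DATE())
"""

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
)
cur = conn.cursor()
cur.execute(CREATE_TASK)
# Tasks are created suspended; resume to start the schedule.
cur.execute("ALTER TASK analytics.maintenance.purge_old_web_events RESUME")
```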
Ingesting a large number of small files (e.g., files smaller than 10 MB) using Snowpipe can lead to disproportionately high costs due to the per-file overhead charges. Each file, regardless of its size, incurs the same overhead fee, making the ingestion of numerous small files less cost-effective. Additionally, small files can increase the load on Snowflake's metadata and ingestion infrastructure, potentially impacting performance.
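Average file size per pipe can be estimated from SNOWFLAKE.ACCOUNT_USAGE.PIPE_USAGE_HISTORY to spot pipes dominated by small files; the usual remedy is to batch or consolidate files upstream before they reach the stage. A sketch, reusing the 10 MB threshold from the example above:

```python
"""Sketch: estimate average file size per Snowpipe to spot small-file pipes."""
import os
import snowflake.connector

SQL = """
SELECT pipe_name,
       SUM(files_inserted) AS files_30d,
       SUM(bytes_inserted) / NULLIF(SUM(files_inserted), 0) / 1e6 AS avg_file_mb,
       SUM(credits_used)   AS credits_30d
FROM snowflake.account_usage.pipe_usage_history
WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY pipe_name
HAVING SUM(bytes_inserted) / NULLIF(SUM(files_inserted), 0) / 1e6 < 10
ORDER BY credits_30d DESC
"""

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
)
for pipe, files, avg_mb, credits in conn.cursor().execute(SQL):
    print(f"{pipe}: {files} files, avg {avg_mb:.1f} MB/file, {credits:.2f} credits")
```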
Snowflake automatically maintains previous versions of data when tables are modified or deleted. For tables with high churn—meaning frequent INSERT, UPDATE, DELETE, or MERGE operations—this can cause a significant buildup of historical snapshot data, even if the active data size remains small.
This hidden accumulation leads to elevated storage costs, particularly when Time Travel retention periods are long and data change rates are high. Often, teams are unaware of how much snapshot data is being stored behind the scenes.
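SNOWFLAKE.ACCOUNT_USAGE.TABLE_STORAGE_METRICS breaks storage down into active, Time Travel, and Fail-safe bytes, which makes the hidden snapshot share straightforward to quantify per table. The sketch below flags tables whose historical bytes exceed their live data:

```python
"""Sketch: flag tables where Time Travel + Fail-safe storage exceeds active storage."""
import os
import snowflake.connector

SQL = """
SELECT table_catalog, table_schema, table_name,
       active_bytes      / POWER(1024, 3) AS active_gb,
       time_travel_bytes / POWER(1024, 3) AS time_travel_gb,
       failsafe_bytes    / POWER(1024, 3) AS failsafe_gb
FROM snowflake.account_usage.table_storage_metrics
WHERE NOT deleted
  AND time_travel_bytes + failsafe_bytes > active_bytes
ORDER BY (time_travel_bytes + failsafe_bytes) DESC
LIMIT 20
"""

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
)
for row in conn.cursor().execute(SQL):
    print(row)  # candidates for a shorter DATA_RETENTION_TIME_IN_DAYS or transient tables
```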