This inefficiency occurs when Kubernetes Jobs or CronJobs running on EKS Fargate leave completed or failed pod objects in the cluster indefinitely. Although the workload execution has finished, AWS keeps the underlying Fargate microVM running to allow log inspection and final status checks. As a result, vCPU, memory, and networking resources remain allocated and billable until the pod object is explicitly deleted.
Over time, large numbers of stale Job pods can generate direct compute charges as well as consume ENIs and IP addresses, leading to both unnecessary spend and capacity pressure. This pattern is common in batch-processing and scheduled workloads that lack automated cleanup.
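Where clusters allow it, the simplest guard-rail is Kubernetes' own TTL mechanism: setting `ttlSecondsAfterFinished` on the Job spec asks the TTL controller to delete finished Jobs (and their pods) automatically, which releases the Fargate capacity behind them. A minimal sketch using the official `kubernetes` Python client; the namespace, Job name, and image are hypothetical:

```python
from kubernetes import client, config

# Load credentials from the local kubeconfig (in-cluster config also works).
config.load_kube_config()

job = client.V1Job(
    api_version="batch/v1",
    kind="Job",
    metadata=client.V1ObjectMeta(name="nightly-batch"),  # hypothetical name
    spec=client.V1JobSpec(
        # Delete the Job object (and its pods) 5 minutes after it finishes,
        # so the Fargate microVM behind each pod stops billing.
        ttl_seconds_after_finished=300,
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[
                    client.V1Container(
                        name="batch",
                        image="example.com/batch-worker:latest",  # hypothetical image
                    )
                ],
            )
        ),
    ),
)

client.BatchV1Api().create_namespaced_job(namespace="batch", body=job)
```

The same field can be patched onto existing CronJob templates, so cleanup happens without any external sweeper.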
This inefficiency occurs when workloads with predictable, long-running compute usage continue to run entirely on on-demand pricing instead of leveraging Committed Use Discounts (CUDs). For stable environments, such as production services or continuously running batch workloads, failing to apply CUDs results in materially higher compute spend without any operational benefit. The inefficiency is driven by pricing choice, not resource overuse.
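The gap is easy to size with back-of-the-envelope arithmetic. In the sketch below, the on-demand rate and the discount percentages are illustrative assumptions, not current list prices:

```python
# Illustrative only: the rate and discounts are assumptions, not GCP list prices.
vcpu_hours_per_month = 8 * 730          # 8 vCPUs running continuously
on_demand_rate = 0.031611               # assumed $/vCPU-hour

on_demand = vcpu_hours_per_month * on_demand_rate
cud_1yr = on_demand * (1 - 0.37)        # ~37% discount, assumed
cud_3yr = on_demand * (1 - 0.55)        # ~55% discount, assumed

print(f"on-demand: ${on_demand:,.2f}/mo")
print(f"1-yr CUD:  ${cud_1yr:,.2f}/mo")
print(f"3-yr CUD:  ${cud_3yr:,.2f}/mo")
```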
This inefficiency occurs when production and non-production applications are hosted within the same App Service Plan. Production workloads often require higher availability, performance, or scaling characteristics, driving the plan toward larger or higher-cost SKUs. When non-production workloads share that plan, they inherit the higher cost structure even though their availability and performance requirements are typically much lower, resulting in unnecessary spend.
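A simplified cost model illustrates the effect; the per-instance prices and instance counts below are assumptions chosen purely for illustration:

```python
# Illustrative monthly per-instance prices; assumptions, not Azure list prices.
P1V3 = 124.0   # premium instance the production apps require
B1 = 13.0      # basic instance adequate for dev/test

# Shared plan: combined prod + non-prod load forces three premium instances.
shared = 3 * P1V3

# Split plans: prod fits on two premium instances, non-prod on one basic plan.
split = 2 * P1V3 + 1 * B1

print(f"shared plan: ${shared:.0f}/mo, split plans: ${split:.0f}/mo, "
      f"saving ${shared - split:.0f}/mo")
```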
This inefficiency occurs when pod resource requests—often inflated by sidecar containers—push total memory or CPU just over a Fargate sizing boundary. Because Fargate adds mandatory system overhead and only supports fixed resource combinations, small incremental increases can force a pod into a much larger billing tier. This results in materially higher cost for marginal additional resource needs, especially in workloads that run continuously or at scale.
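The cliff is easiest to see by reproducing the rounding logic. The sketch below models the behaviour AWS documents (fixed vCPU/memory combinations plus roughly 256 MiB of system overhead); treat the table as an assumption to verify against current documentation:

```python
# Sketch of EKS Fargate's size rounding; the combination table and the
# 256 MiB overhead reflect AWS docs at the time of writing.
COMBOS = [  # (vCPU, allowed memory values in GiB)
    (0.25, [0.5, 1, 2]),
    (0.5,  [1, 2, 3, 4]),
    (1,    list(range(2, 9))),
    (2,    list(range(4, 17))),
    (4,    list(range(8, 31))),
]
OVERHEAD_GIB = 0.25  # memory Fargate reserves for Kubernetes components

def billed_size(req_vcpu: float, req_mem_gib: float) -> tuple[float, float]:
    """Smallest Fargate vCPU/memory combination covering the pod's requests."""
    need_mem = req_mem_gib + OVERHEAD_GIB
    for vcpu, mems in COMBOS:
        if vcpu < req_vcpu:
            continue
        for mem in mems:
            if mem >= need_mem:
                return vcpu, mem
    raise ValueError("requests exceed the largest Fargate configuration")

# 1 vCPU / 7.5 GiB fits the 1 vCPU tier, but adding a 0.5 GiB sidecar
# pushes the pod into the 2 vCPU tier, a much larger billed size.
print(billed_size(1, 7.5))  # (1, 8)
print(billed_size(1, 8.0))  # (2, 9)
```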
This inefficiency occurs when Provisioned Concurrency is enabled for Lambda functions that do not require consistently low latency or steady traffic. In such cases, reserved capacity remains allocated and billed during idle periods, creating ongoing cost without proportional performance or business benefit. This is distinct from standard Lambda execution charges, which are purely usage-based.
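The idle cost is straightforward to estimate. In the sketch below, the per-GB-second rate, memory size, and concurrency level are illustrative assumptions:

```python
# Illustrative only: the rate is an assumption; check current Lambda pricing.
PC_RATE = 0.0000041667      # assumed $/GB-second for Provisioned Concurrency
MEM_GB = 1.0                # configured function memory
PROVISIONED = 10            # provisioned concurrent executions

seconds_per_month = 730 * 3600
# Provisioned Concurrency bills for the time it is configured,
# whether or not any requests arrive.
idle_cost = PC_RATE * MEM_GB * PROVISIONED * seconds_per_month
print(f"~${idle_cost:,.2f}/mo even at zero traffic")
```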
This inefficiency occurs when an Azure Savings Plan is scoped too narrowly relative to where eligible compute usage actually runs. When usage is spread across multiple subscriptions or fluctuates significantly (for example, development and test workloads that are frequently stopped and started), a narrowly scoped Savings Plan may not consistently find enough eligible usage to consume the full commitment. As a result, part of the committed hourly spend goes unused while other eligible workloads outside the scope continue to incur on-demand charges.
Azure supports broader scoping options—such as Management Group or Shared scope—that allow the commitment to be applied across a larger pool of eligible compute. Selecting an overly restrictive scope can therefore directly drive underutilization, even when sufficient total usage exists across the tenant.
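A deliberately simplified hourly model shows how narrow scoping wastes commitment on both sides (real benefit application converts usage at discounted rates, so treat this as directional):

```python
# Illustrative numbers showing how a narrow scope strands commitment.
commitment = 10.0           # $/hour committed under the savings plan
in_scope_usage = 6.0        # eligible usage/hour inside the scoped subscription
out_of_scope_usage = 5.0    # eligible usage/hour in other subscriptions

applied = min(commitment, in_scope_usage)
wasted = commitment - applied           # paid for but never consumed

print(f"stranded commitment: ${wasted:.2f}/hr; "
      f"usage a Shared scope could have discounted: ${out_of_scope_usage:.2f}/hr")
```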
When Integration Runtimes are configured with the default “Auto Resolve” region setting, Azure may automatically provision them in a region different from the data sources or sinks. For example, an environment deployed in West Europe may run pipelines in East US. This causes unnecessary cross-region data transfer, increasing networking costs and pipeline latency. The inefficiency often goes unnoticed because data transfer costs are billed separately from pipeline compute charges.
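One remedy is to pin the managed IR to the data's region rather than relying on Auto Resolve. A sketch using the `azure-mgmt-datafactory` SDK; all identifiers are placeholders, and the model and method names reflect the SDK as commonly documented, so verify them against your installed version:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    IntegrationRuntimeComputeProperties,
    IntegrationRuntimeResource,
    ManagedIntegrationRuntime,
)

# Placeholder identifiers: substitute your own subscription and factory.
client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Pin the managed (Azure) IR to the region where the data actually lives,
# instead of the default Auto Resolve behaviour.
ir = IntegrationRuntimeResource(
    properties=ManagedIntegrationRuntime(
        compute_properties=IntegrationRuntimeComputeProperties(
            location="West Europe"  # placeholder region
        )
    )
)
client.integration_runtimes.create_or_update(
    "my-resource-group", "my-data-factory", "WestEuropeIR", ir
)
```

Activities then reference the pinned runtime instead of the default AutoResolveIntegrationRuntime, keeping compute and data in the same region.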
Newer AWS Glue versions, such as Glue 5.0, include significant performance optimizations for **Python-based** ETL jobs, often reducing runtime by 10–60%. These improvements do not require any code changes, making version upgrades a simple and impactful optimization. When jobs remain on older runtimes such as Glue 3.0 or 4.0, they execute more slowly, consume more DPUs, and incur unnecessary cost. Additionally, Glue 5.0 offers more worker types (larger standard workers and memory-optimized workers), which can provide additional performance gains for some jobs. This inefficiency does not apply to Scala-based jobs, which do not benefit from the same performance uplift.
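Bumping the runtime version can be scripted. A sketch using `boto3` (the job name is hypothetical; `update_job` replaces the whole job definition, so the field handling below is deliberately simplified and should be adapted to each job's configuration):

```python
import boto3

glue = boto3.client("glue")
job_name = "nightly-etl"  # hypothetical job name

# update_job replaces the job definition, so start from the current one,
# drop fields that are read-only or mutually exclusive, and bump the version.
job = glue.get_job(JobName=job_name)["Job"]
for field in ("Name", "CreatedOn", "LastModifiedOn",
              "AllocatedCapacity", "MaxCapacity"):
    job.pop(field, None)
job["GlueVersion"] = "5.0"

glue.update_job(JobName=job_name, JobUpdate=job)
```

Running the upgraded job against a representative dataset first confirms the runtime (and DPU-hour) reduction before rolling the change out fleet-wide.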
Many organizations purchase Software Assurance or subscription-based Windows and SQL Server licenses that entitle them to use Azure Hybrid Benefit (AHB). However, if the setting is not applied on eligible resources, Azure continues charging pay-as-you-go rates that already include Microsoft licensing costs. This oversight results in paying twice: once for the on-premises license and once for the built-in Azure license. The inefficiency often goes unnoticed because licensing configurations are not centrally validated or enforced. Enabling AHB can reduce costs by up to 40% for Windows Server VMs and up to 30% for SQL Databases.
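On virtual machines, the benefit is enabled by setting the `licenseType` property. A minimal sketch with the `azure-mgmt-compute` SDK; resource names are placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient
from azure.mgmt.compute.models import VirtualMachineUpdate

# Placeholder identifiers: substitute your own.
client = ComputeManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Setting licenseType to Windows_Server tells Azure to stop billing the
# Windows license component and apply your own qualifying license instead.
poller = client.virtual_machines.begin_update(
    "my-resource-group",
    "my-windows-vm",
    VirtualMachineUpdate(license_type="Windows_Server"),
)
poller.result()
```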
When a Dataflow pipeline fails—often due to dependency issues, misconfigurations, or data format mismatches—its worker instances may remain active temporarily until the service terminates them. In some cases, misconfigured jobs, stuck retries, or delayed monitoring can cause workers to continue running for extended periods. These idle workers consume vCPU, memory, and storage resources without performing useful work. The inefficiency is compounded in large or high-frequency batch environments where repeated failures can leave many orphaned workers running concurrently.
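A small sweeper that cancels jobs active far longer than any healthy run should take can cap the damage. A sketch against the Dataflow `v1b3` REST API via `googleapiclient`; the project, region, and six-hour threshold are placeholder assumptions:

```python
from datetime import datetime, timedelta, timezone

from googleapiclient.discovery import build

PROJECT, REGION = "my-project", "us-central1"  # placeholders
MAX_AGE = timedelta(hours=6)                   # assumed longest healthy run

dataflow = build("dataflow", "v1b3")
jobs_api = dataflow.projects().locations().jobs()

resp = jobs_api.list(projectId=PROJECT, location=REGION, filter="ACTIVE").execute()
for job in resp.get("jobs", []):
    started = datetime.fromisoformat(job["createTime"].replace("Z", "+00:00"))
    if datetime.now(timezone.utc) - started > MAX_AGE:
        # Request cancellation so the service tears down the workers.
        jobs_api.update(
            projectId=PROJECT,
            location=REGION,
            jobId=job["id"],
            body={"requestedState": "JOB_STATE_CANCELLED"},
        ).execute()
```

Pairing a sweep like this with alerting on repeated failures addresses both the symptom (orphaned workers) and the cause (jobs that never succeed).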