Excessive Automated Backup Retention in Cloud SQL
Databases
Cloud Provider
GCP
Service Name
Cloud SQL
Inefficiency Type
Excessive Data Retention

This inefficiency occurs when automated Cloud SQL backups are retained longer than required by recovery objectives or governance needs. Because backups accumulate over the retention window (and can grow quickly for high-change databases), excessive retention drives ongoing backup storage charges without improving practical recoverability.

Mixing Production and Non-Production Applications in the Same App Service Plan
Compute
Cloud Provider
Azure
Service Name
Azure App Service Plans
Inefficiency Type
Inefficient environment isolation

This inefficiency occurs when production and non-production applications are hosted within the same App Service Plan. Production workloads often require higher availability, performance, or scaling characteristics, driving the plan toward larger or higher-cost SKUs. When non-production workloads share that plan, they inherit the higher cost structure even though their availability and performance requirements are typically much lower, resulting in unnecessary spend.
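
A simple inventory check can surface shared plans. The sketch below assumes each app carries an environment tag and that the inventory has already been exported as tuples; the app names, plan names, and tag values are hypothetical.

```python
from collections import defaultdict

# Hypothetical inventory rows: (app name, App Service Plan, environment tag).
apps = [
    ("shop-api", "plan-premium", "prod"),
    ("shop-api-staging", "plan-premium", "staging"),
    ("docs-site", "plan-basic", "dev"),
    ("docs-preview", "plan-basic", "test"),
]

PROD_ENVS = {"prod", "production"}  # assumed tag values for production


def mixed_plans(inventory):
    """Return plans hosting both production and non-production apps."""
    kinds_by_plan = defaultdict(set)
    for _app, plan, env in inventory:
        kinds_by_plan[plan].add(env.lower() in PROD_ENVS)
    # A plan is mixed when it holds both True (prod) and False (non-prod).
    return sorted(p for p, kinds in kinds_by_plan.items() if kinds == {True, False})


print(mixed_plans(apps))
```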

Fargate Resource Rounding and Per-Pod Overhead Driving Step-Up Costs
Compute
Cloud Provider
AWS
Service Name
AWS EKS
Inefficiency Type
Suboptimal resource sizing

This inefficiency occurs when pod resource requests—often inflated by sidecar containers—push total memory or CPU just over a Fargate sizing boundary. Because Fargate adds mandatory system overhead and only supports fixed resource combinations, small incremental increases can force a pod into a much larger billing tier. This results in materially higher cost for marginal additional resource needs, especially in workloads that run continuously or at scale.
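
The step-up effect can be illustrated with a small calculation. The sketch below assumes a 256 MiB per-pod overhead and a table of supported vCPU and memory combinations; both are assumptions to verify against current AWS documentation, and `billed_size` is a hypothetical helper, not an AWS API.

```python
# Assumed Fargate vCPU -> memory (GiB) options; verify against AWS docs.
FARGATE_SIZES = {
    0.25: [0.5, 1, 2],
    0.5: [1, 2, 3, 4],
    1: list(range(2, 9)),
    2: list(range(4, 17)),
    4: list(range(8, 31)),
    8: list(range(16, 61, 4)),
    16: list(range(32, 121, 8)),
}
POD_OVERHEAD_GIB = 0.25  # assumed 256 MiB reserved per pod


def billed_size(cpu_request: float, mem_request_gib: float):
    """Return the first (vCPU, GiB) combination that fits the pod,
    searching in ascending vCPU and then ascending memory order."""
    needed_mem = mem_request_gib + POD_OVERHEAD_GIB
    for vcpu in sorted(FARGATE_SIZES):
        if vcpu < cpu_request:
            continue
        for mem in FARGATE_SIZES[vcpu]:
            if mem >= needed_mem:
                return vcpu, mem
    raise ValueError("request exceeds the largest listed combination")


print(billed_size(0.25, 1.75))  # fits the 0.25 vCPU tier
print(billed_size(0.25, 2.0))   # overhead pushes it into the next tier
```

Under these assumptions, raising the memory request from 1.75 GiB to 2 GiB moves the pod from a 0.25 vCPU / 2 GiB size to a 0.5 vCPU / 3 GiB size, which is the kind of boundary crossing described above.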

Unnecessary Lambda Provisioned Concurrency on Low-Utilization Functions
Compute
Cloud Provider
AWS
Service Name
AWS Lambda
Inefficiency Type
Unused reserved capacity

This inefficiency occurs when Provisioned Concurrency is enabled for Lambda functions that do not require consistently low latency or steady traffic. In such cases, reserved capacity remains allocated and billed during idle periods, creating ongoing cost without proportional performance or business benefit. This is distinct from standard Lambda execution charges, which are purely usage-based.
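
A utilization figure makes the idle share visible. The sketch below assumes concurrency samples have already been collected for a function (for example, from monitoring data); the sample values and the 20% threshold are illustrative, not AWS guidance.

```python
def pc_utilization(concurrency_samples, provisioned: int) -> float:
    """Average share of provisioned concurrency actually in use."""
    if provisioned <= 0 or not concurrency_samples:
        return 0.0
    # Executions above the provisioned amount do not count toward utilization.
    used = sum(min(s, provisioned) for s in concurrency_samples)
    return used / (provisioned * len(concurrency_samples))


# Hypothetical per-minute concurrent executions for one function.
samples = [0, 0, 1, 0, 2, 0, 0, 1]
utilization = pc_utilization(samples, provisioned=10)

LOW_UTILIZATION = 0.20  # illustrative threshold
print(f"{utilization:.1%}", utilization < LOW_UTILIZATION)
```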

Retained Azure Backup Data After Resource Decommissioning
Storage
Cloud Provider
Azure
Service Name
Azure Backup
Inefficiency Type
Orphaned backup data

This inefficiency occurs when a protected resource (such as a virtual machine, database, or file share) is decommissioned without explicitly stopping backup protection. In these cases, Azure Backup continues to retain existing recovery points in the vault until the retention policy expires. Although the source resource no longer exists, backup storage remains allocated and billable, resulting in unnecessary ongoing costs.

This pattern is common when infrastructure is deleted outside of a formal decommissioning process or when backup ownership is unclear.
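
Orphaned items can be found by cross-checking backup items against the current resource inventory. The sketch below assumes both lists have already been exported; the record shape (`name`, `source_id`) and the shortened resource IDs are hypothetical.

```python
def orphaned_backup_items(backup_items, live_resource_ids):
    """Return backup items whose source resource no longer exists."""
    # Compare case-insensitively, since resource ID casing can vary by source.
    live = {rid.lower() for rid in live_resource_ids}
    return [item for item in backup_items if item["source_id"].lower() not in live]


# Hypothetical exports: backup items from a vault and a live resource list.
backup_items = [
    {"name": "vm-web-01", "source_id": "/subscriptions/s1/vm/web-01"},
    {"name": "vm-old-07", "source_id": "/subscriptions/s1/vm/old-07"},
]
live_resource_ids = ["/subscriptions/S1/vm/web-01"]

orphans = [i["name"] for i in orphaned_backup_items(backup_items, live_resource_ids)]
print(orphans)
```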

Underutilized Azure Savings Plan Due to Overly Narrow Scope
Compute
Cloud Provider
Azure
Service Name
Azure Virtual Machines
Inefficiency Type
Commitment underutilization due to scope configuration

This inefficiency occurs when an Azure Savings Plan is scoped too narrowly relative to where eligible compute usage actually runs. When usage is spread across multiple subscriptions or fluctuates significantly (for example, development and test workloads that are frequently stopped and started), a narrowly scoped Savings Plan may not consistently find enough eligible usage to consume the full commitment. As a result, part of the committed hourly spend goes unused while other eligible workloads outside the scope continue to incur on-demand charges.

Azure supports broader scoping options—such as Management Group or Shared scope—that allow the commitment to be applied across a larger pool of eligible compute. Selecting an overly restrictive scope can therefore directly drive underutilization, even when sufficient total usage exists across the tenant.
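
The effect of scope on utilization can be shown with a small worked example. The sketch below uses hypothetical subscription names and hourly eligible usage figures, and treats utilization as consumed commitment divided by total commitment; it simplifies how Azure actually applies benefits.

```python
def commitment_utilization(hourly_usage, commitment: float) -> float:
    """Share of an hourly commitment consumed, averaged over all hours."""
    if commitment <= 0 or not hourly_usage:
        return 0.0
    consumed = sum(min(u, commitment) for u in hourly_usage)
    return consumed / (commitment * len(hourly_usage))


# Hypothetical eligible usage per hour, by subscription.
usage = {
    "sub-prod": [6.0, 6.0, 6.0, 6.0],
    "sub-dev": [4.0, 0.0, 4.0, 0.0],  # stopped every other hour
}
commitment = 4.0

# Narrow scope: the plan only sees the dev subscription.
narrow = commitment_utilization(usage["sub-dev"], commitment)

# Shared scope: the plan sees usage pooled across both subscriptions.
pooled = [sum(hour) for hour in zip(*usage.values())]
shared = commitment_utilization(pooled, commitment)

print(narrow, shared)
```

In this example the same commitment is half used when scoped to the intermittent subscription and fully used when scoped across both.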

Using High-Cost Models for Low-Complexity Tasks
AI
Cloud Provider
GCP
Service Name
GCP Vertex AI
Inefficiency Type
Overpowered Model Selection

Vertex AI workloads often include low-complexity tasks such as classification, routing, keyword extraction, metadata parsing, document triage, or summarization of short, simple text. These operations do **not** require the advanced multimodal reasoning or long-context capabilities of larger Gemini model tiers. When organizations default to a single high-end model (such as Gemini Ultra or Pro) across all applications, they incur elevated token costs for work that could be served efficiently by **Gemini Flash** or smaller task-optimized variants. This mismatch is a common pattern in early deployments where model selection is driven by convenience rather than workload-specific requirements. Over time, it adds spend without a measurable improvement in results.
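
The cost gap scales linearly with token volume, which a short calculation makes concrete. All figures below (request volume, tokens per request, and per-million-token prices) are hypothetical, and the sketch ignores any difference between input and output token pricing.

```python
def monthly_token_cost(requests: int, tokens_per_request: int,
                       price_per_million: float) -> float:
    """Monthly cost for a workload at a flat per-million-token price."""
    return requests * tokens_per_request * price_per_million / 1_000_000


# Hypothetical workload: 2M short classification requests per month.
requests, tokens = 2_000_000, 500

large = monthly_token_cost(requests, tokens, price_per_million=10.0)  # assumed price
small = monthly_token_cost(requests, tokens, price_per_million=0.5)   # assumed price
print(large, small, large - small)
```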

Using High-Cost Bedrock Models for Low-Complexity Tasks
AI
Cloud Provider
AWS
Service Name
AWS Bedrock
Inefficiency Type
Overpowered Model Selection

Many Bedrock workloads involve low-complexity tasks such as tagging, classification, routing, entity extraction, keyword detection, document triage, or lightweight summarization. These tasks **do not require** the advanced reasoning or generative capabilities of higher-cost models such as Claude 3 Opus or comparable premium models. When organizations default to a high-end model across all applications—or fail to periodically reassess model selection—they pay elevated costs for work that could be performed effectively by smaller, lower-cost models such as Claude Haiku or other compact model families. This inefficiency becomes more pronounced in high-volume, repetitive workloads where token counts scale quickly.
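
One way to avoid a single default model is to route by task type. The sketch below is a minimal illustration: the tier labels and task names are hypothetical placeholders, and mapping them to real model identifiers is left to the application's own configuration.

```python
# Hypothetical tier labels; map these to real model IDs in your own config.
SMALL_MODEL = "small-model"
LARGE_MODEL = "large-model"

# Task types assumed to be low-complexity, following the examples above.
LOW_COMPLEXITY_TASKS = {
    "tagging", "classification", "routing", "entity_extraction",
    "keyword_detection", "document_triage", "light_summarization",
}


def select_model(task_type: str) -> str:
    """Route known low-complexity tasks to the smaller tier; default to large."""
    return SMALL_MODEL if task_type in LOW_COMPLEXITY_TASKS else LARGE_MODEL


print(select_model("tagging"), select_model("contract_analysis"))
```

Defaulting unknown task types to the larger tier is a conservative choice; a periodic review of which tasks land there is what keeps the routing table current.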

Always-On PTUs for Seasonal or Cyclical Azure OpenAI Workloads
AI
Cloud Provider
Azure
Service Name
Azure Cognitive Services
Inefficiency Type
Unnecessary Continuous Provisioning

Many Azure OpenAI workloads (such as reporting pipelines, marketing workflows, batch inference jobs, or time-bound customer interactions) run only during specific periods. When Provisioned Throughput Units (PTUs) remain fully provisioned 24/7, organizations incur continuous fixed cost even during extended idle time. Although Azure does not offer native PTU scheduling, teams can use automation to provision and deprovision PTUs based on predictable cycles. This retains performance during peak windows while reducing cost during low-activity periods.
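
The scheduling decision itself can be a small pure function that an automation job calls before provisioning or deprovisioning. The sketch below covers only that decision; the peak windows are hypothetical, and the actual provisioning calls (and whether capacity is available when re-provisioning) are outside its scope.

```python
from datetime import datetime

# Hypothetical peak windows: (weekday, start hour, end hour), Monday = 0.
PEAK_WINDOWS = [
    (0, 6, 12),   # Monday reporting run
    (4, 14, 20),  # Friday campaign batch
]


def ptu_should_be_provisioned(now: datetime, windows=PEAK_WINDOWS) -> bool:
    """True when `now` falls inside a configured peak window."""
    return any(
        now.weekday() == day and start <= now.hour < end
        for day, start, end in windows
    )


# 2024-01-01 was a Monday.
print(ptu_should_be_provisioned(datetime(2024, 1, 1, 7)))
print(ptu_should_be_provisioned(datetime(2024, 1, 2, 7)))
```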

Non-Production Azure OpenAI Deployments Using PTUs Instead of PAYG
AI
Cloud Provider
Azure
Service Name
Azure Cognitive Services
Inefficiency Type
Misaligned Pricing Model

Development, testing, QA, and sandbox environments rarely have the steady, predictable traffic patterns needed to justify Provisioned Throughput Unit (PTU) deployments. These workloads often run intermittently, with lower throughput and shorter usage windows. When PTUs are assigned to such environments, the fixed hourly billing generates continuous cost with little utilization. Switching non-production workloads to pay-as-you-go (PAYG) aligns cost with actual usage and eliminates the overhead of managing PTU quota in low-stakes environments.
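
A back-of-the-envelope comparison shows why intermittent environments tend to favor usage-based pricing. Every number below (active hours, token volume, PAYG price, and PTU hourly cost) is hypothetical, and the sketch assumes a flat per-million-token PAYG price and a PTU deployment billed for every hour of the month.

```python
def compare_pricing(active_hours: float, tokens_per_active_hour: int,
                    payg_price_per_million: float, ptu_hourly_cost: float,
                    hours_in_month: int = 730):
    """Return (cheaper option, PAYG monthly cost, PTU monthly cost)."""
    payg = active_hours * tokens_per_active_hour * payg_price_per_million / 1_000_000
    ptu = ptu_hourly_cost * hours_in_month  # billed whether or not traffic arrives
    return ("PAYG" if payg <= ptu else "PTU"), payg, ptu


# Hypothetical test environment: 40 active hours per month.
choice, payg_cost, ptu_cost = compare_pricing(
    active_hours=40,
    tokens_per_active_hour=2_000_000,
    payg_price_per_million=5.0,  # assumed price
    ptu_hourly_cost=50.0,        # assumed price
)
print(choice, payg_cost, ptu_cost)
```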
