Storing raw JSON or CSV files in S3—especially when written frequently in small batches—leads to excessive scan costs in Athena. These formats are row-based and verbose, requiring Athena to scan and parse the full content even when only a few fields are queried. Without columnar formats, partitioning, or metadata-aware table formats, queries become inefficient and expensive, especially in high-volume environments.
An ELB with only one registered EC2 instance does not achieve its core purpose—distributing traffic across multiple backends. In this configuration, the ELB adds complexity and cost without improving availability, scalability, or fault tolerance. This setup is often the result of premature scaling design or misunderstood architecture patterns. If there's no plan to horizontally scale the application, the ELB can often be removed entirely without user impact.
EBS volumes often remain significantly overprovisioned compared to the actual data stored on them. Because billing is based on the total provisioned capacity—not actual usage—this creates ongoing waste when large volumes are only partially used. Overprovisioning may result from default sizing in templates, misestimated requirements, or conservative provisioning practices. Identifying and remediating these cases can lead to meaningful storage cost reductions without impacting workload performance.
Photon is enabled by default on many Databricks compute configurations. While it can accelerate certain SQL and DataFrame operations, its performance benefits are workload-specific and may not justify the increased DBU cost. Many pipelines, particularly ETL jobs or simpler Spark workloads, do not benefit materially from Photon but still incur the higher DBU multiplier. Disabling Photon by default and allowing it only where proven beneficial can reduce cost without degrading performance.
When fleet auto scaling policies maintain more active instances than are required to support current usage—particularly during off-peak hours—organizations incur unnecessary compute costs. Fleets often remain oversized due to conservative default configurations or lack of schedule-based scaling. Tuning the scaling policies to better reflect usage patterns ensures that streaming infrastructure aligns with actual demand.
AppStream fleets often default to instance types designed for worst-case or peak usage scenarios, even when average workloads are significantly lighter. This leads to consistently low utilization of CPU, memory, or GPU resources and inflated infrastructure costs. By right-sizing AppStream instances based on actual workload needs, organizations can reduce spend without compromising user experience.
When AppStream builder instances are left running but unused, they continue to generate compute charges without delivering any value. These instances are commonly left active after configuration or image creation is completed but can be safely stopped or terminated when not in use. Identifying and decommissioning inactive builders helps reduce unnecessary compute costs.
EBS Snapshot Archive is a lower-cost storage tier for rarely accessed snapshots retained for compliance, regulatory, or long-term backup purposes. Archiving snapshots that do not require frequent or fast retrieval can reduce snapshot storage costs by up to 75%. Despite this, many organizations retain large volumes of snapshots in the standard tier long after their operational value has expired.
When reservations are scoped only to a single subscription, any unused capacity cannot be applied to matching resources in other subscriptions within the same tenant. This leads to underutilization of the committed reservation and continued on-demand charges in other parts of the organization. Enabling **Shared scope** allows all eligible subscriptions to consume the reservation benefit, improving utilization and reducing overall spend. This is particularly impactful in environments with decentralized provisioning, such as across dev/test/prod subscriptions or multiple business units.
When Kubernetes workloads request more CPU and memory than they actually consume, nodes must reserve capacity that remains unused. This leads to lower node density, forcing the cluster to maintain more instances than necessary. Aligning resource requests with observed utilization improves cluster efficiency and reduces compute spend without sacrificing application performance.