Many environments continue to run io1 volumes for high-performance workloads due to legacy provisioning or a lack of awareness of io2's benefits. io2 volumes provide equivalent or better performance with higher durability (designed for 99.999%, versus 99.8-99.9% for io1) and reduced cost at scale. Failing to adopt io2 where appropriate results in unnecessary spend on IOPS-heavy volumes.
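A quick way to surface the remaining io1 volumes is a small boto3 sweep. The sketch below, assuming credentials and region are already configured, lists them and shows where an in-place Elastic Volumes type change would go; the modification is left commented out so the decision stays with the operator:

```python
import boto3

ec2 = boto3.client("ec2")

# Enumerate provisioned-IOPS volumes that are still on io1.
paginator = ec2.get_paginator("describe_volumes")
for page in paginator.paginate(Filters=[{"Name": "volume-type", "Values": ["io1"]}]):
    for vol in page["Volumes"]:
        print(f"{vol['VolumeId']}: {vol['Iops']} provisioned IOPS on io1")
        # Elastic Volumes allows changing the type in place, without detaching:
        # ec2.modify_volume(VolumeId=vol["VolumeId"], VolumeType="io2")
```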
When large numbers of objects are deleted from S3, such as during cleanup or lifecycle transitions, CloudTrail logs every individual `DeleteObject` call if object-level data event logging is enabled for the bucket. Deleting millions of objects under this configuration can produce a significant, unexpected spike in CloudTrail charges, sometimes exceeding the cost of the underlying S3 operations themselves. The inefficiency typically arises when teams initiate bulk deletions for cleanup or cost savings without realizing that data event logging will record each call.
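Before kicking off a bulk deletion, it is worth confirming whether any trail captures S3 object-level data events for the affected buckets. A rough boto3 check might look like the sketch below; it only inspects classic event selectors, so trails configured with advanced event selectors would need a similar pass over their field selectors:

```python
import boto3

cloudtrail = boto3.client("cloudtrail")

# Report which trails log S3 object-level data events, and for which buckets.
for trail in cloudtrail.describe_trails()["trailList"]:
    arn = trail["TrailARN"]
    selectors = cloudtrail.get_event_selectors(TrailName=arn)
    for selector in selectors.get("EventSelectors", []):
        for resource in selector.get("DataResources", []):
            if resource.get("Type") == "AWS::S3::Object":
                # A value of "arn:aws:s3" means every bucket is covered.
                print(f"{arn} logs S3 data events for: {resource.get('Values')}")
```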
Non-production OpenSearch domains often inherit Multi-AZ configurations from production setups without clear justification. This leads to redundant replica shards across AZs, inflating both compute and storage costs. Unless strict uptime or fault tolerance requirements exist, most dev/test workloads do not benefit from Multi-AZ redundancy.
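One way to spot candidates is to list domains with zone awareness enabled and review whether each one genuinely needs cross-AZ replicas; a minimal boto3 sketch, assuming read-only permissions on the domains:

```python
import boto3

opensearch = boto3.client("opensearch")

# Flag OpenSearch domains running with zone awareness (Multi-AZ) enabled.
for item in opensearch.list_domain_names()["DomainNames"]:
    domain = item["DomainName"]
    cluster = opensearch.describe_domain(DomainName=domain)["DomainStatus"]["ClusterConfig"]
    if cluster.get("ZoneAwarenessEnabled"):
        az_count = cluster.get("ZoneAwarenessConfig", {}).get("AvailabilityZoneCount")
        print(f"{domain}: zone awareness enabled across {az_count} AZs")
```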
Recursive invocation occurs when a Lambda function triggers itself directly or indirectly, often through an intermediary such as SQS, SNS, or another Lambda function. The loop can be unintentional, for example when the function writes output to a queue it also consumes. Without controls, this can lead to runaway invocations, dramatically increasing cost with no business value.
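One illustrative safeguard is a hop counter carried as a message attribute, so the function stops republishing after a bounded number of passes; Lambda's built-in recursive loop detection and keeping input and output queues strictly separate are complementary controls. In the sketch below, the queue URL, the `hop_count` attribute, and the `process` helper are all placeholders:

```python
import json
import boto3

sqs = boto3.client("sqs")

OUTPUT_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/output-queue"  # placeholder
MAX_HOPS = 3  # stop republishing once a message has passed through this many times


def handler(event, context):
    for record in event.get("Records", []):
        payload = json.loads(record["body"])
        hops = int(record.get("messageAttributes", {})
                         .get("hop_count", {})
                         .get("stringValue", "0"))

        if hops >= MAX_HOPS:
            # Break the loop: drop (or quarantine) instead of writing back to a
            # queue this function also consumes.
            print(f"Dropping message after {hops} hops: {payload}")
            continue

        result = process(payload)  # placeholder for the real business logic
        sqs.send_message(
            QueueUrl=OUTPUT_QUEUE_URL,
            MessageBody=json.dumps(result),
            MessageAttributes={
                "hop_count": {"DataType": "Number", "StringValue": str(hops + 1)}
            },
        )


def process(payload):
    return payload
```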
In non-production environments, enabling Multi-AZ Redis clusters introduces redundant replicas that may not deliver meaningful business value. These replicas are kept in sync across Availability Zones, incurring both compute and inter-AZ data transfer costs. For development or test clusters that can tolerate occasional downtime or data loss, a single-AZ deployment is typically sufficient and significantly less expensive.
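Candidates are straightforward to find because the Multi-AZ flag is exposed on each replication group; a minimal boto3 sketch:

```python
import boto3

elasticache = boto3.client("elasticache")

# List Redis replication groups with Multi-AZ enabled for review.
paginator = elasticache.get_paginator("describe_replication_groups")
for page in paginator.paginate():
    for group in page["ReplicationGroups"]:
        if group.get("MultiAZ") == "enabled":
            print(f"{group['ReplicationGroupId']}: Multi-AZ enabled, "
                  f"{len(group.get('MemberClusters', []))} member nodes")
```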
RDS Multi-AZ deployments are designed for production-grade fault tolerance. In non-production environments, this configuration doubles the cost of database instances and storage with little added value. Unless explicitly required for high-availability testing, Multi-AZ in dev, staging, or test environments typically results in avoidable expense.
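A small audit script can flag Multi-AZ instances in environments tagged as non-production; the `environment` tag key and its values below are assumptions about the tagging scheme, and the actual conversion is left commented out:

```python
import boto3

rds = boto3.client("rds")

NON_PROD = {"dev", "test", "staging"}  # assumed tag values

paginator = rds.get_paginator("describe_db_instances")
for page in paginator.paginate():
    for db in page["DBInstances"]:
        tags = {t["Key"]: t["Value"] for t in db.get("TagList", [])}
        if db.get("MultiAZ") and tags.get("environment") in NON_PROD:
            print(f"{db['DBInstanceIdentifier']}: Multi-AZ in {tags.get('environment')}")
            # Converting to Single-AZ is an in-place modification:
            # rds.modify_db_instance(
            #     DBInstanceIdentifier=db["DBInstanceIdentifier"],
            #     MultiAZ=False,
            #     ApplyImmediately=False,  # defer to the next maintenance window
            # )
```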
Multi-AZ deployment is often essential for production workloads, but its use in non-production environments (e.g., development, test, QA) offers minimal value. These environments typically do not require high availability, yet still incur the full cost of redundant compute, storage, and data transfer. This results in unnecessary spend without operational benefit.
When a Lambda function processes messages from an SQS queue but fails to handle certain messages properly, the same messages may be returned to the queue and retried repeatedly. In some cases, especially if the Lambda is also writing messages back to the same or a chained queue, this can create a recursive invocation loop. The result is high invocation counts, prolonged execution time, and unnecessary cost, particularly if retries continue without a termination strategy such as a dead-letter queue with a bounded receive count.
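A common termination strategy is a redrive policy that moves repeatedly failing messages to a dead-letter queue after a bounded number of receives; a minimal boto3 sketch, with placeholder queue URL and DLQ ARN:

```python
import json
import boto3

sqs = boto3.client("sqs")

SOURCE_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/work-queue"  # placeholder
DLQ_ARN = "arn:aws:sqs:us-east-1:123456789012:work-queue-dlq"  # placeholder

# After 5 failed receives, SQS moves the message to the DLQ instead of
# redelivering it to the consumer indefinitely.
sqs.set_queue_attributes(
    QueueUrl=SOURCE_QUEUE_URL,
    Attributes={
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": DLQ_ARN,
            "maxReceiveCount": "5",
        })
    },
)
```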
RDS reader nodes are intended to handle read-only workloads, offloading traffic from the primary (writer) node. In many environments, however, services are misconfigured or hardcoded to send all traffic, including reads, to the writer node. The reader nodes sit underutilized while still incurring full hourly charges, and the writer node becomes a performance bottleneck that may require upsizing to handle the unnecessary load. This reduces both cost-effectiveness and resilience, especially in high-throughput or scalable architectures.
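For Aurora clusters, the fix is usually configuration rather than code: the cluster already exposes a load-balanced reader endpoint alongside the writer endpoint, and read-only connection pools should point at it. A short boto3 sketch (the cluster identifier is a placeholder) resolves the two:

```python
import boto3

rds = boto3.client("rds")

cluster = rds.describe_db_clusters(DBClusterIdentifier="my-aurora-cluster")["DBClusters"][0]

writer_host = cluster["Endpoint"]        # writes and read-after-write traffic
reader_host = cluster["ReaderEndpoint"]  # load-balanced across reader instances

print(f"Route writes to {writer_host}")
print(f"Route read-only queries to {reader_host}")
```

Applications would typically feed these two hostnames into separate write and read connection pools so reads stop landing on the writer by default.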
Provisioned load balancers continue to generate costs even when they are no longer serving meaningful traffic. This often occurs when applications are decommissioned, testing infrastructure is left behind, or backend services are removed without deleting the associated frontend configurations. Without ingress or egress traffic, these load balancers offer no functional value but still consume billable resources, including forwarding rules and reserved external IPs.
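Idleness can be confirmed from traffic metrics before anything is deleted. The sketch below checks an ALB's `RequestCount` over the last two weeks via CloudWatch (the dimension value is a placeholder); equivalent request or flow metrics exist for other load balancer types and providers:

```python
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch")

# For an ALB, the dimension value is the ARN suffix after "loadbalancer/".
LB_DIMENSION = "app/my-alb/50dc6c495c0c9188"  # placeholder

end = datetime.now(timezone.utc)
start = end - timedelta(days=14)

stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/ApplicationELB",
    MetricName="RequestCount",
    Dimensions=[{"Name": "LoadBalancer", "Value": LB_DIMENSION}],
    StartTime=start,
    EndTime=end,
    Period=86400,          # one datapoint per day
    Statistics=["Sum"],
)

total = sum(point["Sum"] for point in stats["Datapoints"])
print(f"Requests over the last 14 days: {total:.0f}")
if total == 0:
    print("No traffic observed; candidate for decommissioning.")
```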