Many organizations retain all logs in Cloud Logging’s standard storage, even when the data is rarely queried or required only for audit or compliance. Logging buckets are priced for active access and are not optimized for low-frequency retrieval, which results in unnecessary expense. Redirecting logs to BigQuery or Cloud Storage can provide better cost efficiency, particularly when coupled with lifecycle policies or table partitioning. Choosing the storage destination based on access frequency and analytics needs is essential to controlling log retention costs.
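A minimal sketch of this routing, assuming the google-cloud-logging and google-cloud-storage Python clients; the project, bucket, and sink names and the filter are placeholders to adapt:

```python
# Sketch: route low-access audit logs to a Cloud Storage bucket and age them
# out with lifecycle rules. Project, bucket, and sink names are hypothetical.
from google.cloud import logging_v2, storage

PROJECT = "my-project"                 # hypothetical project ID
BUCKET = "my-project-log-archive"      # hypothetical GCS bucket

# 1. Create a sink that exports audit/compliance logs to Cloud Storage.
logging_client = logging_v2.Client(project=PROJECT)
sink = logging_client.sink(
    "archive-audit-logs",
    filter_='logName:"cloudaudit.googleapis.com"',
    destination=f"storage.googleapis.com/{BUCKET}",
)
if not sink.exists():
    sink.create(unique_writer_identity=True)
    # Grant sink.writer_identity roles/storage.objectCreator on the bucket.

# 2. Add lifecycle rules so archived logs get cheaper over time, then expire.
storage_client = storage.Client(project=PROJECT)
bucket = storage_client.get_bucket(BUCKET)
bucket.add_lifecycle_set_storage_class_rule("COLDLINE", age=30)
bucket.add_lifecycle_delete_rule(age=365)
bucket.patch()
```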
Some GCP services and workloads generate INFO-level logs at very high frequencies — for example, load balancers logging every HTTP request or GKE nodes logging system health messages. While valuable for debugging, these logs can flood Cloud Logging with non-critical data. Without log-level tuning or exclusion filters, organizations incur continuous ingestion charges for messages that are seldom analyzed. Over time, this behavior compounds into a persistent waste driver across large-scale environments.
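One way to implement such an exclusion filter, sketched with the google-cloud-logging config client; the project ID, exclusion name, and filter are illustrative assumptions rather than a prescribed configuration:

```python
# Sketch: add a log exclusion so high-volume INFO entries (e.g. HTTP load
# balancer request logs) are dropped before ingestion. Adjust the filter to
# whichever resource types dominate your log volume.
from google.cloud.logging_v2.services.config_service_v2 import ConfigServiceV2Client
from google.cloud.logging_v2.types import LogExclusion

client = ConfigServiceV2Client()
exclusion = LogExclusion(
    name="exclude-lb-info",
    description="Drop INFO-level HTTP LB request logs",
    filter='resource.type="http_load_balancer" AND severity<=INFO',
)
client.create_exclusion(parent="projects/my-project", exclusion=exclusion)
```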
Non-production environments frequently generate INFO-level logs that capture expected system behavior or routine API calls. While useful for troubleshooting in development, they rarely need to be retained. Allowing all INFO logs to be ingested and stored in Logging buckets across dev or staging environments can lead to disproportionate ingestion and storage costs. This inefficiency often persists because log routing and severity filters are not differentiated between production and non-production projects.
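A sketch of differentiating severity filters per environment by tightening the `_Default` sink in hypothetical dev and staging projects; this is one approach, and the existing filter should be reviewed before it is modified:

```python
# Sketch: in dev/staging projects, narrow the _Default sink so only
# WARNING-and-above entries are routed to the Logging bucket.
from google.cloud import logging_v2

NON_PROD_PROJECTS = ["my-app-dev", "my-app-staging"]  # hypothetical IDs

for project in NON_PROD_PROJECTS:
    client = logging_v2.Client(project=project)
    sink = client.sink("_Default")
    sink.reload()                                   # fetch current filter
    base = f"({sink.filter_}) AND " if sink.filter_ else ""
    sink.filter_ = base + "severity>=WARNING"       # drop DEBUG/INFO/NOTICE
    sink.update()
```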
Duplicate log storage occurs when multiple sinks capture the same log data — for example, organization-wide sinks exporting all logs to Cloud Storage and project-level sinks doing the same. This redundancy results in paying twice (or more) for identical data. It often arises from decentralized logging configurations, inherited policies, or unclear ownership between teams. The problem is compounded when logs are routed both to Cloud Logging and external observability platforms, creating parallel ingestion streams and double billing.
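A rough way to surface such duplicates is to compare sink destinations and filters across projects. The project IDs below are placeholders, and organization- or folder-level sinks would still need a separate check:

```python
# Sketch: list sinks across a set of projects and flag ones that write the
# same filter to the same destination, which may indicate double export.
from collections import defaultdict
from google.cloud import logging_v2

PROJECTS = ["platform-prod", "team-a-prod", "team-b-prod"]  # hypothetical

seen = defaultdict(list)
for project in PROJECTS:
    client = logging_v2.Client(project=project)
    for sink in client.list_sinks():
        key = (sink.destination, sink.filter_ or "")
        seen[key].append(f"{project}/{sink.name}")

for (destination, _filter), sinks in seen.items():
    if len(sinks) > 1:
        print(f"Possible duplicate export to {destination!r}: {sinks}")
```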
Spot Instances are designed to be short-lived, with frequent interruptions and replacements. When AWS Config continuously records every lifecycle change for these instances, it produces a large number of configuration item records (CIRs). This drives costs significantly higher without delivering meaningful compliance insight, since Spot Instances are typically stateless and non-critical. In environments with heavy Spot usage, Config costs can balloon and exceed the value of tracking these transient resources.
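Where Spot churn dominates, one option is a recording-mode override that records EC2 instances daily rather than continuously. The sketch below assumes boto3 and the AWS Config periodic recording (recordingMode) capability, and note that Config cannot target Spot capacity specifically, so the override applies to all EC2 instances in the account and region:

```python
# Sketch: lower the recording frequency for EC2 instances (the resource type
# Spot churn shows up under) to DAILY via a recording-mode override.
import boto3

config = boto3.client("config")

recorder = config.describe_configuration_recorders()["ConfigurationRecorders"][0]
recorder["recordingMode"] = {
    "recordingFrequency": "CONTINUOUS",   # keep other resource types continuous
    "recordingModeOverrides": [
        {
            "description": "Spot-heavy fleet: record EC2 instances daily",
            "resourceTypes": ["AWS::EC2::Instance"],
            "recordingFrequency": "DAILY",
        }
    ],
}
config.put_configuration_recorder(ConfigurationRecorder=recorder)
```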
By default, AWS Config is enabled in continuous recording mode. While this may be justified for production workloads where detailed auditability is critical, it is rarely necessary in non-production environments. Frequent changes in development or testing environments — such as redeploying Lambda functions, ECS tasks, or EC2 instances — generate large volumes of CIRs. This results in disproportionately high costs with minimal benefit to governance or compliance. Switching non-production environments to daily recording reduces CIR volume significantly while maintaining sufficient visibility for tracking changes.
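A sketch of that switch with boto3, reusing the existing recorder definition so the role and recording group are preserved; it assumes the recorder in the non-production account supports the recordingMode setting:

```python
# Sketch: switch a non-production account's Config recorder from continuous
# to daily recording, keeping the rest of the recorder definition intact.
import boto3

config = boto3.client("config")

recorder = config.describe_configuration_recorders()["ConfigurationRecorders"][0]
recorder["recordingMode"] = {"recordingFrequency": "DAILY"}
config.put_configuration_recorder(ConfigurationRecorder=recorder)
```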
Many organizations keep Datadog’s default log retention settings without evaluating business requirements. Defaults may extend retention far beyond what is useful for troubleshooting, performance monitoring, or compliance. This leads to unnecessary storage and indexing costs, particularly in non-production environments or for logs with limited value after a short period. By adjusting retention per project, environment, or service, organizations can reduce spend while still meeting compliance and operational needs.
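For example, retention on a hypothetical non-production index could be shortened through the Datadog Logs Indexes API; the index name, query, and 3-day value below are assumptions to adapt to your own plan and compliance requirements:

```python
# Sketch: shorten retention on a non-production log index via the Datadog
# Logs Indexes API (PUT /api/v1/logs/config/indexes/{name}).
import os
import requests

DD_SITE = "https://api.datadoghq.com"
headers = {
    "DD-API-KEY": os.environ["DD_API_KEY"],
    "DD-APPLICATION-KEY": os.environ["DD_APP_KEY"],
    "Content-Type": "application/json",
}

resp = requests.put(
    f"{DD_SITE}/api/v1/logs/config/indexes/non-prod",   # hypothetical index
    headers=headers,
    json={
        "filter": {"query": "env:(dev OR staging)"},
        "num_retention_days": 3,
    },
    timeout=30,
)
resp.raise_for_status()
```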
AWS Graviton processors are designed to deliver better price-performance than comparable Intel-based instances, often reducing cost by 20–30% at equivalent workload performance. OpenSearch domains running on older Intel-based instance families therefore cost more without providing additional capability. Because Graviton-based instance types support the same OpenSearch features and deliver equal or better performance, continuing to run on Intel-based clusters represents an unnecessary inefficiency.
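A sketch of the migration with boto3, using a placeholder domain name and example Graviton instance types; changing instance types triggers a blue/green deployment on the domain:

```python
# Sketch: move an OpenSearch domain's data and dedicated master nodes from
# Intel-based to equivalent Graviton instance types.
import boto3

opensearch = boto3.client("opensearch")

opensearch.update_domain_config(
    DomainName="search-logs-prod",                   # hypothetical domain
    ClusterConfig={
        "InstanceType": "r6g.large.search",          # was r5.large.search
        "InstanceCount": 3,
        "DedicatedMasterEnabled": True,
        "DedicatedMasterType": "m6g.large.search",   # was m5.large.search
        "DedicatedMasterCount": 3,
    },
)
```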
When multiple tasks within a workflow are executed on separate job clusters — despite having similar compute requirements — organizations incur unnecessary overhead. Each cluster must initialize independently, adding latency and cost. This results in inefficient resource usage, especially for workflows that could reuse the same cluster across tasks. Consolidating tasks onto a single job cluster where feasible reduces start-up time and avoids duplicative compute charges.
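A sketch of a Databricks Jobs API 2.1 payload that defines one shared job cluster and points both tasks at it via job_cluster_key; the workspace URL, notebook paths, and cluster sizing are placeholders:

```python
# Sketch: one shared job cluster reused by every task in the workflow,
# instead of a separate new_cluster per task.
import os
import requests

HOST = "https://my-workspace.cloud.databricks.com"   # hypothetical workspace
headers = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

job_spec = {
    "name": "nightly-etl",
    "job_clusters": [
        {
            "job_cluster_key": "shared_etl_cluster",
            "new_cluster": {
                "spark_version": "14.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 4,
            },
        }
    ],
    "tasks": [
        {
            "task_key": "ingest",
            "job_cluster_key": "shared_etl_cluster",
            "notebook_task": {"notebook_path": "/ETL/ingest"},
        },
        {
            "task_key": "transform",
            "depends_on": [{"task_key": "ingest"}],
            "job_cluster_key": "shared_etl_cluster",
            "notebook_task": {"notebook_path": "/ETL/transform"},
        },
    ],
}

resp = requests.post(f"{HOST}/api/2.1/jobs/create", headers=headers,
                     json=job_spec, timeout=30)
resp.raise_for_status()
print(resp.json())  # {"job_id": ...}
```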
Changing a Google Cloud billing account can unintentionally break existing Marketplace subscriptions. If entitlements are tied to the original billing account, the subscription may fail or become invalid, prompting teams to make urgent, direct purchases of the same services, often at higher list or on-demand rates. These emergency purchases bypass previously negotiated Marketplace pricing and can result in significantly higher short-term costs. The issue is common during reorganizations, mergers, or changes to billing hierarchy and is often not discovered until after costs have spiked.