S3 Standard - Infrequent Access Used Where Intelligent Tiering Would Be Cheaper
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Suboptimal Pricing Model

Organizations often use the Standard-Infrequent Access (Standard-IA) storage class based on documentation and code that predate the 2021 updates to the Intelligent Tiering storage class. Those updates made Intelligent Tiering suitable as an initial S3 storage class even for objects that are small or will be deleted early, and added a heavily discounted access tier. Older internal runbooks, lifecycle policies (including ones specified in infrastructure-as-code templates), scripts, programs, and public examples may still default to Standard-IA, inflating storage costs.


This inefficiency report compares Standard-IA with Intelligent Tiering. It is not intended to cover other storage classes. S3 storage is billed per gibibyte or GiB (powers of 2) rather than per gigabyte or GB (powers of 10), which matters for small objects and also for large volumes of storage.


Relative to the Standard storage class, the Standard-IA storage class offers a moderate, constant storage price discount but imposes a minimum billable object size of 128 KiB, a minimum storage duration of 30 days, and a per-GiB retrieval charge.


In contrast, AWS updated the Intelligent Tiering storage class in September 2021, eliminating the minimum storage duration and exempting small objects from the monthly per-object monitoring and automation charge. Intelligent Tiering has never had retrieval charges. In November 2021, AWS added the heavily discounted Archive Instant Access tier.


For objects stored beyond a few months, Intelligent Tiering's progressive storage price discounts surpass Standard-IA's constant discount. Storage savings accumulate each month. Objects in the Intelligent Tiering storage class automatically move through progressively cheaper access tiers unless the objects are accessed. Intelligent Tiering also avoids Standard-IA's minimum billable object size and minimum storage duration penalties.
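As a rough illustration of the break-even dynamic, the sketch below compares cumulative storage cost for a rarely accessed object under the two classes. All per-GiB-month prices and tier-transition timings are assumptions for the sketch, not current AWS list prices, and the per-object monitoring charge is omitted for simplicity.

```python
# Illustrative cumulative storage cost: Standard-IA vs Intelligent Tiering for
# an object that is never accessed again. Prices are assumptions, not quotes.

IA_PRICE = 0.0125            # assumed Standard-IA $/GiB-month
IT_FREQUENT = 0.023          # assumed Frequent Access tier $/GiB-month
IT_INFREQUENT = 0.0125       # assumed Infrequent Access tier (after ~30 days)
IT_ARCHIVE_INSTANT = 0.004   # assumed Archive Instant Access tier (after ~90 days)

def intelligent_tiering_cost(size_gib: float, months: int) -> float:
    """Cumulative cost if the object is never accessed after upload."""
    total = 0.0
    for month in range(months):
        if month < 1:
            rate = IT_FREQUENT       # first ~30 days
        elif month < 3:
            rate = IT_INFREQUENT     # days ~30-90
        else:
            rate = IT_ARCHIVE_INSTANT
        total += size_gib * rate
    return total

def standard_ia_cost(size_gib: float, months: int) -> float:
    """Cumulative Standard-IA cost; billed size has a 128 KiB floor."""
    billed_gib = max(size_gib, 128 / (1024 * 1024))
    return billed_gib * IA_PRICE * months

# After a year, the progressive discounts win for a rarely accessed 10 GiB object:
ia = standard_ia_cost(10, 12)
it = intelligent_tiering_cost(10, 12)
```

Under these assumed prices, the Intelligent Tiering total is roughly half the Standard-IA total after twelve months, and the gap widens every month the object sits untouched.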

Orphaned Cloud Storage from Dropped External Delta Tables in Databricks
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Unused Resource

When external Delta tables are dropped from Databricks Unity Catalog or the legacy Hive metastore, only the table metadata is removed — the underlying data files in cloud object storage (such as S3, ADLS, or GCS) remain untouched and continue to incur per-GB-month storage charges. This behavior is by design: external tables decouple metadata from data lifecycle management, meaning Databricks explicitly does not delete the underlying storage when an external table is dropped. The result is orphaned storage — files that no longer have any catalog reference, are not consumed by any downstream pipeline, and deliver no business value, yet continue to accumulate charges indefinitely.

This pattern is particularly prevalent in environments using medallion architecture (bronze/silver/gold layers), where tables are frequently recreated during pipeline evolution, schema experimentation, or migration between environments. Development and test workloads compound the problem, as teams routinely create and abandon external table references without cleaning up the associated storage. Unlike managed tables in Unity Catalog — which have a retention period with recovery capability before automatic deletion — external tables offer no such safety net. The orphaned storage is structurally invisible to standard cost dashboards because it appears as generic object storage charges, not as Databricks-specific line items. Over time, this silent accumulation can represent a meaningful share of an organization's total storage spend.

Importantly, Databricks VACUUM operations do not address this pattern. VACUUM cleans up old file versions within active Delta tables, but it cannot act on storage paths that have been completely disconnected from catalog metadata through external table drops. The only way to reclaim this storage is to manually identify and delete the orphaned files in cloud storage.
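Identifying the orphans amounts to diffing the locations still referenced by the catalog against the Delta paths present in storage. The sketch below is a hypothetical helper for that diff; gathering the two input sets (e.g. from Unity Catalog's information_schema and an S3 Inventory report) is left out, and all names are illustrative.

```python
# Hypothetical sketch: flag storage paths that no catalog table references,
# directly or as a parent prefix. Inputs would come from the catalog and from
# a storage listing (e.g. S3 Inventory); collecting them is omitted here.

def find_orphaned_paths(catalog_locations: set[str], storage_paths: set[str]) -> set[str]:
    """Return storage paths with no catalog reference."""
    def referenced(path: str) -> bool:
        return any(path == loc or path.startswith(loc.rstrip("/") + "/")
                   for loc in catalog_locations)
    return {p for p in storage_paths if not referenced(p)}

catalog = {"s3://lake/bronze/events", "s3://lake/silver/orders"}
storage = {"s3://lake/bronze/events", "s3://lake/silver/orders",
           "s3://lake/silver/orders_v2_backup"}  # dropped table's data lingers
orphans = find_orphaned_paths(catalog, storage)
```

In practice the candidate list should be reviewed before deletion, since a path absent from one catalog may still be read by another workspace or external consumer.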

Unexpired Non-Current Object Versions in S3
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Missing Lifecycle Policy

When S3 versioning is enabled but no lifecycle rules are defined for non-current objects, outdated versions accumulate indefinitely. These non-current versions are rarely accessed but continue to incur storage charges. Over time, this leads to significant hidden costs, particularly in buckets with frequent object updates or automated data pipelines. Proper lifecycle management is required to limit or expire obsolete versions.
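A minimal rule for this looks like the sketch below, shown as the dict boto3's put_bucket_lifecycle_configuration expects. The 30-day window, the retained-version count, and the bucket-wide filter are assumptions to adapt per bucket.

```python
# Sketch of a lifecycle rule expiring non-current object versions, in the shape
# accepted by boto3's put_bucket_lifecycle_configuration. The retention window
# and the empty (bucket-wide) filter are assumptions.

lifecycle_config = {
    "Rules": [
        {
            "ID": "expire-noncurrent-versions",
            "Status": "Enabled",
            "Filter": {},  # applies to the whole bucket
            "NoncurrentVersionExpiration": {
                "NoncurrentDays": 30,
                # Optionally keep a few recent versions for recovery:
                "NewerNoncurrentVersions": 3,
            },
        }
    ]
}

# To apply (requires credentials; not executed here):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-bucket", LifecycleConfiguration=lifecycle_config)
```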

Excessive KMS Charges from Missing S3 Bucket Key Configuration
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Misconfiguration

S3 buckets configured with SSE-KMS but without Bucket Keys generate a separate KMS request for each object operation. This behavior results in disproportionately high KMS request costs for data-intensive workloads such as analytics, backups, or frequently accessed objects. Bucket Keys allow S3 to cache KMS data keys at the bucket level, reducing the volume of KMS calls and cutting encryption costs—often with no impact on security or performance.
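The scale of the effect can be sketched with simple arithmetic. The per-request price and the cache-effectiveness ratio below are assumptions for illustration; actual reduction depends on access patterns.

```python
# Rough monthly KMS request cost with and without S3 Bucket Keys.
# Pricing and the assumed cache-hit ratio are illustrative, not AWS quotes.

KMS_PRICE_PER_10K = 0.03      # assumed $ per 10,000 KMS requests
BUCKET_KEY_REDUCTION = 0.99   # assume Bucket Keys eliminate ~99% of KMS calls

def kms_cost(object_operations: int, bucket_key: bool) -> float:
    requests = object_operations * ((1 - BUCKET_KEY_REDUCTION) if bucket_key else 1)
    return requests / 10_000 * KMS_PRICE_PER_10K

monthly_ops = 500_000_000  # e.g. a busy analytics bucket
without = kms_cost(monthly_ops, bucket_key=False)
with_key = kms_cost(monthly_ops, bucket_key=True)
```

Under these assumptions the bucket's KMS request bill drops from about $1,500 to about $15 per month, with no change to the encryption guarantees.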

Unused S3 Storage Lens Advanced
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Unused Resource

S3 Storage Lens Advanced provides valuable insights into storage usage and trends, but it incurs a recurring cost. Organizations often enable it during an optimization initiative but fail to turn it off afterwards. When no active storage efficiency efforts are underway, these advanced metrics can become an unnecessary expense, especially at large scale across many buckets.

Advanced metrics include:

Cost optimization recommendations

Data retrieval patterns

Encryption and access control analytics

Historical trends beyond the 30-day free tier

Because the configuration is global or account-level, it’s easy to forget that these additional metrics are enabled and quietly incurring cost. This inefficiency often surfaces in organizations that over-invest in observability tooling without aligning it to an ongoing operational workflow.
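One way to surface forgotten configurations is to pull each Storage Lens configuration and check whether any paid metric group is still enabled. The sketch below works on configuration dicts in the shape s3control's get_storage_lens_configuration returns; fetching them is omitted, and the field names should be verified against the current API.

```python
# Hypothetical sketch: flag Storage Lens configurations that still have paid
# advanced metric groups enabled. Input dicts mirror the shape returned by
# s3control get_storage_lens_configuration; retrieval is omitted here.

ADVANCED_METRIC_KEYS = (
    "ActivityMetrics",
    "AdvancedCostOptimizationMetrics",
    "AdvancedDataProtectionMetrics",
)

def has_advanced_metrics(config: dict) -> bool:
    account_level = config.get("AccountLevel", {})
    return any(account_level.get(key, {}).get("IsEnabled", False)
               for key in ADVANCED_METRIC_KEYS)

configs = [
    {"Id": "default", "AccountLevel": {"BucketLevel": {}}},
    {"Id": "finops-2023",
     "AccountLevel": {"AdvancedCostOptimizationMetrics": {"IsEnabled": True}}},
]
stale = [c["Id"] for c in configs if has_advanced_metrics(c)]
```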

Misaligned S3 Storage Tier Selection Based on Access Patterns
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Misconfigured Storage Tier

While moving objects to colder storage classes like Glacier or Infrequent Access (IA) can reduce storage costs, premature transitions without analyzing historical access patterns can lead to unintended expenses. Retrieval charges, restore time delays, and early delete penalties often go unaccounted for in simplistic tiering decisions. This inefficiency arises when teams default to colder tiers based solely on perceived “age” of data or storage savings—without confirming access frequency, restore time SLAs, or application requirements. Unlike inefficiencies focused on *underuse* of cold storage, this inefficiency reflects *overuse* or misalignment, resulting in higher total costs or operational friction.
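The underlying trade-off is a break-even between the storage discount and the retrieval charge. The sketch below compares Standard and Standard-IA for a dataset with a known monthly read rate; all prices are illustrative assumptions.

```python
# Break-even sketch: S3 Standard vs Standard-IA given how much of the dataset
# is retrieved each month. Prices are illustrative assumptions, not quotes.

STANDARD = 0.023     # assumed $/GB-month
IA_STORAGE = 0.0125  # assumed $/GB-month
IA_RETRIEVAL = 0.01  # assumed $/GB retrieved

def monthly_cost_standard(size_gb: float) -> float:
    return size_gb * STANDARD

def monthly_cost_ia(size_gb: float, fraction_retrieved_per_month: float) -> float:
    return size_gb * (IA_STORAGE + IA_RETRIEVAL * fraction_retrieved_per_month)

# Under these prices, the break-even retrieval rate is
# (0.023 - 0.0125) / 0.01 = 1.05 full reads per month: data read more often
# than that is cheaper left in Standard.
hot = monthly_cost_ia(100, 2.0)    # whole dataset read twice a month
cold = monthly_cost_ia(100, 0.5)   # half the dataset read per month
baseline = monthly_cost_standard(100)
```

The same structure applies to colder classes, where per-GB retrieval prices, restore latency, and minimum storage durations shift the break-even point further.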

Excessive CloudTrail Charges from Bulk S3 Deletes
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Misconfigured Logging

When large numbers of objects are deleted from S3—such as during cleanup or lifecycle transitions—CloudTrail can log every individual delete operation if data event logging is enabled. This is especially costly when deleting millions of objects from buckets configured with CloudTrail data event logging at the object level. The resulting volume of logs can cause a significant, unexpected spike in CloudTrail charges, sometimes exceeding the cost of the underlying S3 operations themselves. This inefficiency often occurs when teams initiate bulk deletions for cleanup or cost savings without realizing that CloudTrail logs every API call, including `DeleteObject`, if data event logging is active for the bucket.
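The magnitude is easy to estimate up front. The per-event rate below is an assumption for the sketch; check current CloudTrail pricing before relying on it.

```python
# Back-of-envelope cost of logging bulk deletes as CloudTrail data events.
# The $0.10 per 100,000 events rate is an assumption for the sketch.

DATA_EVENT_PRICE_PER_100K = 0.10

def cloudtrail_data_event_cost(num_events: int) -> float:
    return num_events / 100_000 * DATA_EVENT_PRICE_PER_100K

# Deleting 50 million objects one by one logs 50 million DeleteObject events:
cost = cloudtrail_data_event_cost(50_000_000)
```

Batching deletions with the DeleteObjects API (up to 1,000 keys per request) cuts the number of logged API calls by roughly three orders of magnitude, and lifecycle expiration avoids client-issued delete calls altogether; temporarily excluding the bucket from data event logging during a planned purge is another option.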

Missing S3 Gateway Endpoint for Intra-Region EC2 Access
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Inefficient Configuration

When EC2 instances within a VPC access Amazon S3 in the same region without a Gateway VPC Endpoint, traffic is routed through the public S3 endpoint and incurs standard internet egress charges — even though it remains within the AWS network. This results in unnecessary egress charges, as AWS treats this traffic as data transfer out to the internet, billed under the S3 service.

By contrast, provisioning a Gateway Endpoint for S3 allows traffic between EC2 and S3 to flow over the AWS private backbone at no additional cost. This configuration is especially important for data-intensive applications, such as analytics jobs, backups, or frequent uploads/downloads, where the cumulative data transfer can be substantial.

Because the egress cost is billed under S3, it is often misattributed or overlooked during EC2 or networking reviews, leading to silent overspend.
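The saving scales linearly with traffic volume, since Gateway Endpoints for S3 carry no charge of their own. The per-GB rate below is an assumption for the sketch; the actual rate paid on the public path depends on the routing setup (e.g. NAT gateway processing vs egress pricing).

```python
# Rough monthly saving from adding an S3 Gateway Endpoint. The endpoint itself
# is free; the per-GB rate previously paid on the public path is an assumption.

ASSUMED_PER_GB_RATE = 0.045  # $/GB, illustrative

def monthly_saving(gb_per_month: float) -> float:
    return gb_per_month * ASSUMED_PER_GB_RATE

saving = monthly_saving(20_000)  # 20 TB of EC2<->S3 traffic per month

# The fix itself is one API call (illustrative IDs; requires credentials):
# import boto3
# boto3.client("ec2").create_vpc_endpoint(
#     VpcEndpointType="Gateway",
#     ServiceName="com.amazonaws.us-east-1.s3",
#     VpcId="vpc-0abc...",
#     RouteTableIds=["rtb-0abc..."],
# )
```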

Suboptimal Lifecycle Policy for Small Files on an S3 Bucket
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Inefficient Configuration

This inefficiency occurs when small files are stored in S3 storage classes that impose a minimum object size charge, resulting in unnecessary costs. Small files under 128 KB stored in Glacier Instant Retrieval, Standard-IA, or One Zone-IA are billed as if they were 128 KB. If these small files are accessed frequently, S3 Standard may be a better fit. For infrequently accessed small files, transitioning them to archival storage classes like Glacier Flexible Retrieval or Deep Archive can optimize storage spend. Poorly tuned lifecycle policies often allow small files to remain in suboptimal storage classes indefinitely.
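The overhead grows with object count, as the sketch below shows for a bucket of many small objects. The Standard-IA price is an illustrative assumption.

```python
# Overhead from the 128 KB minimum billable object size in Standard-IA for a
# bucket of many small objects. The price is an illustrative assumption.

IA_PRICE_PER_GB = 0.0125
MIN_BILLABLE_KB = 128

def billed_gb(num_objects: int, actual_kb: float) -> float:
    """GB billed per month once the per-object minimum is applied."""
    return num_objects * max(actual_kb, MIN_BILLABLE_KB) / (1024 * 1024)

# 10 million 8 KB objects are billed as if each were 128 KB, i.e. 16x actual:
actual = 10_000_000 * 8 / (1024 * 1024)    # ~76 GB actually stored
billed = billed_gb(10_000_000, 8)           # ~1,220 GB billed
monthly_overhead = (billed - actual) * IA_PRICE_PER_GB
```

Aggregating small files into larger objects (or keeping them in S3 Standard) eliminates this multiplier entirely.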

Missing S3 Lifecycle Policy for Incomplete Multipart Uploads
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Inefficient Configuration

Multipart upload allows large files to be uploaded in segments. Each part is stored individually until the upload is finalized by a “CompleteMultipartUpload” request. If this final request is never issued—due to a timeout, crash, failed job, or misconfiguration—the parts remain stored but are effectively useless: they do not form a valid object and cannot be retrieved. Without a lifecycle policy in place to clean up these incomplete uploads, the orphaned parts persist and continue to incur storage charges indefinitely.
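The standard remedy is a lifecycle rule with an AbortIncompleteMultipartUpload action, sketched below in the dict shape boto3's put_bucket_lifecycle_configuration accepts. The 7-day window and bucket-wide filter are assumptions to adapt.

```python
# Sketch of a lifecycle rule aborting incomplete multipart uploads after 7 days,
# in the shape accepted by boto3's put_bucket_lifecycle_configuration.
# The 7-day window and the empty (bucket-wide) filter are assumptions.

lifecycle_config = {
    "Rules": [
        {
            "ID": "abort-incomplete-multipart-uploads",
            "Status": "Enabled",
            "Filter": {},
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        }
    ]
}

# To apply (requires credentials; not executed here):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-bucket", LifecycleConfiguration=lifecycle_config)
```

Existing in-flight uploads can also be inventoried with the ListMultipartUploads API before the rule takes effect.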
