S3 Standard - Infrequent Access Used Where Intelligent Tiering Would Be Cheaper
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Suboptimal Pricing Model

Organizations often use the Standard-Infrequent Access (Standard-IA) storage class based on documentation and code that predate the 2021 updates to the Intelligent Tiering storage class. Those updates made Intelligent Tiering suitable as an initial S3 storage class even for objects that are small or will be deleted early, and added a heavily discounted access tier. Older internal runbooks, lifecycle policies (including ones specified in infrastructure-as-code templates), scripts, programs, and public examples may still default to Standard-IA, inflating storage costs.


This inefficiency report compares Standard-IA with Intelligent Tiering. It is not intended to cover other storage classes. S3 storage is billed per gibibyte or GiB (powers of 2) rather than per gigabyte or GB (powers of 10), which matters for small objects and also for large volumes of storage.


Relative to the Standard storage class, the Standard-IA storage class offers a moderate, constant storage price discount but imposes a minimum billable object size of 128 KiB, a minimum storage duration of 30 days, and a per-GiB retrieval charge.


In contrast, AWS updated the Intelligent Tiering storage class in September 2021, eliminating the minimum storage duration and exempting small objects from the monthly per-object monitoring and automation charge. Intelligent Tiering has never had retrieval charges. In November 2021, AWS added the heavily discounted Archive Instant Access tier.


For objects stored beyond a few months, Intelligent Tiering's progressive storage price discounts surpass Standard-IA's constant discount. Storage savings accumulate each month. Objects in the Intelligent Tiering storage class automatically move through progressively cheaper access tiers unless the objects are accessed. Intelligent Tiering also avoids Standard-IA's minimum billable object size and minimum storage duration penalties.
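As a rough illustration of the break-even dynamic, the sketch below compares cumulative storage cost for a rarely accessed object under the two classes. All per-GiB-month prices and tier-transition timings are assumptions for the sketch, not current AWS list prices, and the per-object monitoring charge is omitted for simplicity.

```python
# Illustrative cumulative storage cost: Standard-IA vs Intelligent Tiering for
# an object that is never accessed again. Prices are assumptions, not quotes.

IA_PRICE = 0.0125            # assumed Standard-IA $/GiB-month
IT_FREQUENT = 0.023          # assumed Frequent Access tier $/GiB-month
IT_INFREQUENT = 0.0125       # assumed Infrequent Access tier (after ~30 days)
IT_ARCHIVE_INSTANT = 0.004   # assumed Archive Instant Access tier (after ~90 days)

def intelligent_tiering_cost(size_gib: float, months: int) -> float:
    """Cumulative cost if the object is never accessed after upload."""
    total = 0.0
    for month in range(months):
        if month < 1:
            rate = IT_FREQUENT       # first ~30 days
        elif month < 3:
            rate = IT_INFREQUENT     # days ~30-90
        else:
            rate = IT_ARCHIVE_INSTANT
        total += size_gib * rate
    return total

def standard_ia_cost(size_gib: float, months: int) -> float:
    """Cumulative Standard-IA cost; billed size has a 128 KiB floor."""
    billed_gib = max(size_gib, 128 / (1024 * 1024))
    return billed_gib * IA_PRICE * months

# After a year, the progressive discounts win for a rarely accessed 10 GiB object:
ia = standard_ia_cost(10, 12)
it = intelligent_tiering_cost(10, 12)
```

Under these assumed prices, the Intelligent Tiering total is roughly half the Standard-IA total after twelve months, and the gap widens every month the object sits untouched.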

Orphaned Cloud Storage from Dropped External Delta Tables in Databricks
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Unused Resource

When external Delta tables are dropped from Databricks Unity Catalog or the legacy Hive metastore, only the table metadata is removed — the underlying data files in cloud object storage (such as S3, ADLS, or GCS) remain untouched and continue to incur per-GB-month storage charges. This behavior is by design: external tables decouple metadata from data lifecycle management, meaning Databricks explicitly does not delete the underlying storage when an external table is dropped. The result is orphaned storage — files that no longer have any catalog reference, are not consumed by any downstream pipeline, and deliver no business value, yet continue to accumulate charges indefinitely.

This pattern is particularly prevalent in environments using medallion architecture (bronze/silver/gold layers), where tables are frequently recreated during pipeline evolution, schema experimentation, or migration between environments. Development and test workloads compound the problem, as teams routinely create and abandon external table references without cleaning up the associated storage. Unlike managed tables in Unity Catalog — which have a retention period with recovery capability before automatic deletion — external tables offer no such safety net. The orphaned storage is structurally invisible to standard cost dashboards because it appears as generic object storage charges, not as Databricks-specific line items. Over time, this silent accumulation can represent a meaningful share of an organization's total storage spend.

Importantly, Databricks VACUUM operations do not address this pattern. VACUUM cleans up old file versions within active Delta tables, but it cannot act on storage paths that have been completely disconnected from catalog metadata through external table drops. The only way to reclaim this storage is to manually identify and delete the orphaned files in cloud storage.
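Identifying the orphans amounts to diffing the locations still referenced by the catalog against the Delta paths present in storage. The sketch below is a hypothetical helper for that diff; gathering the two input sets (e.g. from Unity Catalog's information_schema and an S3 Inventory report) is left out, and all names are illustrative.

```python
# Hypothetical sketch: flag storage paths that no catalog table references,
# directly or as a parent prefix. Inputs would come from the catalog and from
# a storage listing (e.g. S3 Inventory); collecting them is omitted here.

def find_orphaned_paths(catalog_locations: set[str], storage_paths: set[str]) -> set[str]:
    """Return storage paths with no catalog reference."""
    def referenced(path: str) -> bool:
        return any(path == loc or path.startswith(loc.rstrip("/") + "/")
                   for loc in catalog_locations)
    return {p for p in storage_paths if not referenced(p)}

catalog = {"s3://lake/bronze/events", "s3://lake/silver/orders"}
storage = {"s3://lake/bronze/events", "s3://lake/silver/orders",
           "s3://lake/silver/orders_v2_backup"}  # dropped table's data lingers
orphans = find_orphaned_paths(catalog, storage)
```

In practice the candidate list should be reviewed before deletion, since a path absent from one catalog may still be read by another workspace or external consumer.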

Unexpired Non-Current Object Versions in S3
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Missing Lifecycle Policy

When S3 versioning is enabled but no lifecycle rules are defined for non-current objects, outdated versions accumulate indefinitely. These non-current versions are rarely accessed but continue to incur storage charges. Over time, this leads to significant hidden costs, particularly in buckets with frequent object updates or automated data pipelines. Proper lifecycle management is required to limit or expire obsolete versions.
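A minimal rule for this looks like the sketch below, shown as the dict boto3's put_bucket_lifecycle_configuration expects. The 30-day window, the retained-version count, and the bucket-wide filter are assumptions to adapt per bucket.

```python
# Sketch of a lifecycle rule expiring non-current object versions, in the shape
# accepted by boto3's put_bucket_lifecycle_configuration. The retention window
# and the empty (bucket-wide) filter are assumptions.

lifecycle_config = {
    "Rules": [
        {
            "ID": "expire-noncurrent-versions",
            "Status": "Enabled",
            "Filter": {},  # applies to the whole bucket
            "NoncurrentVersionExpiration": {
                "NoncurrentDays": 30,
                # Optionally keep a few recent versions for recovery:
                "NewerNoncurrentVersions": 3,
            },
        }
    ]
}

# To apply (requires credentials; not executed here):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-bucket", LifecycleConfiguration=lifecycle_config)
```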

Excessive KMS Charges from Missing S3 Bucket Key Configuration
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Misconfiguration

S3 buckets configured with SSE-KMS but without Bucket Keys generate a separate KMS request for each object operation. This behavior results in disproportionately high KMS request costs for data-intensive workloads such as analytics, backups, or frequently accessed objects. Bucket Keys allow S3 to cache KMS data keys at the bucket level, reducing the volume of KMS calls and cutting encryption costs—often with no impact on security or performance.
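The scale of the effect can be sketched with simple arithmetic. The per-request price and the cache-effectiveness ratio below are assumptions for illustration; actual reduction depends on access patterns.

```python
# Rough monthly KMS request cost with and without S3 Bucket Keys.
# Pricing and the assumed cache-hit ratio are illustrative, not AWS quotes.

KMS_PRICE_PER_10K = 0.03      # assumed $ per 10,000 KMS requests
BUCKET_KEY_REDUCTION = 0.99   # assume Bucket Keys eliminate ~99% of KMS calls

def kms_cost(object_operations: int, bucket_key: bool) -> float:
    requests = object_operations * ((1 - BUCKET_KEY_REDUCTION) if bucket_key else 1)
    return requests / 10_000 * KMS_PRICE_PER_10K

monthly_ops = 500_000_000  # e.g. a busy analytics bucket
without = kms_cost(monthly_ops, bucket_key=False)
with_key = kms_cost(monthly_ops, bucket_key=True)
```

Under these assumptions the bucket's KMS request bill drops from about $1,500 to about $15 per month, with no change to the encryption guarantees.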

Unused S3 Storage Lens Advanced
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Unused Resource

S3 Storage Lens Advanced provides valuable insights into storage usage and trends, but it incurs a recurring cost. Organizations often enable it during an optimization initiative but fail to turn it off afterwards. When no active storage efficiency efforts are underway, these advanced metrics can become an unnecessary expense, especially at large scale across many buckets.

Advanced metrics include:

Cost optimization recommendations

Data retrieval patterns

Encryption and access control analytics

Historical trends beyond the 30-day free tier

Because the configuration is global or account-level, it’s easy to forget that these additional metrics are enabled and quietly incurring cost. This inefficiency often surfaces in organizations that over-invest in observability tooling without aligning it to an ongoing operational workflow.
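One way to surface forgotten configurations is to pull each Storage Lens configuration and check whether any paid metric group is still enabled. The sketch below works on configuration dicts in the shape s3control's get_storage_lens_configuration returns; fetching them is omitted, and the field names should be verified against the current API.

```python
# Hypothetical sketch: flag Storage Lens configurations that still have paid
# advanced metric groups enabled. Input dicts mirror the shape returned by
# s3control get_storage_lens_configuration; retrieval is omitted here.

ADVANCED_METRIC_KEYS = (
    "ActivityMetrics",
    "AdvancedCostOptimizationMetrics",
    "AdvancedDataProtectionMetrics",
)

def has_advanced_metrics(config: dict) -> bool:
    account_level = config.get("AccountLevel", {})
    return any(account_level.get(key, {}).get("IsEnabled", False)
               for key in ADVANCED_METRIC_KEYS)

configs = [
    {"Id": "default", "AccountLevel": {"BucketLevel": {}}},
    {"Id": "finops-2023",
     "AccountLevel": {"AdvancedCostOptimizationMetrics": {"IsEnabled": True}}},
]
stale = [c["Id"] for c in configs if has_advanced_metrics(c)]
```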

Misaligned S3 Storage Tier Selection Based on Access Patterns
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Misconfigured Storage Tier

While moving objects to colder storage classes like Glacier or Infrequent Access (IA) can reduce storage costs, premature transitions without analyzing historical access patterns can lead to unintended expenses. Retrieval charges, restore time delays, and early delete penalties often go unaccounted for in simplistic tiering decisions. This inefficiency arises when teams default to colder tiers based solely on perceived “age” of data or storage savings—without confirming access frequency, restore time SLAs, or application requirements. Unlike inefficiencies focused on *underuse* of cold storage, this inefficiency reflects *overuse* or misalignment, resulting in higher total costs or operational friction.
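The underlying trade-off is a break-even between the storage discount and the retrieval charge. The sketch below compares Standard and Standard-IA for a dataset with a known monthly read rate; all prices are illustrative assumptions.

```python
# Break-even sketch: S3 Standard vs Standard-IA given how much of the dataset
# is retrieved each month. Prices are illustrative assumptions, not quotes.

STANDARD = 0.023     # assumed $/GB-month
IA_STORAGE = 0.0125  # assumed $/GB-month
IA_RETRIEVAL = 0.01  # assumed $/GB retrieved

def monthly_cost_standard(size_gb: float) -> float:
    return size_gb * STANDARD

def monthly_cost_ia(size_gb: float, fraction_retrieved_per_month: float) -> float:
    return size_gb * (IA_STORAGE + IA_RETRIEVAL * fraction_retrieved_per_month)

# Under these prices, the break-even retrieval rate is
# (0.023 - 0.0125) / 0.01 = 1.05 full reads per month: data read more often
# than that is cheaper left in Standard.
hot = monthly_cost_ia(100, 2.0)    # whole dataset read twice a month
cold = monthly_cost_ia(100, 0.5)   # half the dataset read per month
baseline = monthly_cost_standard(100)
```

The same structure applies to colder classes, where per-GB retrieval prices, restore latency, and minimum storage durations shift the break-even point further.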

Excessive CloudTrail Charges from Bulk S3 Deletes
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Misconfigured Logging

When large numbers of objects are deleted from S3—such as during cleanup or lifecycle transitions—CloudTrail can log every individual delete operation if data event logging is enabled. This is especially costly when deleting millions of objects from buckets configured with CloudTrail data event logging at the object level. The resulting volume of logs can cause a significant, unexpected spike in CloudTrail charges, sometimes exceeding the cost of the underlying S3 operations themselves. This inefficiency often occurs when teams initiate bulk deletions for cleanup or cost savings without realizing that CloudTrail logs every API call, including `DeleteObject`, if data event logging is active for the bucket.
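The magnitude is easy to estimate up front. The per-event rate below is an assumption for the sketch; check current CloudTrail pricing before relying on it.

```python
# Back-of-envelope cost of logging bulk deletes as CloudTrail data events.
# The $0.10 per 100,000 events rate is an assumption for the sketch.

DATA_EVENT_PRICE_PER_100K = 0.10

def cloudtrail_data_event_cost(num_events: int) -> float:
    return num_events / 100_000 * DATA_EVENT_PRICE_PER_100K

# Deleting 50 million objects one by one logs 50 million DeleteObject events:
cost = cloudtrail_data_event_cost(50_000_000)
```

Batching deletions with the DeleteObjects API (up to 1,000 keys per request) cuts the number of logged API calls by roughly three orders of magnitude, and lifecycle expiration avoids client-issued delete calls altogether; temporarily excluding the bucket from data event logging during a planned purge is another option.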

Missing S3 Gateway Endpoint for Intra-Region EC2 Access
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Inefficient Configuration

When EC2 instances within a VPC access Amazon S3 in the same region without a Gateway VPC Endpoint, traffic is routed through the public S3 endpoint and incurs standard internet egress charges — even though it remains within the AWS network. This results in unnecessary egress charges, as AWS treats this traffic as data transfer out to the internet, billed under the S3 service.

By contrast, provisioning a Gateway Endpoint for S3 allows traffic between EC2 and S3 to flow over the AWS private backbone at no additional cost. This configuration is especially important for data-intensive applications, such as analytics jobs, backups, or frequent uploads/downloads, where the cumulative data transfer can be substantial.

Because the egress cost is billed under S3, it is often misattributed or overlooked during EC2 or networking reviews, leading to silent overspend.
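The saving scales linearly with traffic volume, since Gateway Endpoints for S3 carry no charge of their own. The per-GB rate below is an assumption for the sketch; the actual rate paid on the public path depends on the routing setup (e.g. NAT gateway processing vs egress pricing).

```python
# Rough monthly saving from adding an S3 Gateway Endpoint. The endpoint itself
# is free; the per-GB rate previously paid on the public path is an assumption.

ASSUMED_PER_GB_RATE = 0.045  # $/GB, illustrative

def monthly_saving(gb_per_month: float) -> float:
    return gb_per_month * ASSUMED_PER_GB_RATE

saving = monthly_saving(20_000)  # 20 TB of EC2<->S3 traffic per month

# The fix itself is one API call (illustrative IDs; requires credentials):
# import boto3
# boto3.client("ec2").create_vpc_endpoint(
#     VpcEndpointType="Gateway",
#     ServiceName="com.amazonaws.us-east-1.s3",
#     VpcId="vpc-0abc...",
#     RouteTableIds=["rtb-0abc..."],
# )
```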

Suboptimal Lifecycle Policy for Small Files on an S3 Bucket
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Inefficient Configuration

This inefficiency occurs when small files are stored in S3 storage classes that impose a minimum object size charge, resulting in unnecessary costs. Small files under 128 KB stored in Glacier Instant Retrieval, Standard-IA, or One Zone-IA are billed as if they were 128 KB. If these small files are accessed frequently, S3 Standard may be a better fit. For infrequently accessed small files, transitioning them to archival storage classes like Glacier Flexible Retrieval or Deep Archive can optimize storage spend. Poorly tuned lifecycle policies often allow small files to remain in suboptimal storage classes indefinitely.
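The overhead grows with object count, as the sketch below shows for a bucket of many small objects. The Standard-IA price is an illustrative assumption.

```python
# Overhead from the 128 KB minimum billable object size in Standard-IA for a
# bucket of many small objects. The price is an illustrative assumption.

IA_PRICE_PER_GB = 0.0125
MIN_BILLABLE_KB = 128

def billed_gb(num_objects: int, actual_kb: float) -> float:
    """GB billed per month once the per-object minimum is applied."""
    return num_objects * max(actual_kb, MIN_BILLABLE_KB) / (1024 * 1024)

# 10 million 8 KB objects are billed as if each were 128 KB, i.e. 16x actual:
actual = 10_000_000 * 8 / (1024 * 1024)    # ~76 GB actually stored
billed = billed_gb(10_000_000, 8)           # ~1,220 GB billed
monthly_overhead = (billed - actual) * IA_PRICE_PER_GB
```

Aggregating small files into larger objects (or keeping them in S3 Standard) eliminates this multiplier entirely.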

Missing S3 Lifecycle Policy for Incomplete Multipart Uploads
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Inefficient Configuration

Multipart upload allows large files to be uploaded in segments. Each part is stored individually until the upload is finalized by a “CompleteMultipartUpload” request. If this final request is never issued—due to a timeout, crash, failed job, or misconfiguration—the parts remain stored but are effectively useless: they do not form a valid object and cannot be retrieved. Without a lifecycle policy in place to clean up these incomplete uploads, the orphaned parts persist and continue to incur storage charges indefinitely.
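The standard remedy is a lifecycle rule with an AbortIncompleteMultipartUpload action, sketched below in the dict shape boto3's put_bucket_lifecycle_configuration accepts. The 7-day window and bucket-wide filter are assumptions to adapt.

```python
# Sketch of a lifecycle rule aborting incomplete multipart uploads after 7 days,
# in the shape accepted by boto3's put_bucket_lifecycle_configuration.
# The 7-day window and the empty (bucket-wide) filter are assumptions.

lifecycle_config = {
    "Rules": [
        {
            "ID": "abort-incomplete-multipart-uploads",
            "Status": "Enabled",
            "Filter": {},
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        }
    ]
}

# To apply (requires credentials; not executed here):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-bucket", LifecycleConfiguration=lifecycle_config)
```

Existing in-flight uploads can also be inventoried with the ListMultipartUploads API before the rule takes effect.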
