Unmanaged Growth of Athena Query Output Buckets
Abdeldjallil Koutchoukali
Service Category: Compute
Cloud Provider: AWS
Service Name: AWS Athena
Inefficiency Type: Missing Lifecycle Policy
Explanation

Athena writes a new result object to S3 for every query, whether or not the output is needed long term. Over time the output bucket grows without bound, especially in environments that run repetitive queries such as cost and usage reporting. Most of these files are transient and have little value once the result has been consumed. Without lifecycle rules, organizations pay for storage they do not need and accumulate clutter in S3.

Relevant Billing Model

Athena query execution is billed per terabyte of data scanned, while query results are stored in S3 and billed at S3 storage rates. Each executed query writes an object to the output bucket, and without automated cleanup the storage cost keeps growing for as long as those objects persist.
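To make the accumulation concrete, here is a rough sketch comparing stored volume and monthly storage cost with and without an expiration window. The query rate, average result size, and per-GB price are illustrative assumptions, not figures from AWS or from this entry; substitute current S3 pricing for your region and storage class.

```python
def stored_gb(queries_per_day, avg_result_mb, days_elapsed, retention_days=None):
    """Approximate GB of query results held after `days_elapsed` days.

    With no lifecycle rule (retention_days=None) every result is kept, so
    volume grows linearly. With an expiration rule, volume levels off once
    `retention_days` worth of results has accumulated.
    """
    if retention_days is None:
        days_kept = days_elapsed
    else:
        days_kept = min(days_elapsed, retention_days)
    return queries_per_day * avg_result_mb * days_kept / 1024


def monthly_storage_cost(gb, price_per_gb_month):
    """Monthly storage charge for `gb` at an assumed flat per-GB price."""
    return gb * price_per_gb_month


# Hypothetical workload: 2,000 queries per day averaging 5 MB of output each.
ASSUMED_PRICE = 0.023  # USD per GB-month; an assumption, check current S3 pricing

unmanaged = stored_gb(2000, 5, days_elapsed=365)
managed = stored_gb(2000, 5, days_elapsed=365, retention_days=30)

for label, gb in (("no lifecycle rule", unmanaged), ("30-day expiration", managed)):
    cost = monthly_storage_cost(gb, ASSUMED_PRICE)
    print(f"{label}: {gb:,.0f} GB stored, about ${cost:,.2f} per month")
```

The absolute numbers matter less than the shape: without expiration the monthly charge keeps rising with every day of queries, while with expiration it plateaus at the retention window.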

Detection
  • Review whether Athena query output buckets have lifecycle rules configured to delete or transition old objects
  • Assess growth trends in the S3 bucket size used for Athena outputs relative to actual business need
  • Check for repetitive or automated queries (e.g., CUR queries) that generate large volumes of transient results
  • Confirm whether audit, compliance, or reporting requirements justify long-term retention of certain outputs
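The first check above can be partly automated. The sketch below is one possible approach, not a definitive audit: it takes a lifecycle configuration as a plain dictionary, in the shape S3 returns it (a `Rules` list with `Status`, `Filter`, and `Expiration` entries), and reports which enabled rules would expire everything under a given output prefix. The function name, the sample rule IDs, and the `athena-results/` prefix are hypothetical.

```python
def expiring_rules_for_prefix(lifecycle_config, output_prefix=""):
    """Return IDs of enabled rules that expire every object under `output_prefix`."""
    matches = []
    for rule in lifecycle_config.get("Rules", []):
        if rule.get("Status") != "Enabled":
            continue
        expiration = rule.get("Expiration", {})
        if "Days" not in expiration and "Date" not in expiration:
            continue  # transition-only rules move objects but never delete them
        rule_filter = rule.get("Filter", {})
        if set(rule_filter) - {"Prefix"}:
            # Tag, size, or combined filters apply to only some objects;
            # conservatively treat them as not covering the whole prefix.
            continue
        # Older configurations carry the prefix on the rule instead of in Filter.
        rule_prefix = rule_filter.get("Prefix") or rule.get("Prefix") or ""
        if output_prefix.startswith(rule_prefix):
            matches.append(rule.get("ID", "<unnamed>"))
    return matches


# Hypothetical configuration for illustration only.
sample_config = {
    "Rules": [
        {"ID": "archive-reports", "Status": "Enabled",
         "Filter": {"Prefix": "reports/"},
         "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}]},
        {"ID": "expire-temp", "Status": "Enabled",
         "Filter": {"Prefix": "athena-results/"},
         "Expiration": {"Days": 30}},
        {"ID": "old-rule", "Status": "Disabled",
         "Filter": {"Prefix": ""},
         "Expiration": {"Days": 7}},
    ]
}

covered = expiring_rules_for_prefix(sample_config, "athena-results/adhoc/")
uncovered = expiring_rules_for_prefix(sample_config, "reports/")
print("athena-results/adhoc/ expired by:", covered)
print("reports/ expired by:", uncovered)
```

In practice the dictionary could come from boto3's `get_bucket_lifecycle_configuration(Bucket=...)`; that call raises a `ClientError` with code `NoSuchLifecycleConfiguration` when the bucket has no rules at all, which is itself the finding this entry describes.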
Remediation
  • Implement S3 Lifecycle rules on Athena output buckets to automatically expire objects after a set period (e.g., 30, 60, or 90 days)
  • Use prefixes or tags to differentiate between temporary query outputs and long-term reports, applying tailored retention rules
  • Regularly review and adjust retention policies to balance cost efficiency with business and compliance needs
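As a minimal sketch of the first two remediation steps, the snippet below builds a lifecycle configuration with separate retention for transient query outputs and longer-lived reports. The prefixes, rule IDs, and day counts are assumptions chosen for illustration; the dictionary follows the structure S3 lifecycle configurations use (`Rules`, `Filter`, `Expiration`).

```python
import json


def build_lifecycle_configuration(temp_prefix, temp_days, report_prefix, report_days):
    """Build a two-rule lifecycle configuration keyed on object prefix."""
    if temp_days <= 0 or report_days <= 0:
        raise ValueError("expiration days must be positive")
    if temp_prefix.startswith(report_prefix) or report_prefix.startswith(temp_prefix):
        # Overlapping prefixes would make both rules apply to the same objects.
        raise ValueError("prefixes must not overlap")
    return {
        "Rules": [
            {
                "ID": "expire-transient-query-results",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": temp_prefix},
                "Expiration": {"Days": temp_days},
            },
            {
                "ID": "expire-long-term-reports",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": report_prefix},
                "Expiration": {"Days": report_days},
            },
        ]
    }


# Assumed layout: ad hoc results under tmp/, retained reports under reports/.
config = build_lifecycle_configuration("tmp/", 30, "reports/", 365)
print(json.dumps(config, indent=2))
```

A configuration like this could be applied with boto3's `put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=config)`. Note that this call replaces the bucket's entire existing lifecycle configuration, so any rules already in place need to be merged in first.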