Inefficient Lambda Pricing Model for Steady High-Volume Workloads (Use Lambda Managed Instances)


Andrew Shieh

CER:

CER-0296

Service Category

Compute

Cloud Provider

AWS

Service Name

AWS Lambda

Inefficiency Type

Suboptimal billing model selection

Explanation

This inefficiency occurs when a function with steady, high-volume (or otherwise predictable) traffic continues to run on default Lambda pricing, where cost scales with execution duration. Lambda Managed Instances runs Lambda functions on EC2 capacity managed by Lambda and supports multi-concurrent invocations within the same execution environment, which can materially improve utilization for suitable workloads (often IO-heavy services). For these steady-state patterns, shifting from duration-based billing to instance-based billing, and potentially applying EC2 pricing options such as Savings Plans or Reserved Instances, can reduce total cost while keeping the Lambda programming model. Savings are workload-dependent and not guaranteed.

Relevant Billing Model

Default Lambda charges per request and for each request's execution duration. Lambda Managed Instances charges per request plus EC2 instance charges and a Lambda management fee, and does not charge separately for per-request duration.
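
The two billing models can be written side by side as a rough sketch. The symbols are introduced here for illustration and are not taken from AWS documentation: N is the number of requests in the billing period, d-bar is the mean billed duration in seconds, M is the configured memory in GB, p_req and p'_req are the per-request prices under each model, p_GB-s is the duration price per GB-second, C_EC2 is the total EC2 instance charge for the period (after any Savings Plans or Reserved Instance discounts), and C_fee is the Lambda management fee for the period. Free tier, storage, and data transfer are ignored.

```latex
\begin{align*}
C_{\text{default}} &= N \, p_{\text{req}} + N \, \bar{d} \, M \, p_{\text{GB-s}} \\
C_{\text{managed}} &= N \, p'_{\text{req}} + C_{\text{EC2}} + C_{\text{fee}}
\end{align*}
```

If the per-request price happens to be the same under both models, the request terms cancel and the comparison reduces to whether C_EC2 + C_fee is lower than the duration term N * d-bar * M * p_GB-s.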

Detection

Review whether the function has sustained, predictable traffic rather than bursty “scale-to-zero” patterns
Assess whether paying for per-request duration is a dominant cost driver for the workload
Evaluate whether the workload can benefit from multi-concurrent processing (e.g., IO-heavy request handling)
Confirm the workload can tolerate a minimum always-on capacity model (i.e., it is not primarily cost-optimized by being idle most of the time)
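
The first and last checks above can be approximated from hourly invocation counts, for example exported from the CloudWatch Invocations metric. The following is a minimal sketch, not an official detection rule: the function names and the thresholds (max_cv, max_idle_fraction) are hypothetical and should be tuned per workload.

```python
from statistics import mean, pstdev


def traffic_profile(hourly_invocations):
    """Summarize how steady a function's traffic is from hourly invocation counts."""
    if not hourly_invocations:
        raise ValueError("need at least one hourly sample")
    avg = mean(hourly_invocations)
    idle_hours = sum(1 for n in hourly_invocations if n == 0)
    return {
        "mean_per_hour": avg,
        # Relative spread: low values indicate steady load.
        "coefficient_of_variation": pstdev(hourly_invocations) / avg if avg else float("inf"),
        # Share of hours with no traffic: high values favor scale-to-zero.
        "idle_fraction": idle_hours / len(hourly_invocations),
        "peak_to_mean": max(hourly_invocations) / avg if avg else float("inf"),
    }


def looks_steady(profile, max_cv=0.5, max_idle_fraction=0.05):
    """Heuristic only: both thresholds are illustrative assumptions."""
    return (
        profile["coefficient_of_variation"] <= max_cv
        and profile["idle_fraction"] <= max_idle_fraction
    )


if __name__ == "__main__":
    # Made-up sample data for illustration.
    steady = traffic_profile([1000, 1100, 950, 1050])
    bursty = traffic_profile([0, 0, 0, 5000])
    print(looks_steady(steady), looks_steady(bursty))
```

A function that passes this check is only a candidate. Whether duration charges dominate its bill and whether it benefits from multi-concurrency still need to be assessed separately.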

Remediation

Consider Lambda Managed Instances for steady-state, predictable workloads where instance-based pricing and multi-concurrency can improve price-performance
Validate cost and operational fit by comparing expected instance-based spend (including management fee) versus current duration-based Lambda spend before broad adoption
Use default Lambda for bursty or sporadic workloads that benefit most from scale-to-zero economics
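
The cost validation step above can be sketched as a small calculation. This is an illustrative sketch under stated assumptions, not an AWS pricing calculator: every rate is a caller-supplied input, the management fee is passed as a period total because this entry does not state how it is computed, and all function and parameter names are hypothetical.

```python
def default_lambda_cost(requests, avg_duration_s, memory_gb,
                        price_per_request, price_per_gb_second):
    """Period cost under default Lambda: request charge plus duration charge."""
    request_charge = requests * price_per_request
    duration_charge = requests * avg_duration_s * memory_gb * price_per_gb_second
    return request_charge + duration_charge


def managed_instances_cost(requests, price_per_request,
                           instance_hours, instance_hourly_price,
                           management_fee):
    """Period cost under Lambda Managed Instances: requests + EC2 + management fee.

    instance_hourly_price should already reflect any Savings Plans or
    Reserved Instance discount; management_fee is the total for the period.
    """
    request_charge = requests * price_per_request
    ec2_charge = instance_hours * instance_hourly_price
    return request_charge + ec2_charge + management_fee


def compare_billing_models(default_cost, managed_cost):
    """Return both costs, the difference, and which model is cheaper."""
    return {
        "default": default_cost,
        "managed_instances": managed_cost,
        # Positive means Managed Instances is cheaper for this period.
        "monthly_difference": default_cost - managed_cost,
        "cheaper": "managed_instances" if managed_cost < default_cost else "default",
    }


if __name__ == "__main__":
    # Placeholder inputs for illustration only; substitute current prices
    # and measured workload figures before drawing conclusions.
    d = default_lambda_cost(100_000_000, 0.2, 1.0, 0.0000002, 0.0000166667)
    m = managed_instances_cost(100_000_000, 0.0000002, 1460, 0.10, 21.90)
    print(compare_billing_models(d, m))
```

Running the comparison with measured request counts, durations, and a realistically sized instance fleet (including headroom for peaks) gives a first-order answer before any migration work starts.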

