Jan 25, 2025 - ⧖ 6 min
Introduction
When dealing with high-concurrency workloads, scaling AWS Lambda effectively while avoiding throttling can become a challenge. This article explores a real-world scenario where an application, written in Kotlin, processed over 100,000 records using a custom asynchronous iteration method. Each record triggered an asynchronous Lambda invocation that interacted with DynamoDB. However, the setup led to 429 Too Many Requests errors, indicating throttling issues with AWS Lambda.
We will:
- Outline the problem faced while processing high-concurrency workloads.
- Understand AWS Lambda throttling mechanisms, based on the AWS Compute Blog article by James Beswick.
- Present solutions to mitigate throttling.
- Provide a real-world proof of concept (POC) to evaluate each mitigation technique.
The Challenge
Problem Context
Our workload involved processing a large file of over 100,000 records. Using Kotlin's mapAsync extension function, we implemented concurrency to invoke an AWS Lambda function for each record. The Lambda function performed a putItem operation on DynamoDB.
Here’s the Kotlin code for mapAsync:
```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.sync.Semaphore
import kotlinx.coroutines.sync.withPermit

// Unbounded variant: launches one coroutine per element.
suspend fun <T, R> Iterable<T>.mapAsync(
    transformation: suspend (T) -> R
): List<R> = coroutineScope {
    this@mapAsync
        .map { async { transformation(it) } }
        .awaitAll()
}

// Bounded variant: a semaphore caps how many transformations run at once.
suspend fun <T, R> Iterable<T>.mapAsync(
    concurrency: Int,
    transformation: suspend (T) -> R
): List<R> = coroutineScope {
    val semaphore = Semaphore(concurrency)
    this@mapAsync
        .map { async { semaphore.withPermit { transformation(it) } } }
        .awaitAll()
}
```
While this method processed records significantly faster than a standard for loop, it caused the Lambda invocations to "flood" the system, triggering throttling. The 429 Too Many Requests errors were linked to:
- Concurrency Limits: Lambda limits the number of concurrent executions per account.
- TPS (Transactions Per Second) Limits: High TPS can overwhelm the Invoke Data Plane.
- Burst Limits: A limit on how quickly concurrency can scale up, using the token bucket algorithm.
Observed Errors
- 429 Too Many Requests: Errors indicated that the Lambda invocations exceeded the allowed concurrency or burst limits.
- DynamoDB “Provisioned Throughput Exceeded” errors were observed when spikes occurred in DynamoDB writes.
AWS Lambda Throttling Mechanisms
AWS enforces three key throttle limits to protect its infrastructure and ensure fair resource distribution:
1. Concurrency Limits
Concurrency defines the number of in-flight Lambda executions allowed at a time. For example, if your concurrency limit is 1,000, you can have up to 1,000 Lambda functions executing simultaneously. This limit is shared across all Lambdas in your account and region.
2. TPS Limits
TPS is a derived limit based on concurrency and function duration. For example:
- Function duration: 100 ms
- Concurrency: 1,000
TPS = Concurrency / Function Duration = 1,000 / 0.1 s = 10,000 TPS
However, if function duration drops below 100 ms, TPS is capped at 10x the concurrency.
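This relationship can be sketched as a small helper (illustrative only; `effectiveTps` is not an AWS API):

```kotlin
import kotlin.math.min

// Effective invocations per second for a given concurrency limit and
// average function duration, with the 10x-concurrency cap that applies
// once duration drops below 100 ms.
fun effectiveTps(concurrency: Int, durationMs: Double): Double =
    min(concurrency / (durationMs / 1000.0), concurrency * 10.0)

fun main() {
    println(effectiveTps(1_000, 100.0)) // 10000.0
    println(effectiveTps(1_000, 50.0))  // still 10000.0: capped at 10x concurrency
    println(effectiveTps(1_000, 200.0)) // 5000.0: duration-bound, not capped
}
```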
3. Burst Limits
The burst limit ensures that concurrency increases gradually, avoiding large spikes in cold starts. AWS uses a token bucket algorithm to regulate this:
- Each invocation consumes a token.
- The bucket refills at a fixed rate (e.g., 500 tokens per minute).
- The bucket has a maximum capacity (e.g., 1,000 tokens).
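To make the mechanism concrete, here is a minimal token-bucket sketch. The numbers and class are illustrative only; AWS's internal implementation is not public:

```kotlin
// Minimal token bucket: each call to tryAcquire() consumes one token;
// tokens refill at a fixed rate up to a maximum capacity.
class TokenBucket(
    private val capacity: Long,
    private val refillPerSecond: Long
) {
    private var tokens = capacity
    private var lastRefillMs = System.currentTimeMillis()

    @Synchronized
    fun tryAcquire(): Boolean {
        val now = System.currentTimeMillis()
        val refill = (now - lastRefillMs) / 1000 * refillPerSecond
        if (refill > 0) {
            tokens = minOf(capacity, tokens + refill)
            lastRefillMs = now
        }
        return if (tokens > 0) { tokens--; true } else false
    }
}
```

With a capacity of 1,000 and a refill of ~8 tokens per second (≈500 per minute), a caller can burst up to 1,000 invocations immediately, after which it is held to the refill rate.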
For more details, refer to the AWS Lambda documentation on burst limits.
Mitigation Strategies
Here are some techniques we implemented to mitigate the throttling issues:
1. Limit Concurrency Using Semaphore
We added a concurrency limit to the mapAsync function to control the number of simultaneous Lambda invocations:
```kotlin
val results = records.mapAsync(concurrency = 100) { record ->
    invokeLambda(record)
}
```
Pros:
- Simple to implement.
- Reduces 429 errors significantly.
Cons:
- Slower overall processing time due to limited concurrency.
2. Retry with Exponential Backoff
We implemented a retry mechanism with exponential backoff to handle throttled requests:
```kotlin
import kotlin.math.pow
import kotlinx.coroutines.delay

suspend fun invokeWithRetry(record: Record, retries: Int = 3) {
    var attempts = 0
    while (attempts < retries) {
        try {
            invokeLambda(record)
            break
        } catch (e: Exception) {
            if (++attempts == retries) throw e
            // Exponential backoff: 200 ms, 400 ms, 800 ms, ...
            delay((2.0.pow(attempts) * 100).toLong())
        }
    }
}
```
Pros:
- Handles transient errors gracefully.
- Avoids overwhelming the system during retries.
Cons:
- Adds latency.
- Increases code complexity.
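A common refinement, which we did not benchmark here, is adding jitter so that many concurrent callers do not retry in synchronized waves. A "full jitter" sketch (the helper name is ours):

```kotlin
import kotlin.math.pow
import kotlin.random.Random
import kotlinx.coroutines.delay

// Full jitter: sleep a random duration up to the exponential ceiling,
// spreading retries from many concurrent callers instead of letting
// them all retry at the same instant.
suspend fun <T> retryWithJitter(
    retries: Int = 3,
    baseDelayMs: Long = 100,
    block: suspend () -> T
): T {
    repeat(retries - 1) { attempt ->
        try {
            return block()
        } catch (e: Exception) {
            val ceilingMs = (baseDelayMs * 2.0.pow(attempt + 1)).toLong()
            delay(Random.nextLong(ceilingMs + 1))
        }
    }
    return block() // final attempt; let any exception propagate
}
```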
3. Use SQS for Decoupling
Instead of invoking Lambdas directly, we used SQS to queue the requests and let the Lambdas process them at a controlled rate:

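A minimal sketch of the producer side (the helper names are ours; in production the injected sender would call `SqsClient.sendMessageBatch` from the AWS SDK for Java v2):

```kotlin
// Split records into batches of 10 -- the SQS SendMessageBatch maximum --
// and hand each batch to a sender. The sender is injected so the batching
// logic stands alone; a real one would build SendMessageBatchRequestEntry
// objects and call SqsClient.sendMessageBatch.
fun <T> enqueueInBatches(records: List<T>, sendBatch: (List<T>) -> Unit) {
    records.chunked(10).forEach { batch ->
        sendBatch(batch)
    }
}

fun main() {
    val batchSizes = mutableListOf<Int>()
    enqueueInBatches((1..25).toList()) { batch -> batchSizes += batch.size }
    println(batchSizes) // [10, 10, 5]
}
```

On the consumer side, the SQS event source mapping controls how many Lambda instances poll the queue (for example via its maximum concurrency setting), which is what keeps the processing rate under the throttle limits.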
Pros:
- Decouples producers and consumers.
- Avoids throttling by controlling the consumer rate.
Cons:
- Adds architectural complexity.
- Increases latency due to queueing.
Proof of Concept (POC)
We tested each mitigation strategy using the following setup:
Test Setup
- Dataset: 100,000 records.
- AWS Lambda: 512 MB memory, default concurrency limits.
- Environment: Local machine (32 GB RAM, 8 cores) for testing mapAsync.
Implementing the Lambda Function
The Lambda function was written in Go and performed a putItem operation on DynamoDB:
```go
package main

import (
	"context"
	"fmt"

	"github.com/aws/aws-lambda-go/lambda"
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/dynamodb"
)

type Request struct {
	TableName string                              `json:"tableName"`
	Item      map[string]*dynamodb.AttributeValue `json:"item"`
}

func handler(ctx context.Context, req Request) (string, error) {
	sess := session.Must(session.NewSession())
	svc := dynamodb.New(sess)

	_, err := svc.PutItem(&dynamodb.PutItemInput{
		TableName: aws.String(req.TableName),
		Item:      req.Item,
	})
	if err != nil {
		return "", fmt.Errorf("failed to put item: %v", err)
	}
	return "Success", nil
}

func main() {
	lambda.Start(handler)
}
```
Invoking the Lambda in Kotlin
We invoked the Lambda function from Kotlin using AWS SDK for Java:
```kotlin
import software.amazon.awssdk.core.SdkBytes
import software.amazon.awssdk.services.lambda.LambdaClient
import software.amazon.awssdk.services.lambda.model.InvokeRequest
import software.amazon.awssdk.services.lambda.model.InvokeResponse

// Reuse a single client across invocations instead of creating one per call.
private val lambdaClient: LambdaClient = LambdaClient.create()

fun invokeLambda(record: String): String {
    val payload = """
        {
          "tableName": "MyTable",
          "item": {
            "id": { "S": "$record" },
            "value": { "S": "SomeValue" }
          }
        }
    """.trimIndent()

    val request = InvokeRequest.builder()
        .functionName("MyLambdaFunction")
        // InvokeRequest.payload expects SdkBytes, not a raw String.
        .payload(SdkBytes.fromUtf8String(payload))
        .build()

    val response: InvokeResponse = lambdaClient.invoke(request)
    return response.payload().asUtf8String()
}
```
Results
| Mitigation Strategy | Total Time | Throttled Requests | Notes |
|---|---|---|---|
| No Concurrency Limitation | 15 min | 1,500 | High throughput but unstable |
| Concurrency Limit (100) | 25 min | 0 | Stable, slower |
| Retry with Backoff | 20 min | 200 | Improved with retries |
| SQS Decoupling | 30 min | 0 | Most stable, added latency |
Conclusion
High-concurrency workloads require careful consideration of AWS Lambda’s throttling limits. By applying strategies such as concurrency control, retry mechanisms, or decoupling with SQS, you can mitigate throttling and improve system stability. Each solution has trade-offs, so the choice depends on your specific use case and performance requirements.
For more details on Lambda throttling, refer to the AWS Lambda Developer Guide and the AWS Compute Blog.