Error: com.amazonaws.services.sqs.model.BatchEntryIdsNotDistinctException in a pipeline with SQS Origin.



A pipeline with an Amazon SQS origin may fail during execution with the following exception:

com.amazonaws.services.sqs.model.BatchEntryIdsNotDistinctException: Id f05c2fed-b7e6-487f-8028-9aa9e8d90bee repeated. (Service: AmazonSQS; Status Code: 400; Error Code: AWS.SimpleQueueService.BatchEntryIdsNotDistinct; Request ID: a780be6a-aca8-56f6-949c-e8e473853624)

Reason:

The SQS consumer can be configured with a list of queue name prefixes, which may match multiple queues. Each queue is set up with its own default visibility timeout. That timeout must be strictly greater than the pipeline's max batch wait time. Otherwise, messages that have been read but not yet deleted become visible again and are read a second time into the same batch. The delete request for that batch then contains repeated IDs and fails with BatchEntryIdsNotDistinctException. For example:

1. The SQS queue holds 15 messages and has a default visibility timeout of 30 seconds.
2. The pipeline is started with a batch size of 10 and a max batch wait time of 100 seconds.
3. The pipeline produces batch 1 with 10 records and deletes those messages from SQS. At this point, the queue holds 5 messages.
4. The pipeline starts reading batch 2 and reads the 5 remaining messages. It then waits until it has 10 messages or until 100 seconds have elapsed. Because the visibility timeout is 30 seconds, those 5 messages become available for reading again after 30 seconds, and the pipeline rereads them. The resulting batch contains the same 5 records twice.
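
The steps above can be reproduced with a small simulation. This is only a rough sketch under simplifying assumptions: it polls once per simulated second, and it models a received message as hidden for the visibility timeout and then receivable again. It does not reflect the origin's actual implementation, and the names `simulate_batch` and `repeated_ids` are made up for illustration.

```python
from collections import Counter


def simulate_batch(queue_ids, batch_size, max_wait_s, visibility_timeout_s):
    """Collect one batch from a toy queue, polling once per simulated second.

    Simplified model (an assumption, not SQS or pipeline internals): a received
    message is hidden for visibility_timeout_s seconds and then becomes
    receivable again, because nothing is deleted until the batch completes.
    """
    visible_again_at = {}  # message id -> simulated time it reappears
    batch = []
    t = 0
    while t < max_wait_s and len(batch) < batch_size:
        for msg_id in queue_ids:
            if len(batch) >= batch_size:
                break
            if visible_again_at.get(msg_id, 0) <= t:
                batch.append(msg_id)
                visible_again_at[msg_id] = t + visibility_timeout_s
        t += 1
    return batch


def repeated_ids(batch):
    """Return the ids that occur more than once in the batch."""
    return sorted(i for i, n in Counter(batch).items() if n > 1)


# The 5 messages left in the queue after batch 1 (step 3).
remaining = ["m11", "m12", "m13", "m14", "m15"]

# Visibility timeout (30 s) below max batch wait (100 s): messages are reread.
bad = simulate_batch(remaining, batch_size=10, max_wait_s=100, visibility_timeout_s=30)
print(len(bad), repeated_ids(bad))

# Visibility timeout (120 s) above max batch wait (100 s): no rereads.
good = simulate_batch(remaining, batch_size=10, max_wait_s=100, visibility_timeout_s=120)
print(len(good), repeated_ids(good))
```

In this model the first call returns 10 entries that contain each of the 5 ids twice, which is the kind of batch whose deletion would be rejected for non-distinct entry ids. The second call returns only the 5 distinct entries.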

How to fix it:

To avoid this problem, either increase the default visibility timeout of the AWS SQS queue to a value greater than the max batch wait time, or reduce the Max Batch Wait Time property in the pipeline (click the origin -> Amazon SQS tab -> Max Batch Wait Time (ms), default: 10000000 ms) to a value less than the visibility timeout of the SQS queue (which defaults to 30 seconds).
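
If you prefer to script the first option, the following is a minimal sketch using the boto3 SQS client calls `get_queue_attributes` and `set_queue_attributes`. The helper names, the 30-second safety margin, and the decision to raise an error above the SQS maximum visibility timeout of 12 hours are assumptions made for illustration, not part of the pipeline or of any official tooling.

```python
SQS_MAX_VISIBILITY_TIMEOUT_S = 43200  # SQS allows at most 12 hours


def is_visibility_timeout_safe(visibility_timeout_s, max_batch_wait_ms):
    """True if the visibility timeout is strictly greater than the batch wait."""
    return visibility_timeout_s * 1000 > max_batch_wait_ms


def raise_visibility_timeout(sqs_client, queue_url, max_batch_wait_ms, margin_s=30):
    """Raise the queue's visibility timeout above the max batch wait if needed.

    sqs_client is expected to be a boto3 SQS client. margin_s is a hypothetical
    safety margin chosen for this sketch. Returns the timeout now in effect.
    """
    attrs = sqs_client.get_queue_attributes(
        QueueUrl=queue_url, AttributeNames=["VisibilityTimeout"]
    )
    current_s = int(attrs["Attributes"]["VisibilityTimeout"])
    if is_visibility_timeout_safe(current_s, max_batch_wait_ms):
        return current_s

    # Round the wait up to whole seconds, then add the margin.
    new_s = -(-max_batch_wait_ms // 1000) + margin_s
    if new_s > SQS_MAX_VISIBILITY_TIMEOUT_S:
        raise ValueError(
            "Max batch wait is too long for SQS; reduce it in the pipeline instead"
        )
    sqs_client.set_queue_attributes(
        QueueUrl=queue_url, Attributes={"VisibilityTimeout": str(new_s)}
    )
    return new_s
```

A call might look like `raise_visibility_timeout(boto3.client("sqs"), queue_url, max_batch_wait_ms=100000)`, where `queue_url` is the URL of your own queue. Because the origin can match several queues by prefix, the check would need to run for each matched queue.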

