OverRunLimit Exception

August 24, 2021

Drew Kreiger
Senior Community Builder at StreamSets

The following exception may occur when you try to read large records:

OverrunException: Reader exceeded the read limit '1024000'

When this happens, check the buffer limit (Buffer Limit (KB)) or any other limit that the data format imposes in the pipeline (for example, Max Record Length (chars) for the Delimited data format).

These limits can be configured separately for each pipeline.

(For examples of these exceptions in the Directory and Amazon S3 origins, see "Parser Overrun Errors" and "Max record length (chars) property doesn't take effect".)

If changing the buffer limit does not help, you may need to raise the parser limit at the SDC level, which defaults to 1 MB. In that case you may see the following exception:

...java.lang.IllegalArgumentException: overRunLimit '4280000' must be greater than 0 and less than or equal to 1048576
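The message above carries both the requested limit and the current ceiling, so a suitable new value can be derived from it. The following is a minimal Python sketch, not part of SDC: the function name suggest_parser_limit is hypothetical, and rounding up to a whole MiB is a convenience assumed here rather than anything Data Collector requires.

```python
import math
import re

MIB = 1024 * 1024  # 1048576 bytes, the ceiling named in the message

# Matches the IllegalArgumentException text quoted above; \s+ tolerates
# a line wrap between "must be" and "greater".
_PATTERN = re.compile(
    r"overRunLimit '(\d+)' must be\s+greater than 0 "
    r"and less than or equal to (\d+)"
)


def suggest_parser_limit(message: str) -> int:
    """Return a limit in bytes that covers the requested overrun limit."""
    match = _PATTERN.search(message)
    if match is None:
        raise ValueError("message does not look like an overRunLimit error")
    requested = int(match.group(1))
    current_max = int(match.group(2))
    # Assumption: round up to a whole MiB for a tidy configuration value.
    needed = math.ceil(requested / MIB) * MIB
    return max(needed, current_max)


msg = (
    "java.lang.IllegalArgumentException: overRunLimit '4280000' must be "
    "greater than 0 and less than or equal to 1048576"
)
print(suggest_parser_limit(msg))  # 5242880
```

For the message shown, 4280000 bytes is a little over 4 MiB, so the sketch suggests 5242880 (5 MiB).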

The overrun limit is related to the maximum number of bytes that can be read for a single record. If you need to increase the number, please follow these steps:

For SDC 2.6 and earlier versions, add this line to the Data Collector environment configuration file (sdc-env.sh or sdcd-env.sh):

export SDC_JAVA_OPTS="${SDC_JAVA_OPTS} -DDataFactoryBuilder.OverRunLimit=2097152"

For SDC versions later than 2.6, the parser.limit property is available in the sdc.properties file (or, depending on the installation, in Cloudera Manager):

parser.limit=2097152

For Cloudera Manager, set parser.limit in "Data Collector Advanced Configuration Snippet (Safety Valve) for sdc.properties":

parser.limit=2097152

In every case, you must restart Data Collector. After this change, the limit is 2 MB.
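If you manage sdc.properties with a script, the parser.limit change described above can be sketched in Python as follows. This is an illustrative helper, not a StreamSets tool: the function name set_parser_limit is hypothetical, it works on the file's text rather than a specific path, and it handles only simple key=value lines.

```python
def set_parser_limit(properties_text: str, limit_bytes: int) -> str:
    """Return properties text with parser.limit set to limit_bytes."""
    new_line = f"parser.limit={limit_bytes}"
    lines = properties_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Leave commented-out entries alone ("#" and "!" start comments).
        if stripped.startswith(("#", "!")):
            continue
        key = stripped.split("=", 1)[0].strip()
        if key == "parser.limit":
            lines[i] = new_line
            replaced = True
    # Append the property when no active entry exists yet.
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


example = "some.other.property=1\nparser.limit=1048576\n"
print(set_parser_limit(example, 2097152), end="")
```

The helper only rewrites the text. Writing the result back to the file and restarting Data Collector remain separate steps.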

--------------------------------------------------------------------------------------------------------------

parser.limit - Maximum parser buffer size that origins can use to process data. Limits the size of the data that can be parsed and converted to a record.

Buffer Limit (KB) - Maximum buffer size. The buffer size determines the size of the record that can be processed. Decrease it when memory on the Data Collector machine is limited. Increase it to process larger records when memory is available.

Max Record Length (chars) - The maximum number of characters in a record. Longer records are diverted to the pipeline for error handling.
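The exception text quoted earlier implies that a requested overrun limit must be greater than 0 and no larger than the SDC-level parser limit. A minimal Python sketch of that bound check follows. It is an illustration, not Data Collector's actual code: the name within_parser_limit is hypothetical, and values are taken in bytes because this post does not state how Buffer Limit (KB) is converted.

```python
DEFAULT_PARSER_LIMIT = 1048576  # 1 MB default for parser.limit, in bytes


def within_parser_limit(
    overrun_limit: int, parser_limit: int = DEFAULT_PARSER_LIMIT
) -> bool:
    """Mirror the bound stated in the exception: 0 < limit <= parser limit."""
    return 0 < overrun_limit <= parser_limit


print(within_parser_limit(1024000))           # True
print(within_parser_limit(4280000))           # False
print(within_parser_limit(4280000, 5242880))  # True
```

With the default parser limit, the 4280000 value from the exception fails the check. It passes once the parser limit is raised to at least that value.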

-@aneta

March 01, 2021 12:28


thirty2
January 13, 2023

What if I am getting the same error in the HTTP Server origin: CONTAINER_0001 - com.streamsets.pipeline.api.ext.io.OverrunException: Reader exceeded the read limit '100000000'?

In the configuration, I can't set a value higher than 100 MB.

Thanks
