When parsing XML, JSON, or CSV files, you may see an exception such as:
Directory Origin (reading XML files):
"SPOOLDIR_01 - Failed to process file '/tmp/out/dir27/file31.xml' at position '0': com.streamsets.pipeline.stage.origin.spooldir.BadSpoolFileException: com.streamsets.pipeline.lib.parser.DataParserException: XML_PARSER_02 - XML object exceeded maximum length: readerId 'file13.xml', offset '0', maximum length '2147483647'"
or S3 Origin reading JSON:
com.streamsets.pipeline.stage.origin.s3.BadSpoolObjectException: com.streamsets.pipeline.api.ext.io.OverrunException: Reader exceeded the read limit '1048576'
In both cases, regardless of the value set in the Data Format tab's "Max Object Length (chars)" field, Data Collector's internal parsing buffer defaults to 1 MB. Attempting to parse a file larger than that internal buffer - not the size specified in the UI - results in the error.
To provide a 10 MB buffer for XML, JSON, and CSV parsing, update the parser.limit parameter in the sdc.properties file, e.g.:
parser.limit=10485760
Then restart Data Collector.
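For reference, a minimal sdc.properties excerpt is shown below. The exact file location depends on your installation type (for example, the directory pointed to by SDC_CONF for package installs, or the etc/ directory of a tarball install); the value of 10485760 is simply 10 * 1024 * 1024 characters:

# Raise the internal parser buffer used for XML, JSON, and CSV parsing.
# 10 MB expressed in characters: 10 * 1024 * 1024 = 10485760
parser.limit=10485760

Keep in mind that a larger buffer can increase memory use while parsing, so size it to the largest record you realistically expect rather than an arbitrarily large value.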
For more information, see the related article: OverRunLimit Exception