
Post-processing capabilities for Hadoop origin

  • February 14, 2022

AkshayJadhav
StreamSets Employee

Scenario:

When the Hadoop FS origin processes a large amount of data, some input files may contain bad data and fail to process.

 

Goal:

When a file fails, move it out of the input path so the pipeline can continue processing the remaining files.

 

Solution:

The Hadoop FS Standalone origin (available as of Data Collector v3.2.0) provides post-processing options, including an error directory for bad input files that could not be fully processed. Files that fail are moved to that directory, and the origin continues with the remaining files.
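
To illustrate the behavior, here is a minimal Python sketch of the "move bad files to an error directory and keep going" pattern. It is not StreamSets code and does not use the Data Collector API: it runs against the local filesystem rather than HDFS, and the names process_file, run, input_dir, and error_dir are hypothetical. It also assumes each input file holds one JSON record per line, which is only an example format.

```python
import json
import shutil
from pathlib import Path


def process_file(path: Path) -> int:
    """Parse a file of JSON lines and return the record count.

    Raises ValueError on the first record that cannot be parsed
    (json.JSONDecodeError is a subclass of ValueError).
    """
    count = 0
    with path.open() as handle:
        for line in handle:
            if line.strip():
                json.loads(line)
                count += 1
    return count


def run(input_dir: Path, error_dir: Path) -> dict:
    """Process every file in input_dir (hypothetical local stand-in for HDFS).

    A file that fails is moved to error_dir, and processing continues
    with the next file instead of stopping.
    """
    error_dir.mkdir(parents=True, exist_ok=True)
    summary = {"processed": [], "moved_to_error": []}
    for path in sorted(p for p in input_dir.iterdir() if p.is_file()):
        try:
            process_file(path)
            summary["processed"].append(path.name)
        except ValueError:
            # Bad file: move it aside so it is not picked up again.
            shutil.move(str(path), str(error_dir / path.name))
            summary["moved_to_error"].append(path.name)
    return summary
```

In Data Collector itself no code is needed; the equivalent behavior comes from the origin's post-processing configuration. The sketch only shows why the error directory lets the rest of the data keep flowing.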
