
How to handle the scenario where one output file is written for each partition?


AkshayJadhav
StreamSets Employee

Question:

When writing to a File destination, Spark creates one output file for each partition. However, the goal is to create a single output file.

 

Answer:

Use the Repartition processor upstream of the File destination to change the number of partitions that are written to the file system.

Select the Repartition by Number strategy and set the following parameter:

Number of Partitions = 1
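
To illustrate why this works, here is a minimal sketch in plain Python. It is a toy model, not StreamSets or Spark code: the helper names `repartition`, `write_partitions`, and `count_output_files` are made up for this example, and partitions are modeled as simple lists. In the Spark DataFrame API the analogous call is `repartition(1)`.

```python
import os
import tempfile


def repartition(partitions, num_partitions):
    """Redistribute all records round-robin into num_partitions lists (toy model)."""
    records = [rec for part in partitions for rec in part]
    out = [[] for _ in range(num_partitions)]
    for i, rec in enumerate(records):
        out[i % num_partitions].append(rec)
    return out


def write_partitions(partitions, out_dir):
    """Write one file per partition, the way a file destination would."""
    for idx, part in enumerate(partitions):
        path = os.path.join(out_dir, f"part-{idx:05d}.txt")
        with open(path, "w") as fh:
            fh.writelines(f"{rec}\n" for rec in part)


def count_output_files(partitions):
    """Write the partitions to a temporary directory and count the files produced."""
    with tempfile.TemporaryDirectory() as out_dir:
        write_partitions(partitions, out_dir)
        return len(os.listdir(out_dir))


data = [[1, 2], [3, 4], [5]]  # three partitions
files_before = count_output_files(data)
files_after = count_output_files(repartition(data, 1))
print(files_before, files_after)  # prints "3 1"
```

Keep in mind that with a single partition all records are written by one task, so this setting suits outputs that comfortably fit on one worker.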
