
When should I use Cluster mode?


AkshayJadhav
StreamSets Employee

Use cluster mode when the pipeline reads data from a Kafka cluster or HDFS. In cluster mode, Data Collector uses a cluster manager and a cluster application to spawn additional workers as needed. Which of the two cluster modes applies depends on the origin:

 

Cluster streaming mode: when processing data from a Kafka cluster, Data Collector processes data continuously until you stop the pipeline.

Cluster batch mode: when processing data from HDFS, Data Collector processes all available data and then stops the pipeline.
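
As a rough illustration of the rule above, here is a minimal Python sketch that maps an origin to the cluster mode described in this post. The function name, the dictionary, and the mode labels are hypothetical and chosen only for this example; they are not Data Collector configuration values or part of any StreamSets API.

```python
from typing import Optional

# Hypothetical labels for illustration only; these are not
# Data Collector configuration values.
CLUSTER_MODE_BY_ORIGIN = {
    "kafka": "cluster streaming",  # runs continuously until you stop the pipeline
    "hdfs": "cluster batch",       # processes all available data, then stops
}


def cluster_mode_for(origin: str) -> Optional[str]:
    """Return the cluster mode described in this post for the given origin.

    Returns None when the origin is neither Kafka nor HDFS, meaning the
    guidance in this post does not cover it.
    """
    return CLUSTER_MODE_BY_ORIGIN.get(origin.strip().lower())


if __name__ == "__main__":
    for name in ("Kafka", "HDFS", "JDBC"):
        print(name, "->", cluster_mode_for(name))
```

The lookup is case-insensitive, and any origin other than Kafka or HDFS returns None rather than a guess.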

 

You can find more information in the documentation here:

https://streamsets.com/documentation/datacollector/latest/help/index.html#Cluster_Mode/ClusterPipelines.html#concept_hmh_kfn_1s

 
