When should I use Cluster mode?

  • 17 January 2022


Use cluster mode when a pipeline reads data from a Kafka cluster or from HDFS. In cluster mode, Data Collector uses a cluster manager and a cluster application to spawn additional workers as needed.

 

Cluster streaming mode: when reading from a Kafka cluster, Data Collector processes data continuously until you stop the pipeline.

Cluster batch mode: when reading from HDFS, Data Collector processes all available data and then stops the pipeline.
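
As a rough illustration of the rule above, here is a minimal Python sketch that maps an origin system to the mode described in this post. The names `ExecutionMode` and `choose_execution_mode` are hypothetical and are not part of any Data Collector API, and the enum labels are not actual pipeline configuration values. The fallback to standalone mode for other origins is an assumption made for the example.

```python
from enum import Enum


class ExecutionMode(Enum):
    # Illustrative labels only; not Data Collector configuration values.
    STANDALONE = "standalone"
    CLUSTER_STREAMING = "cluster streaming"
    CLUSTER_BATCH = "cluster batch"


def choose_execution_mode(origin: str) -> ExecutionMode:
    """Return the mode this post suggests for a given origin system."""
    normalized = origin.strip().lower()
    if normalized == "kafka":
        # Kafka cluster: data is processed continuously until the pipeline is stopped.
        return ExecutionMode.CLUSTER_STREAMING
    if normalized == "hdfs":
        # HDFS: all available data is processed, then the pipeline stops.
        return ExecutionMode.CLUSTER_BATCH
    # Assumption for this sketch: any other origin runs in standalone mode.
    return ExecutionMode.STANDALONE


if __name__ == "__main__":
    for origin in ("Kafka", "HDFS", "directory"):
        print(origin, "->", choose_execution_mode(origin).value)
```

In practice the choice also depends on your cluster setup, so treat this only as a summary of the two cases listed here.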

 

You can find more information in the documentation here:

https://streamsets.com/documentation/datacollector/latest/help/index.html#Cluster_Mode/ClusterPipelines.html#concept_hmh_kfn_1s

 

