When should I use Cluster mode?

17 January 2022

AkshayJadhav
StreamSets Employee

Use cluster mode when a pipeline reads data from a Kafka cluster or from HDFS. In cluster mode, Data Collector uses a cluster manager and a cluster application to spawn additional workers as needed. There are two variants, depending on the origin:

Cluster streaming mode: when reading from a Kafka cluster, Data Collector processes data continuously until you stop the pipeline.

Cluster batch mode: when reading from HDFS, Data Collector processes all available data and then stops the pipeline on its own.
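As a rough illustration of the rule above, here is a minimal Python sketch that maps an origin to an execution mode. It is not part of any StreamSets API: the names `ExecutionMode`, `choose_execution_mode`, and `stops_on_its_own` are hypothetical, and falling back to standalone mode for any other origin is an assumption made for this example.

```python
from enum import Enum


class ExecutionMode(Enum):
    """Hypothetical labels for this sketch, not StreamSets identifiers."""
    STANDALONE = "standalone"
    CLUSTER_STREAMING = "cluster streaming"
    CLUSTER_BATCH = "cluster batch"


def choose_execution_mode(origin: str) -> ExecutionMode:
    """Pick an execution mode from the origin the pipeline reads from."""
    key = origin.strip().lower()
    if key == "kafka":
        # Kafka cluster: runs continuously until the pipeline is stopped.
        return ExecutionMode.CLUSTER_STREAMING
    if key == "hdfs":
        # HDFS: processes all available data, then the pipeline stops.
        return ExecutionMode.CLUSTER_BATCH
    # Assumption for this sketch: every other origin runs standalone.
    return ExecutionMode.STANDALONE


def stops_on_its_own(mode: ExecutionMode) -> bool:
    """Only cluster batch mode ends without a manual stop."""
    return mode is ExecutionMode.CLUSTER_BATCH
```

For example, `choose_execution_mode("Kafka")` returns `ExecutionMode.CLUSTER_STREAMING`, and `stops_on_its_own` is true only for the HDFS case.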

You can find more information in the documentation here:

https://streamsets.com/documentation/datacollector/latest/help/index.html#Cluster_Mode/ClusterPipelines.html#concept_hmh_kfn_1s
