Question

Transformer Kafka origin consumer group

  • 21 July 2023
  • 3 replies

I’m used to configuring the Data Collector Kafka origin with a specific consumer group. I need this to control Kafka offsets, and our Kafka broker requires it.

In Transformer, I don’t see any way to define the consumer group. How is this done?


3 replies


@p_carm 

 

If you need to consume data from Kafka and perform real-time stream processing, you should use StreamSets Data Collector and take advantage of its Kafka Consumer origin. If you require more complex data transformations at scale, you can use StreamSets Transformer for batch processing with Apache Spark.


In Transformer, the Kafka origin reads data based on the Kafka topic, so no consumer details need to be configured.

Thanks for that. Yes, we already run a combined SDC and Transformer implementation. I just had a first look at the Transformer Kafka origin, having used the SDC Kafka Multitopic origin extensively. If it doesn’t have a consumer group setting, it isn’t usable in any scenario I can foresee. We have access controls on consumer groups, so you can’t just be an arbitrary consumer.


@p_carm This is a limitation on the Spark side, which is addressed in Spark 3.x:

https://issues.apache.org/jira/browse/SPARK-26350

With Spark 3.0+, you should be able to specify the consumer group via an additional property (kafka.group.id).
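For reference, here is a minimal sketch of what that property looks like at the Spark level, outside the Transformer UI. The broker address, topic, and group name are hypothetical placeholders, and how Transformer passes additional properties through to Spark is an assumption here, so treat this as an illustration of the Spark option rather than of Transformer's configuration.

```python
def kafka_source_options(bootstrap_servers, topic, group_id):
    """Build the option map for Spark's Kafka source (Spark 3.0+)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Supported from Spark 3.0 (SPARK-26350); without it Spark
        # generates its own group id for each query.
        "kafka.group.id": group_id,
    }


def read_kafka_batch(spark, options):
    """Batch-read from Kafka using an existing SparkSession."""
    return spark.read.format("kafka").options(**options).load()


# Hypothetical values, for illustration only.
options = kafka_source_options(
    "broker1:9092", "orders", "transformer-orders-group"
)
```

With a SparkSession available, `read_kafka_batch(spark, options)` would return the topic's records as a DataFrame. The group name would need to match whatever your consumer group access controls allow.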

 
