Solved

Kafka consumer in Transformer throws java.lang.NoClassDefFoundError


Mike Arov
Roadie

Hello, could you please help me out?

In Transformer pipelines, adding a Kafka consumer throws

java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.kafka010.consumer.KafkaDataConsumer$

The same connection works fine with Data Collector.

I am using local Spark in Docker.

Thanks!

Best answer by Rishi

@Mike Arov, could you please confirm the Scala version of your Spark cluster and also check the Transformer Scala version? Make sure both match.


5 replies

Rishi
StreamSets Employee
  • StreamSets Employee
  • 96 replies
  • Answer
  • July 25, 2022

@Mike Arov, could you please confirm the Scala version of your Spark cluster and also check the Transformer Scala version? Make sure both match.
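
A quick way to check (exact commands and paths depend on your install, so treat this as a sketch): Spark prints its Scala version in the version banner, and the bundled scala-library jar also carries it in its file name.

# Spark side: the banner includes a line like "Using Scala version 2.12.10"
spark-submit --version

# Or look at the Scala library jar bundled with the Spark distribution
ls $SPARK_HOME/jars/ | grep scala-library

# Transformer side: 5.0.0 ships separate Scala 2.11 and 2.12 builds, so check which tarball you deployed.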


Mike Arov
Roadie
  • Author
  • Roadie
  • 5 replies
  • July 25, 2022

Thank you! It looks like they match:

  • Transformer 5.0.0 Scala 2.12
  • Spark library version: 3.0.3
  • kafka-clients-2.6.0.jar

I am also using local[*] ….


Mike Arov
Roadie
  • Author
  • Roadie
  • 5 replies
  • July 25, 2022

One thing I did notice was that /opt/streamsets-transformer/streamsets-libs/streamsets-spark-kafka-lib/lib/ contained libs for Spark 3.0.2, while the Spark version was 3.0.3. I manually downloaded the 3.0.3 jars listed below, but it did not make a difference :(

  • spark-sql-kafka-0-10_2.12-3.0.3.jar
  • spark-tags_2.12-3.0.3.jar
  • spark-token-provider-kafka-0-10_2.12-3.0.3.jar
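
For reference, fetching them looked roughly like this (standard Maven Central URLs for those three artifacts; the destination is the stage lib directory from my Docker setup, so adjust to your own install):

wget https://repo1.maven.org/maven2/org/apache/spark/spark-sql-kafka-0-10_2.12/3.0.3/spark-sql-kafka-0-10_2.12-3.0.3.jar
wget https://repo1.maven.org/maven2/org/apache/spark/spark-tags_2.12/3.0.3/spark-tags_2.12-3.0.3.jar
wget https://repo1.maven.org/maven2/org/apache/spark/spark-token-provider-kafka-0-10_2.12/3.0.3/spark-token-provider-kafka-0-10_2.12-3.0.3.jar
sudo mv spark-*_2.12-3.0.3.jar /opt/streamsets-transformer/streamsets-libs/streamsets-spark-kafka-lib/lib/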

Mike Arov
Roadie
  • Author
  • Roadie
  • 5 replies
  • July 26, 2022

Solved it!

I changed to the Scala 2.11 version and it worked out of the box:

  • Transformer 5.0.0 Scala 2.11

 

It appears 

  • Transformer 5.0.0 Scala 2.12

has a bug ...


Mike Arov
Roadie
  • Author
  • Roadie
  • 5 replies
  • August 4, 2022

Now I was able to get Transformer 5.0.0 Scala 2.12 to work as well!

Turns out https://repo1.maven.org/maven2/org/apache/commons/commons-pool2/2.11.0/commons-pool2-2.11.0.jar was missing!

 

wget https://repo1.maven.org/maven2/org/apache/commons/commons-pool2/2.11.0/commons-pool2-2.11.0.jar
sudo mv commons-pool2-2.11.0.jar /opt/streamsets/spark-3.0.3-bin-hadoop3.2/jars/

This got the Kafka stage working, but I think this jar needs to be packaged by the Transformer deployment installer. @Rishi, maybe you can fix it in the next release? ;)
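
If anyone hits the same error, a quick sanity check before re-running the pipeline (paths are from my Docker setup; I believe the pooling classes that KafkaDataConsumer initializes live under org.apache.commons.pool2.impl):

# confirm the jar landed in Spark's jars directory
ls /opt/streamsets/spark-3.0.3-bin-hadoop3.2/jars/ | grep commons-pool2

# confirm it contains the keyed object pool classes the Kafka consumer pool uses
unzip -l /opt/streamsets/spark-3.0.3-bin-hadoop3.2/jars/commons-pool2-2.11.0.jar | grep GenericKeyedObjectPool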

