
Product: StreamSets Transformer

Issue:

The following error occurs when reading from and writing to Hive via the Hive stage in cluster mode:

com.streamsets.datatransformer.dag.error.OperatorFailureException: Operator Hive_01 failed due to org.apache.spark.SparkException
at com.streamsets.datatransformer.dag.BaseBatchDAGRunner.com$streamsets$datatransformer$dag$BaseBatchDAGRunner$$generateOperatorFailure(BaseBatchDAGRunner.scala:664)
at com.streamsets.datatransformer.dag.BaseBatchDAGRunner$$anon$3.call(BaseBatchDAGRunner.scala:722)
at com.streamsets.datatransformer.dag.BaseBatchDAGRunner$$anon$3.call(BaseBatchDAGRunner.scala:698)
at com.streamsets.datatransformer.dag.BaseBatchDAGRunner$$anonfun$materialize$2$$anon$4.call(BaseBatchDAGRunner.scala:739)
at com.streamsets.datatransformer.dag.BaseBatchDAGRunner$$anonfun$materialize$2$$anon$4.call(BaseBatchDAGRunner.scala:735)
at com.streamsets.datacollector.security.GroupsInScope.execute(GroupsInScope.java:34)
at com.streamsets.datatransformer.dag.BaseBatchDAGRunner$$anon$2.call(BaseBatchDAGRunner.scala:691)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.SparkException: Dynamic partition strict mode requires at least one static partition column. To turn this off set hive.exec.dynamic.partition.mode=nonstrict
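The root cause is Hive's dynamic partition strict mode: an insert in which every partition column is supplied dynamically, with no static partition value, is rejected. For illustration, a query of the following shape (the table and column names here are hypothetical) would trigger this error under strict mode:

    INSERT INTO TABLE sales PARTITION (dt)
    SELECT id, amount, dt FROM staging_sales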


Solution:

To resolve this issue, add the following properties to the pipeline's cluster configuration:

spark.hadoop.hive.exec.dynamic.partition=true
spark.hadoop.hive.exec.dynamic.partition.mode=nonstrict
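These are the standard Hive settings hive.exec.dynamic.partition and hive.exec.dynamic.partition.mode, forwarded to the Hive configuration through Spark's spark.hadoop. prefix. The first enables dynamic partitioning; the second relaxes strict mode so that all partition columns may be resolved dynamically. As a rough sketch of what these properties do, the equivalent in a standalone Spark job with Hive support (not the Transformer mechanism itself; the application name, table, and columns below are assumptions for illustration) would be:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("DynamicPartitionExample")
      // Forwarded to the Hive configuration via the spark.hadoop. prefix
      .config("spark.hadoop.hive.exec.dynamic.partition", "true")
      .config("spark.hadoop.hive.exec.dynamic.partition.mode", "nonstrict")
      .enableHiveSupport()
      .getOrCreate()

    // With nonstrict mode, a fully dynamic partition insert succeeds
    // even though no static partition column is specified
    spark.sql("INSERT INTO TABLE sales PARTITION (dt) SELECT id, amount, dt FROM staging_sales")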

