SDC 3.3+ spark2 cluster pipeline application on MapR enters FAILED state in SDC UI, despite application being submitted and running in YARN



Issue:

On an SDC 3.3+ instance pointing to a MapR cluster, starting a cluster pipeline (spark2) goes into FAILED state, despite the application showing up in the YARN Resource Manager UI as 'RUNNING'.

 

Solution:

SDC determines a cluster pipeline's start state by parsing the INFO-level log output of spark-submit. If the log4j rootCategory for spark-submit is set to WARN, which is the default on a MapR cluster, those INFO messages are suppressed, so SDC marks the pipeline FAILED even though the application is running in YARN. To work around this, change the log4j rootCategory from WARN to INFO for Spark (e.g. in the $SPARK_HOME/conf/log4j.properties file), then re-submit the pipeline.
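For example, the change in $SPARK_HOME/conf/log4j.properties would look like the following. Note that the appender list after the level ("console" here) is only the common default; keep whatever appenders your existing file already declares:

```properties
# $SPARK_HOME/conf/log4j.properties
# Change the root logger level from WARN (MapR default) to INFO so that
# spark-submit emits the INFO-level lines SDC parses to detect the
# application start state.
# Before: log4j.rootCategory=WARN, console
log4j.rootCategory=INFO, console
```

After saving the change, stop the pipeline if it is still listed as FAILED and start it again so a new spark-submit is launched with the updated logging configuration.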

