Issue:
On an SDC 3.3+ instance pointing to a MapR cluster, starting a cluster pipeline (Spark 2) results in the pipeline going into the FAILED state, even though the application appears as 'RUNNING' in the YARN Resource Manager UI.
Solution:
SDC determines the application start state by parsing the INFO-level output that spark-submit writes to its log. This issue can therefore occur when the log4j rootCategory for spark-submit is set to WARN, which is the default on a MapR cluster: the INFO messages SDC needs are suppressed, so it never sees the application start and marks the pipeline FAILED. To work around this, change the log4j rootCategory from WARN to INFO for Spark (e.g. in the $SPARK_HOME/conf/log4j.properties file), then re-submit the pipeline.
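As a sketch of the change, the rootCategory line in $SPARK_HOME/conf/log4j.properties would be edited as below. The appender name (console) is the common default but may differ on your cluster; keep whatever appender is already configured and change only the level.

```properties
# $SPARK_HOME/conf/log4j.properties
# Before (MapR default -- suppresses the INFO messages SDC parses):
# log4j.rootCategory=WARN,console

# After (lets spark-submit emit the INFO-level output SDC needs):
log4j.rootCategory=INFO,console
```

If no log4j.properties file exists yet, it can typically be created by copying the log4j.properties.template shipped in the same conf directory before making the edit.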