When troubleshooting a Transformer job, we often need to review the Spark driver logs.
Spark runs as a YARN application and supports two deployment modes:
- Client mode: The default deployment mode. In client mode, the Spark driver runs on the host where the spark-submit command is executed, which is the same machine where Transformer is running.
- Cluster mode: The Spark driver runs in the application master. The application master is the first container that runs when the Spark job executes.
Client mode jobs
When you submit a Transformer job in client mode, the Spark application is submitted using spark-submit with the --deploy-mode client argument.
Because the Spark driver runs on the Transformer host, the driver logs are available under the following path:
${TRANSFORMER_DIST}/data/runInfo/${pipelineId}/runTimestamp/driver-all.log
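As a rough sketch, the newest driver log for a pipeline could be located with a small shell helper. It assumes that TRANSFORMER_DIST is set in the environment and that each run has its own directory directly under data/runInfo/<pipelineId>/, as the path above suggests. The function name latest_driver_log is hypothetical and is not part of Transformer.

```shell
# Hypothetical helper: print the most recently modified driver-all.log
# for one pipeline. Assumes TRANSFORMER_DIST is set and that run
# directories live directly under data/runInfo/<pipelineId>/.
latest_driver_log() {
  pipeline_id="$1"
  # ls -t sorts by modification time, newest first; errors from a
  # missing pipeline directory are suppressed so the output is empty.
  ls -t "${TRANSFORMER_DIST}/data/runInfo/${pipeline_id}"/*/driver-all.log 2>/dev/null | head -n 1
}
```

For example, tail -n 100 "$(latest_driver_log "$pipelineId")" would show the end of the latest driver log for that pipeline.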
Cluster mode jobs
When you submit the Spark application in cluster mode, the driver process runs in the application master container, which is the first container that runs when the Spark application executes. The client logs the YARN application report. To get the driver logs:
- Get the application ID from the Transformer logs. In the following example, application_1572839353552_0008 is the application ID.
19/11/04 05:24:42 INFO Client: Application report for application_1572839353552_0008 (state: ACCEPTED)
- Identify the application master container logs. The following is an example list of Spark application logs.
In this list, container_1572839353552_0008_01_000001 is the first container, which means that it's the application master container.
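The two steps above can be sketched in shell. The function names app_id_from_log and am_container_id are hypothetical. The second function assumes the first application attempt and the ID layout shown in the example above, so it will not match later attempts or container IDs that carry an extra epoch component.

```shell
# Hypothetical helper: print the first YARN application ID found on stdin.
app_id_from_log() {
  grep -oE 'application_[0-9]+_[0-9]+' | head -n 1
}

# Hypothetical helper: derive the application master container ID for the
# FIRST application attempt.
# Assumption: IDs look like container_<clusterTimestamp>_<appSequence>_01_000001,
# as in the example above.
am_container_id() {
  # Strip the "application_" prefix and append attempt 01, container 000001.
  printf 'container_%s_01_000001\n' "${1#application_}"
}
```

Piping the Transformer log into app_id_from_log and passing the result to am_container_id gives a candidate container ID, which should still be checked against the actual container list.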
On a running cluster, you can use the YARN CLI to get the YARN application container logs. For a Spark application submitted in cluster mode, you can access the Spark driver logs by pulling the application master container logs:
# 1. Get the address of the node that the application master container ran on
$ yarn logs -applicationId application_1585844683621_0001 | grep 'Container: container_1585844683621_0001_01_000001'
20/04/02 19:15:09 INFO client.RMProxy: Connecting to ResourceManager at ip-xxx-xx-xx-xx.us-west-2.compute.internal/xxx.xx.xx.xx:8032
Container: container_1585844683621_0001_01_000001 on ip-xxx-xx-xx-xx.us-west-2.compute.internal_8041
# 2. Use the node address to pull the container logs
$ yarn logs -applicationId application_1585844683621_0001 -containerId container_1585844683621_0001_01_000001 -nodeAddress ip-xxx-xx-xx-x
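The node address needed in step 2 can be read from the output of step 1. The sketch below assumes the header line has the form "Container: <id> on <host>_<port>", as in the sample output above, and that -nodeAddress expects <host>:<port>. The function name node_address_for is hypothetical.

```shell
# Hypothetical helper: read `yarn logs` output on stdin and print the node
# address of the given container as host:port.
# Assumption: the header line is "Container: <id> on <host>_<port>".
node_address_for() {
  container_id="$1"
  awk -v cid="$container_id" '
    $1 == "Container:" && $2 == cid && $3 == "on" {
      addr = $4
      # Turn the trailing "_<port>" into ":<port>".
      n = match(addr, /_[0-9]+$/)
      if (n) addr = substr(addr, 1, n - 1) ":" substr(addr, n + 1)
      print addr
      exit
    }'
}
```

The output of the first yarn logs command can be piped into node_address_for with the application master container ID as its argument, and the result passed to -nodeAddress in the second command.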