Issue
A Kerberos-enabled SDC instance running a pipeline against a Kerberized Hadoop cluster is unable to read from or write to the underlying Hadoop FS. The following error message appears in the SDC pipeline logs:
HADOOPFS_43 - Could not create a file/directory under base directory: 'java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException:
GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)'
The root cause in the stack trace points to an issue with the credentials stored in the keytab, specifically the "password" portion of the credentials:
Caused by: javax.security.auth.login.LoginException: No password provided
    at com.sun.security.auth.module.Krb5LoginModule.promptForPass(Krb5LoginModule.java:919)
    at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:760)
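Because the stack trace blames the keytab credentials, it can help to first confirm that the keytab itself is usable before attributing the failure to the bug described below. The following is a minimal sketch in Python that shells out to the standard MIT Kerberos tools; the keytab path and principal shown are placeholders, not values from this article:

#!/usr/bin/env python3
# Sanity-check an SDC keytab: list its entries, then attempt a keytab login.
# The keytab path and principal below are placeholders -- substitute your own.
import subprocess

KEYTAB = "/etc/sdc/sdc.keytab"                  # assumed location of the SDC keytab
PRINCIPAL = "sdc/host.example.com@EXAMPLE.COM"  # assumed SDC service principal

# List the principals and key versions stored in the keytab.
subprocess.run(["klist", "-kt", KEYTAB], check=True)

# Attempt a non-interactive, keytab-based login. A non-zero exit code means
# the keytab credentials themselves are bad; a zero exit code means the
# keytab is fine and the failure lies elsewhere (e.g. ticket renewal).
subprocess.run(["kinit", "-kt", KEYTAB, PRINCIPAL], check=True)

If this login succeeds, the keytab is intact and the symptoms below point to the renewal bug described in the Solution.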
Symptoms
- Both the SDC and Hadoop systems are Kerberized.
- The SDC pipeline points to a file path on the Hadoop FS that lies within an HDFS Encryption Zone.
- Changing the SDC pipeline's Hadoop stage library to CDH 5.x or HDP 2.x makes the failure go away.
Solution
This is caused by a bug in Hadoop's KMSClientProvider code that breaks all externally managed Kerberos keytabs, causing ticket renewal failures and, ultimately, Hadoop client authentication failures. The bug is tracked as HADOOP-16761 [1], which reverts a regression introduced by HDFS-13682. You will need to engage your Hadoop vendor for a fix.
As a workaround, you can use the CDH 5.x or HDP 2.x stage libraries for all Hadoop FS stages, since the regression was not introduced until Hadoop 3.0 (included in the CDH 6.x and HDP 3.x releases, respectively). Alternatively, a cron job that periodically re-obtains a Kerberos ticket (TGT) from the SDC keytab also works around the renewal failure that ultimately leads to the error seen here; see the sketch below.
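As a rough illustration of the second workaround, the sketch below refreshes the SDC principal's TGT from its keytab and could be scheduled via cron. The keytab path, principal, script path, and ticket-cache location are all assumptions for your environment, not values mandated by SDC or Hadoop:

#!/usr/bin/env python3
# Periodically refresh the SDC principal's TGT from its keytab so the Hadoop
# client never depends on the broken in-process renewal path.
# Example cron entry (hourly; the script path is hypothetical):
#   0 * * * * /opt/scripts/renew_sdc_tgt.py
# All paths and the principal below are assumptions -- adjust for your site.
import subprocess
import sys

KEYTAB = "/etc/sdc/sdc.keytab"
PRINCIPAL = "sdc/host.example.com@EXAMPLE.COM"
CACHE = "/tmp/krb5cc_sdc"  # ticket cache the SDC process is configured to read

# kinit -kt obtains a fresh TGT non-interactively from the keytab;
# -c writes it into the ticket cache that SDC uses.
result = subprocess.run(["kinit", "-kt", KEYTAB, "-c", CACHE, PRINCIPAL])
sys.exit(result.returncode)

Note that the ticket cache written by the script must be the same one the SDC process reads (for example, via the KRB5CCNAME environment variable), otherwise the refreshed ticket will not be picked up.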
[1] https://issues.apache.org/jira/browse/HADOOP-16761