ISSUE: The pipeline fails with the exception "HIVE_04 - Thrift protocol error: null".
SOLUTION:
If you designed a pipeline with the Hive Streaming destination, please consider the following:
For CDH: Cloudera does not support Hive ACID, which implicitly means that Hive Streaming is not supported either. Our documentation also states:
The Hive Streaming destination requires Hive version 0.13 or later. Before you use the destination, verify that your Hadoop implementation supports Hive Streaming.
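The version requirement above can be checked with a small script. The following is a minimal sketch in Python, not part of Data Collector: the function names are hypothetical, and it assumes a version string similar to the first line printed by `hive --version` (for example "Hive 1.1.0-cdh5.14.0"). Note that passing this check is not sufficient on CDH, because Hive Streaming also depends on Hive ACID support.

```python
import re

# Minimum Hive version required by the Hive Streaming destination (per the documentation).
MIN_HIVE_VERSION = (0, 13)


def parse_hive_version(text):
    """Extract (major, minor) from a string such as 'Hive 1.1.0-cdh5.14.0'.

    Hypothetical helper: the input format is an assumption.
    """
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum_hive_version(text, minimum=MIN_HIVE_VERSION):
    """Return True if the reported Hive version is at least `minimum`.

    This only checks the version number. It does not verify that the
    distribution actually supports Hive Streaming.
    """
    return parse_hive_version(text) >= minimum
```

For example, `meets_minimum_hive_version("Hive 0.12.0")` returns False, while a CDH string such as "Hive 1.1.0-cdh5.14.0" passes the version check even though Hive Streaming is still unavailable there.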
If you plan to write data to Hive within CDH (the Cloudera Hadoop distribution), please consider implementing our Drift Synchronization Solution for Hive. If you specifically need to ingest data in the ORC file format, SDC 3.2.0.0 introduced Avro to ORC conversion in the MapReduce executor. More information about the MapReduce executor is available in our documentation.