Product: StreamSets Data Collector
Issue:
When a pipeline with a JDBC Consumer Origin runs on SDC 3.18.1 against a MySQL/MariaDB instance, performance is markedly degraded. In addition, errors like the following are written to sdc.log in large volumes:
2020-09-10 18:24:14,194 [user:*admin] [pipeline:pipeline name/pipelinename9e88e92e-0129-46c0-bd1e-fccb78a9cc33] [runner:] [thread:ProductionPipelineRunnable-pipelinename9e88e92e-0129-46c0-bd1e-fccb78a9cc33-pipeline name] [stage:] ERROR JdbcUtil - Got type 12, columnTypeName=VARCHAR, catalogName=test
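To confirm that a given SDC instance is affected, it can help to count how often this message appears in sdc.log. The following is a minimal sketch, not an official StreamSets tool: the function name count_jdbcutil_errors is hypothetical, the regular expression is based only on the sample line above, and the path to sdc.log is passed on the command line because it varies by installation.

```python
import re
import sys
from collections import Counter

# Matches the message format shown in the sample log line above.
# Assumption: other occurrences follow the same "Got type N, columnTypeName=X" shape.
PATTERN = re.compile(r"ERROR JdbcUtil - Got type (\d+), columnTypeName=(\w+)")


def count_jdbcutil_errors(lines):
    """Return a Counter keyed by (JDBC type code, column type name)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_jdbcutil_errors.py /path/to/sdc.log
    with open(sys.argv[1], errors="replace") as handle:
        for (code, name), n in count_jdbcutil_errors(handle).most_common():
            print(f"type={code} columnTypeName={name}: {n}")
```

Running the same script again after applying the workaround below, on log output written after the change, should show no new matches.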
Versions affected:
3.18.1
Solution:
This issue was found in SDC 3.18.1 and is tracked as SDC-15731[1].
As a workaround, you can disable the JdbcUtil class's logging output entirely, which stops the log messages from being generated and removes the associated performance impact. Note that this also suppresses any other messages logged by the JdbcUtil class. To disable the JdbcUtil class's log output:
- Navigate to the UI of the SDC instance running the pipeline in question.
- Open the Log Config window by going to Administration (top-right menu) > Logs > Log Config.
- Add the following line to the end of the Log Config window:
log4j.logger.com.streamsets.pipeline.lib.jdbc.JdbcUtil=OFF
Once the line above is added to the Log Config window, save the configuration and rerun the pipeline. No SDC restart is required*.
[1] https://issues.streamsets.com/browse/SDC-15731
*If SDC is deployed through Cloudera Manager, the Log Config may need to be updated directly in the Cloudera Manager configuration page for the SDC service. In that case, a restart of the SDC instance is required.