The StreamSets Support team would like to inform you of an issue in which StreamSets Data Collector pipelines that use the Azure Synapse, Delta Lake, Google BigQuery, or Snowflake destination might corrupt data in DATE, DATETIME, and TIME fields.
Products Affected: StreamSets Data Collector
Releases Affected: StreamSets Data Collector 5.6.0 and prior versions
Users Affected: Users running StreamSets Data Collector pipelines that use the Azure Synapse, Delta Lake, Google BigQuery, or Snowflake destination.
Severity: High
Description: In some cases, fields of the DATE, DATETIME, or TIME data type can be corrupted, so that the values written to the destination contain incorrect dates or times. The issue occurs only when all of the following conditions are met:
- The pipeline writes to one of the following stages:
  - Azure Synapse destination
  - Delta Lake destination
  - Google BigQuery destination
  - Snowflake destination
- The pipeline writes to multiple tables and an EL expression is used in the table field.
- The connection pool size is set to a value other than 1.
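
As a rough aid for reviewing existing pipelines, the conditions above can be written as a single check. This is a minimal sketch, not a StreamSets tool: the `DestinationSummary` record and its field names are hypothetical and are meant to be filled in by hand from each pipeline's configuration. It also assumes plain numeric version strings such as "5.6.0", and it treats the presence of `${` in the table field as a sign that an EL expression is used, since whether the expression resolves to multiple tables cannot be determined from the configuration alone.

```python
from dataclasses import dataclass

# Destination names as listed in this advisory.
AFFECTED_DESTINATIONS = {"Azure Synapse", "Delta Lake", "Google BigQuery", "Snowflake"}

# Last affected Data Collector release according to this advisory.
LAST_AFFECTED_VERSION = (5, 6, 0)


@dataclass
class DestinationSummary:
    """Hypothetical, hand-filled summary of one pipeline destination."""
    sdc_version: str           # e.g. "5.6.0" (assumed to be plain dotted numbers)
    destination: str           # e.g. "Snowflake"
    table_field: str           # value configured in the destination's table field
    connection_pool_size: int  # value of the connection pool size setting


def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '5.6.0' into (5, 6, 0)."""
    return tuple(int(part) for part in version.split("."))


def may_be_affected(summary: DestinationSummary) -> bool:
    """Return True only if every condition listed in the advisory holds."""
    affected_release = parse_version(summary.sdc_version) <= LAST_AFFECTED_VERSION
    affected_stage = summary.destination in AFFECTED_DESTINATIONS
    # Assumption: an EL expression in the table field means the pipeline
    # can write to multiple tables.
    uses_el_in_table = "${" in summary.table_field
    pool_not_one = summary.connection_pool_size != 1
    return affected_release and affected_stage and uses_el_in_table and pool_not_one


if __name__ == "__main__":
    example = DestinationSummary(
        sdc_version="5.6.0",
        destination="Snowflake",
        table_field="${record:attribute('table')}",
        connection_pool_size=4,
    )
    print(may_be_affected(example))
```

A result of `True` only indicates that the pipeline matches the conditions described here; it does not show that data has already been corrupted.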
Immediate action required: Yes
Workaround: Yes. Set the Connection Pool Size to 1. Note that changing this setting might affect pipeline throughput.
Resolution: Users who run Data Collector pipelines with one of the affected destination stages (Azure Synapse, Delta Lake, Google BigQuery, or Snowflake) should upgrade Data Collector to version 5.6.1 at their earliest convenience. In the meantime, users can apply the workaround and set the connection pool size to 1. If you can neither upgrade nor apply the workaround, please open a support case with us.
If you have questions, please contact our Support Team through our Support portal or at support@streamsets.com.