Skip to main content

I am trying to capture change data from SQL Server database for which I have enabled the CDC for the source table in SQL Server database and loading the data to Snowflake destination in StreamSets, I am getting below error while running/previewing the pipeline in StreamSets Control Hub Platform. Please find the below error: 

SNOWFLAKE_27 - CDC record is missing the ‘PERSONID’ key field

Note: PERSONID is the Primary key specified both in SQL Server and Snowflake tables 

I tried with checking and unchecking the “Get Primary Key Information” option and similarly with table auto create option. But still, I am facing the same error. Could anyone please help me fix this issue. Thanks in advance. Happy to share more details if required.

 

To achieve this I am using SQL Server CDC Client origin and Snowflake destination using StreamSets Data Collector 5.3.0 version engine


@Divjo 

If table is already available in snowflake then we don’t need to create a new table by enabling table create option.

Kindly disable and try if its allowing you to load data into snowflake.

 

Thanks & Regards

Bikram_


@Bikram  I have already tried disabling the auto table create option when I created the table manually in Snowflake, but it is still throwing the SNOWFLAKE_27 error.


@Divjo 

can you please check the source column name (PERSONID) having the match with snowflake column name.

The error is saying that the column is not available.

Please make sure the column name should match and snowflake by default consider the  column name in upper case.

 

Thanks & Regards

Bikram_

 


@Bikram

Initially source column name was (PersonID) and I changed it to (PERSONID), but it was throwing same error.

Latest observation -  I tried unchecking Processing CDC Data (Use MERGE) and auto table create options in Snowflake destination in Streamsets, manually created the table in Snowflake. 

  • inserted one record in the source table (for example- pk - 29)  → record inserted to the table in snowflake (total 1 record)
  • updated same record in the source table ( for pk - 29, lastname is updated to ‘test2’ ) → 2 records inserted in snowflake table (total 3 records)

But I am not able to do the same by checking Processing CDC Data (Use MERGE) option in Snowflake destination in StreamSets, which will help me track which is old data and new data for update operation I believe. 

 

Thanks

Divya


update - ticket is raised to Streamsets on SQL Server CDC - Data Drift or Table auto create not working


Any update on this issue. I spent lot of time trying to make Snowflake CDC to work without any luck.


@va7777, data drift is still not working with  CDC. Are you using SQL server source ?


Reply