Solved

I used the JDBC Multitable table consumer and the tables were autocreated. I assumed the same would work with the SQL Server CDC Client.

  • 8 February 2022
  • 2 replies
  • 118 views

Userlevel 5
Badge

I used the JDBC Multitable table consumer and the tables were autocreated. I assumed the same would work with the SQL Server CDC Client.

icon

Best answer by Drew Kreiger 8 February 2022, 22:06

View original

2 replies

Userlevel 5
Badge

If you are using a CDC origin, that processes Inserts, Deletes, and Updates. If the table does not exist, you will encounter problems. For example, if you are trying to process an update on a table that does not exist, there will be no row to update. The proper process to set up CDC would be to create a bulk load pipeline that does JDBC multitable Consumer to Snowflake and create the table then. Once the bulk load has been run, you can then start the CDC pipeline which will keep it in sync.


 

Hello, thank you very much for the information.

I need help because I have a problem that I can't solve.

Context:

I am developing a CDC integration with the following characteristics:

  • A pipeline that performs the first data load and creates the tables in snowflake (jdbc multi table → snowflake connector).
  • A second pipeline that performs continuous integration using the Stage CDC for Sql Server. ( sql server cdc → snowflake connector)
  • The integration is with hundreds of tables, so it does not support the possibility of manual management and specific configurations at the object level.
  • The update strategy is to configure the snowflake connector that loads the CDC transactions, so that it does a MERGE of the data, saving only the latest version of the row.

 

Problem: 


The Multitable stage creates the table in snowflake and moves the historical data. But I am finding that it does not create PKs that ARE defined in the source (sql server in this case).
Is there any way to replicate the PK of the source table with the Multitable stage?
My motivation for this is that in order to later configure a second pipeline that merges the CDC transactions, I need to have a PK in snowflake to be able to enable the "Get Primary Key Information From Snowflake" option in snowflake stage (target), the situation where manually define the PKs in an environment where I have belts of boards is not viable, fr that reason we need to replicate the PK when the JDBC multitable create the table in the first load (in the first pipeline). 

 

So… Is there any way to replicate the PKs with the JDBC multitable? Or, another stage that complemented with JDBC MT can solve this problem?

 

 

Thanks in advance. 

Reply