Solved

How to set up a continuous data replication pipeline from SQL Server to Snowflake?

  • February 8, 2022

Drew Kreiger
Rock star
  • Senior Community Builder at StreamSets
  • 95 replies

We would like to set up a continuous data replication pipeline from SQL Server to Snowflake, complete with historical data, for several hundred tables. We would like some documentation or assistance with creating a test pipeline for our use case.

Best answer by Drew Kreiger

Here is a link to our documentation on the SQL Server CDC Client origin. It describes how to set up SQL Server for CDC and covers all the options for that origin. This link describes how to process change data.
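With several hundred tables, enabling CDC table by table gets tedious, so here is a minimal sketch of generating the T-SQL instead. It assumes the standard SQL Server procedures `sys.sp_cdc_enable_db` and `sys.sp_cdc_enable_table`; the helper names and the table list are hypothetical, and `@role_name = NULL` (no gating role) is just one possible choice. Review the generated script before running it against your database.

```python
def quote_literal(value: str) -> str:
    """Render a string as a T-SQL Unicode literal, doubling single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def build_cdc_enable_script(tables):
    """Return T-SQL that enables CDC on the current database and on
    each (schema, table) pair in `tables`."""
    statements = ["EXEC sys.sp_cdc_enable_db;"]
    for schema, table in tables:
        statements.append(
            "EXEC sys.sp_cdc_enable_table "
            f"@source_schema = {quote_literal(schema)}, "
            f"@source_name = {quote_literal(table)}, "
            "@role_name = NULL;"  # assumption: no role restricts access to change data
        )
    return "\n".join(statements)


# Hypothetical table list; in practice it could be read from INFORMATION_SCHEMA.TABLES.
tables = [("dbo", "customers"), ("dbo", "orders")]
print(build_cdc_enable_script(tables))
```

The script has to be run in the context of the database you want to capture, by a login with enough privileges to enable CDC.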

 

In addition, here is a link to our GitHub repository, which shows an example of SQL Server CDC to Snowflake.

 

To set up full CDC-based replication for SQL Server, you will want one pipeline that does the bulk load of the historical data. It uses the JDBC Multitable Consumer origin to read all the SQL Server tables and replicate them into Snowflake. Once the bulk load is complete, you can start the SQL Server CDC pipeline, which runs continuously, to capture the changes and write them into Snowflake.
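To make the two phases concrete, here is a small in-memory sketch of the idea: a snapshot stands in for the bulk load, and change rows are then applied on top of it. This is only an illustration, not how the StreamSets pipelines or the Snowflake destination are implemented. The operation codes are the ones SQL Server writes to the `__$operation` column of a change table; the record layout (`lsn`, `seqval`, `op`, `row`) and the function name are assumptions made up for the example, with integers standing in for the binary LSN values.

```python
# SQL Server CDC operation codes (the __$operation column of a change table).
DELETE, INSERT, UPDATE_BEFORE, UPDATE_AFTER = 1, 2, 3, 4


def apply_changes(target, changes, key="id"):
    """Apply CDC change rows to `target` (a dict keyed by primary key),
    ordered by commit LSN, then sequence value, then operation code."""
    ordered = sorted(changes, key=lambda c: (c["lsn"], c["seqval"], c["op"]))
    for change in ordered:
        op, row = change["op"], change["row"]
        if op in (DELETE, UPDATE_BEFORE):
            # Drop the old image; for an update the new image follows next.
            target.pop(row[key], None)
        elif op in (INSERT, UPDATE_AFTER):
            # Write as an upsert so replaying a change is harmless.
            target[row[key]] = dict(row)
    return target


# Phase 1: result of the bulk load (a snapshot of the source table).
snapshot = {1: {"id": 1, "name": "Ada"}, 2: {"id": 2, "name": "Bob"}}

# Phase 2: changes captured by CDC, arriving in no particular order.
changes = [
    {"lsn": 11, "seqval": 1, "op": UPDATE_BEFORE, "row": {"id": 2, "name": "Bob"}},
    {"lsn": 11, "seqval": 1, "op": UPDATE_AFTER, "row": {"id": 2, "name": "Robert"}},
    {"lsn": 10, "seqval": 1, "op": INSERT, "row": {"id": 3, "name": "Cy"}},
    {"lsn": 12, "seqval": 1, "op": DELETE, "row": {"id": 1, "name": "Ada"}},
]
replica = apply_changes(dict(snapshot), changes)
print(replica)
```

Writing inserts and updates as upserts is a design choice in this sketch: if a change that is already reflected in the bulk-loaded data gets applied again, the result stays the same.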


 

