In this article, we'll configure the Azure Synapse Dedicated SQL pool and insert some records from Mysql.
Platform:
- Streamsets Data Collector version 5.0
- MySql version 8.0.28.
Pre-requisite:
- Mysql connector driver for JDBC query consumer stage.
- Install Azure Synapse Enterprise Library 1.1.0 from PackageManager to enableAzure Synapse SQL destination.
Configuration:
Step1: When you create the workspace in Azure Synapse, you will be creating a storage account and workspace however it does not create a SQL pool. You would need to create the SQL pool from:
Azure Analytics » Click on Workspace web URL » Manage and then click on “new”. This will create the Azure SQL pool.
data:image/s3,"s3://crabby-images/3c6cd/3c6cde7c53ed343049108e9f9d85a9292e06d272" alt=""
Once done, you will be able to see the SQL pool in the overview tab. “akshaytest” is the new sql pool here.
data:image/s3,"s3://crabby-images/7ec6b/7ec6b4f3c51cc2171f96dde9485037652f2d735a" alt=""
Step2: Develop the pipeline with JDBC Query Consumer origin and Azure Synapse SQL destination.
data:image/s3,"s3://crabby-images/eef60/eef60549c1158d856254e63c3dc8cca828348f8c" alt=""
In the JDBC consumer, we would be reading all the data from the “Azure” mysql database which is hosted on the local machine.
These are the records we would be reading:
data:image/s3,"s3://crabby-images/b7872/b787270614f5bf3fbfcedca2b707b9f0382b23c2" alt=""
Step3: Validate the pipeline to ensure all the configuration are correct and there are validation errors.
data:image/s3,"s3://crabby-images/365cf/365cf55917c06661ba88706a02f31a849207aaad" alt=""
Run the pipeline when you see validation successful. In the summary section, I can see 4 records were process successfully.
data:image/s3,"s3://crabby-images/26785/267859e622019edd3a4df6e8ce6b13a1d49b31e3" alt=""
To verify this, I’ve configure the DBviewer tool to access the Azure SQL instance to check the records available in the database “azure_test”.
data:image/s3,"s3://crabby-images/c4037/c403776a3bf7a4fe1c2a3fd2b88213a6ac2e4452" alt=""
You can use any origin like Dev Data Generator/CDC tracking and process the data to the Azure synapse sql pool.
Note: You can always refer to the documentation section to get more details about the stage configurations. I’m attaching the pipeline to the article for reference.
Azure Synapse SQL:-
JDBC Query Consumer:-