Unable to ingest data from Azure SQL (CDC) to Azure Databricks using StreamSets

I am trying to build a data pipeline with Azure SQL Server DB (CDC) as the source and Azure Databricks (Delta tables) as the destination.

I followed the sample pipeline from
https://github.com/streamsets/pipeline-library/tree/master/datacollector/sample-pipelines/pipelines/SQLServer%20CDC%20to%20Delta%20Lake

I get the error below for a few records, in schema preview as well:

DELTA_LAKE_34 - Databricks Delta Lake load request failed: 'DELTA_LAKE_32 - Could not copy staged file 'sdc-4a076fce-7a73-45ba-8dd7-29e58848cf23.csv': java.sql.SQLException: [Simba][SparkJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: Unable to infer schema for CSV. It must be specified manually.

Note: On Preview/Draft Run, the pipeline captures changes from the source DB, successfully creates files in the stage (ADLS container), and creates the Delta tables at the destination, but it fails to ingest records there.

@gkognole I have seen this kind of error when a staged file name starts with something like _ or the file is empty. From the looks of it, your file names start with sdc-, so it could be a good idea to check whether any temp files are being created and read from.

Thank you @saleempothiwala for your reply.

Yes, my staged file names start with sdc-, and no temp files starting with _ are created.
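
For anyone running the same check, here is a minimal sketch of it in Python. It assumes you have already obtained a listing of the stage container as (name, size in bytes) pairs; how you get that listing is left out. The function name `find_suspect_staged_files` is hypothetical, and also treating names that start with `.` as hidden is an addition based on how Spark file sources skip such files, not something stated in this thread.

```python
from typing import Iterable, List, Tuple


def find_suspect_staged_files(files: Iterable[Tuple[str, int]]) -> List[Tuple[str, str]]:
    """Return (name, reason) pairs for staged files worth a closer look.

    `files` is a hypothetical listing of (path, size_in_bytes) pairs taken
    from the stage container.
    """
    suspects = []
    for name, size in files:
        # Look only at the last path component, not the folder names.
        base = name.rsplit("/", 1)[-1]
        if base.startswith(("_", ".")):
            # Spark file sources skip names with these prefixes.
            suspects.append((name, "hidden/temp name prefix"))
        elif size == 0:
            # An empty CSV gives the reader nothing to infer a schema from.
            suspects.append((name, "empty file"))
    return suspects
```

An empty result, as in the case reported here, suggests the cause lies elsewhere.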

@gkognole

Could it be that you are using an unsupported version of the cluster? (we support 6.x, 7.x and 8.x only)

Thank you @alex.sanchez for your reply.

I am using Databricks Runtime Version : 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12).

I will try an 8.x version and see if it resolves the issue.
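
As a quick sanity check before changing clusters, the version comparison can be sketched in Python as below. The supported set (6.x, 7.x, 8.x) is taken from the reply in this thread and may be out of date for newer StreamSets releases; the helper name `is_supported_runtime` is hypothetical.

```python
# Supported major versions as stated in this thread (an assumption that
# may not hold for newer StreamSets releases).
SUPPORTED_MAJOR_VERSIONS = {6, 7, 8}


def is_supported_runtime(version: str) -> bool:
    """Check a Databricks Runtime version string such as '10.4 LTS'."""
    # The major version is the part before the first dot.
    major = int(version.strip().split(".", 1)[0])
    return major in SUPPORTED_MAJOR_VERSIONS
```

With the runtime mentioned above, `is_supported_runtime("10.4 LTS")` returns `False`, which matches the suggestion to retry on an 8.x cluster.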
