
JDBC Multi-Table Consumer can only read from one database.

  • February 17, 2022
  • 1 reply
  • 232 views

AkshayJadhav
StreamSets Employee

The JDBC Multi Table Consumer can only read from one database. In some cases there is a requirement to get data from multiple databases - perhaps the table structures are similar, or the databases contain related information.

There are three options for reading from two databases in one pipeline.

Option 1: Use database-level synonyms.
Example:

The following creates a synonym for the base table "Product" in the AdventureWorks2012 database:
CREATE SYNONYM MyProduct
FOR AdventureWorks2012.Production.Product;

For SQL Server synonyms, more info can be found in the following links:
https://docs.microsoft.com/en-us/sql/relational-databases/synonyms/synonyms-database-engine?view=sql-server-2017
https://docs.microsoft.com/en-us/sql/t-sql/statements/create-synonym-transact-sql?view=sql-server-2017
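
As a sketch of how this extends to a second database (the SalesArchive database and its Orders table below are hypothetical names used only for illustration), you create one synonym per remote table in the database the origin connects to. Assuming the origin's Table Name Pattern is set to something like My%, a single JDBC Multi Table Consumer should then pick up tables that physically live in two different databases:

-- Synonym pointing at a table in a second, hypothetical database
CREATE SYNONYM MyArchivedOrders
FOR SalesArchive.dbo.Orders;

-- Together with MyProduct above, a Table Name Pattern of My% in the
-- JDBC Multi Table Consumer would match both synonyms.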

 

Option 2: Use two pipelines by creating a pipeline for each database, each with its own JDBC Multi Table Consumer origin. If the pipeline is complex, you can use pipeline fragments to better manage the stages in the rest of the pipeline.

Documentation for fragments (Control Hub 3.2.0 and up): https://streamsets.com/documentation/controlhub/3.2.0/help/pdesigner/#datacollector/UserGuide/Pipeline_Fragments/PipelineFragments.html%23concept_msg_4hf_ndb

Option 3: Create one pipeline, with a JDBC Multi Table Consumer origin, for each database and feed both into a 'combined' database or a queueing system such as Kafka. Then create a third pipeline that reads from the 'combined' database.
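
As a hedged sketch of the 'combined' database approach (the table and column names below are illustrative only), both pipelines could write into one staging table that records which source database each row came from, and the third pipeline reads from that single table:

-- Run this in the 'combined' / staging database that both pipelines write to
CREATE TABLE dbo.Product_Staging (
    source_db     VARCHAR(128) NOT NULL,  -- which source database the row came from
    product_id    INT          NOT NULL,
    product_name  VARCHAR(255) NULL,
    loaded_at     DATETIME2    NOT NULL DEFAULT SYSUTCDATETIME()  -- load time, useful as an offset column
);

-- The third pipeline then reads everything from this one table,
-- for example using loaded_at as its incremental offset.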


1 reply


Hi @AkshayJadhav , 

 

Nice topic and relevant to the work that I am doing. I guess the context needs to be set for each option. 

 

Database synonyms are a good thought process. I haven't tried that option - thanks for highlighting it.

 

Regarding Option 2

If the table structures and the ingest process are the same (i.e. no specific transformations are required):

 

  • Data Collector (Open Source Version) - Yes, you need to create two pipelines.
  • Control Hub - Since the table structures are the same and ONLY the source host name and/or the database name differ, you could create one pipeline, leverage a Job Template, and run separate instances of that pipeline as Job Instances, each pointing to a different host / database.

Since you are creating a single pipeline, you may not require pipeline fragments unless you want to call a pipeline fragment from every pipeline (some sort of standard process in your organization).

 

Regarding Option 3

Yes, irrespective of whether you are working on the open source version or a Control Hub version, you will need a separate pipeline to read the data that was ingested from the two sources in the previous step.

 

Well, if you are using Transformer I guess you don’t need to worry about having a separate pipeline. 

 

Cheers,

Srini

 

