
How to orchestrate pipelines and pass the table name to JDBC pipeline in runtime

  • November 24, 2021

Sami
StreamSets Employee

 

Scenario:

  • Run an orchestration pipeline that iterates over a list of table names and passes each table name as a runtime parameter to a JDBC Query pipeline.

Goal:

  • Fetch a list of table names in Pipeline 1, then run Pipeline 2 once for each table, as described below.

Here are the sample jobs and pipelines I created.

Pipeline 1 has the following stages:

JDBC Query -> Start Job -> Trash

The JDBC Query returns the following:

SQL> select * from refer;

TABLE_NAME
--------------------------------------------------------------------------------
DBO1
DBO2
DBO3
DBO4
DBO5
DBO6

6 rows selected.

These are the 6 tables that we want to pass, one at a time, to Pipeline 2.

Pipeline 2 has a JDBC Query origin. It receives a table name from Pipeline 1 as a runtime parameter and runs its JDBC query against that table, once per table.
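To make the parameter substitution concrete, here is a minimal Python sketch of what happens to Pipeline 2's query for each table. The parameter name TABLE_NAME and the query text are assumptions for illustration; the actual names are in the attached sample files. Python's string.Template is used only because its ${NAME} placeholder resembles a runtime parameter reference; it is not how Data Collector evaluates parameters.

```python
from string import Template

# Hypothetical query for Pipeline 2; TABLE_NAME is an assumed parameter name.
QUERY_TEMPLATE = Template("SELECT * FROM ${TABLE_NAME}")


def render_query(table_name: str) -> str:
    """Return the query text Pipeline 2 would run for one table."""
    # The value is spliced into SQL text rather than bound as a parameter,
    # so reject anything that is not a plain identifier.
    if not table_name.isidentifier():
        raise ValueError(f"unexpected table name: {table_name!r}")
    return QUERY_TEMPLATE.substitute(TABLE_NAME=table_name)


if __name__ == "__main__":
    for name in ["DBO1", "DBO2", "DBO3"]:
        print(render_query(name))
```

Because the table name becomes part of the SQL text, it is worth making sure the reference table only contains trusted identifiers.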

Solution:

You can achieve this by using Job templates. 

Please find the screenshot and the Job and pipeline sample zip files attached for the working prototype.

 

Here are screenshots of the 2 pipelines:

Orchestration pipeline. The Start Job stage in this case calls a Job Template that is created from the second pipeline.

The JDBC Query here returns the list of table names: 

The first pipeline provides the list of tables. The table names are passed dynamically to the second pipeline through a job template. This creates one job instance for each table.
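The one-instance-per-table mapping can be sketched in Python as follows. This is only a model of the idea, not the Control Hub API: the function name build_instance_parameters and the parameter name TABLE_NAME are hypothetical, and the records mimic the six rows returned by the JDBC Query above.

```python
import json

# Records shaped like the rows the JDBC Query in Pipeline 1 returns.
records = [{"TABLE_NAME": f"DBO{i}"} for i in range(1, 7)]


def build_instance_parameters(rows, param_name="TABLE_NAME"):
    """Build one runtime-parameter map per record (one job instance per table)."""
    # param_name is the (assumed) parameter the job template exposes;
    # the value always comes from the TABLE_NAME column of the record.
    return [{param_name: row["TABLE_NAME"]} for row in rows]


instance_parameters = build_instance_parameters(records)

if __name__ == "__main__":
    # Six parameter maps, so six job instances would be created.
    print(json.dumps(instance_parameters, indent=2))
```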

The key is the following orchestration configuration in Pipeline 1:

 

Please note that you may either enable or disable the Run in Background option in this configuration.

When Run in Background is enabled, the jobs are kicked off in parallel.

This means that if the JDBC query returns 20 tables, the jobs for all 20 tables are kicked off at the same time.

When Run in Background is disabled, the jobs are kicked off serially.

This means that if the JDBC query returns 20 tables, only one table's job runs at a time. The pipeline waits for that job to complete before kicking off the job for the next table.
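The difference between the two settings can be illustrated with a small Python simulation. It is a sketch under stated assumptions: run_job is a hypothetical stand-in for one job instance (it only sleeps), and a thread pool approximates the Run in Background behavior. It does not call StreamSets.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_job(table_name, log, delay=0.1):
    """Stand-in for one job instance; a real job would run Pipeline 2."""
    log.append(("start", table_name))
    time.sleep(delay)  # simulated work
    log.append(("end", table_name))


def run_serial(tables):
    """Run in Background disabled: wait for each job before starting the next."""
    log = []
    for table in tables:
        run_job(table, log)
    return log


def run_background(tables):
    """Run in Background enabled: start every job without waiting."""
    log = []
    with ThreadPoolExecutor(max_workers=len(tables)) as pool:
        for table in tables:
            pool.submit(run_job, table, log)
    # Leaving the with-block waits for all jobs, so the log is complete.
    return log


if __name__ == "__main__":
    tables = ["DBO1", "DBO2", "DBO3"]
    print(run_serial(tables))
    print(run_background(tables))
```

In the serial log each start is followed by its own end, while in the background log the starts are grouped together before any job ends. With many tables, the background mode puts all of that load on the database at once, which is worth weighing when choosing the setting.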
