Question

Bulk ingestion

  • 2 January 2023
  • 6 replies
  • 43 views

I want to ingest 10 tables from Oracle. Can I ingest them with one pipeline, or do I have to create 10 pipelines?
If we can’t ingest them all at once, do we have to create 10 jobs? Or is there an alternative (like a master job, or a for-each loop in Azure)?


6 replies

You can use the “JDBC Multitable Consumer” in one pipeline if all the tables are in the same DB. More info: https://docs.streamsets.com/portal/platform-datacollector/latest/datacollector/UserGuide/Origins/MultiTableJDBCConsumer.html?hl=jdbc%2Cmultitable%2Cconsumer

If we have 10 pipelines, can we run them using a single job, or does that require 10 jobs?

 

When you run multiple pipeline instances for a Data Collector job, each pipeline instance runs on a separate Data Collector. The pipeline instances do not communicate with each other. Each pipeline instance simply completes the same set of instructions.

From https://docs.streamsets.com/portal/platform-controlhub/controlhub/UserGuide/Jobs/Jobs-PipelineInstances.html#concept_abz_mkl_rz

You can run multiple instances of the same pipeline in one job, but if you want to run different pipelines, you will have to set up multiple jobs, one per pipeline.

How can we run 10 jobs at the same time?

 

https://docs.streamsets.com/portal/platform-controlhub/controlhub/UserGuide/Jobs/Jobs.html#concept_omz_yn1_4w

@Mahesh 

You can use orchestration to manage multiple jobs. In this case you can add a precondition so that the next job runs only if the previous job loaded its data successfully; otherwise the run stops there.
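
To illustrate the precondition idea, here is a minimal Python sketch of the control flow only. It does not use any StreamSets API: `run_job` is a hypothetical callable that is assumed to start one job, wait for it to finish, and return True on success. `fake_run_job` and the `load_table_N` job names are made up for the example.

```python
from typing import Callable, Iterable, List


def run_jobs_in_order(job_names: Iterable[str],
                      run_job: Callable[[str], bool]) -> List[str]:
    """Run jobs one by one and stop at the first job that reports failure.

    `run_job` is a hypothetical callable (not a StreamSets API) that starts
    one job, waits for it to finish, and returns True on success.
    """
    completed: List[str] = []
    for name in job_names:
        # Precondition: only continue if this job succeeded.
        if not run_job(name):
            break
        completed.append(name)
    return completed


def fake_run_job(name: str) -> bool:
    # Stand-in for a real job trigger: pretend the third table's load fails.
    return name != "load_table_3"


jobs = [f"load_table_{i}" for i in range(1, 11)]
done = run_jobs_in_order(jobs, fake_run_job)
print(done)
```

In a real setup the stand-in would be replaced by whatever actually starts a job and reports its final status, for example an orchestration pipeline in Control Hub.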

 

As mentioned by @albertfc, you can use the JDBC Multitable Consumer to fetch the data from the tables and proceed from there.

If the tables hold little data, this will work fine. If each table has millions of records, it will cause performance issues.
