• 23 August 2022
  • 6 replies

Hi team,

I have 6 APIs, and for each one I have to build 3 pipelines: API to Kafka, Kafka to file system (FS), and FS to target table. That means building and executing 18 pipelines (6 × 3) for the 6 APIs. Could anyone suggest an approach that reduces the number of pipelines? For example, pushing the data from all 6 APIs into 1 Kafka topic in the 1st pipeline, then generating a data file from Kafka to FS, and finally loading from FS into the target table.

FYI: my target table is the same for all APIs.


Userlevel 3

@Vikki how similar or different are these APIs?

If they have similar authentication and more or less similar headers, parameters, etc., then it is easy to create 1 pipeline, create multiple jobs from it, and pass parameters that configure the API calls.

@saleempothiwala all the headers are the same (there are more than 50 headers in each API, but we only use the 10 common columns), and so is the authentication. I am using the same parameters for every API.

Userlevel 3

@Vikki that is easy then. Just create 1 pipeline with all the configs that are the same across all APIs, and define the remaining configs as parameters. Create 6 jobs for the same pipeline and pass different parameter values.
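As a rough illustration of the "1 pipeline, 6 jobs" idea, here is a minimal Python sketch rather than actual StreamSets configuration. It assumes the shared pipeline config refers to runtime parameters with the `${NAME}` syntax; the parameter names (`API_URL`, `KAFKA_TOPIC`), the URLs, the topic names, and the helper functions are all hypothetical placeholders and do not come from this thread.

```python
# Hypothetical names throughout: parameter keys, URLs, and topic names are
# placeholders for illustration, not values from this thread.

# Config shared by every job; ${...} marks a value supplied per job.
SHARED_CONFIG = {
    "http_method": "GET",
    "resource_url": "${API_URL}",
    "kafka_topic": "${KAFKA_TOPIC}",
}

API_NAMES = ["api1", "api2", "api3", "api4", "api5", "api6"]


def build_jobs(api_names):
    """Build one job definition (name + parameter values) per API."""
    return [
        {
            "job_name": f"{name}-to-kafka",
            "parameters": {
                "API_URL": f"https://example.com/{name}",
                "KAFKA_TOPIC": f"{name}_topic",
            },
        }
        for name in api_names
    ]


def resolve(config, parameters):
    """Replace each ${NAME} placeholder with the job's parameter value."""
    resolved = {}
    for key, value in config.items():
        for param_name, param_value in parameters.items():
            value = value.replace("${" + param_name + "}", param_value)
        resolved[key] = value
    return resolved


JOBS = build_jobs(API_NAMES)

if __name__ == "__main__":
    for job in JOBS:
        print(job["job_name"], resolve(SHARED_CONFIG, job["parameters"]))
```

The point of the sketch is only that the pipeline definition exists once, and each job differs solely in the parameter values it passes.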

@saleempothiwala my apologies, I am new to SS and used the wrong terminology. My 6 APIs are actually different: for every API I generate a unique token to authenticate in SS, and I push each API's data into its own Kafka topic (6 APIs and 6 Kafka topics). What I meant in my previous reply is that we use a Field Remover to keep the 10 common columns from every API. With that we generate 6 data files and load them into 6 inc tables, and then use a union to push the data to the target table in the LZ layer. Is there any way to have 6 origins (HTTP Client) in my 1st pipeline?



The 2nd pipeline is Kafka to FS, and the 3rd is FS to target table.

Userlevel 3

@Vikki StreamSets SDC allows only 1 origin per pipeline. Having said that, you can use 1 dummy origin (SDC Raw Data) followed by 6 HTTP Client processors, and call all 6 APIs in the same pipeline. You can do this serially or in parallel, depending on your requirement.
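To make the serial versus parallel choice concrete, here is a small Python sketch of the same pattern outside StreamSets. It is only an analogy under stated assumptions: `fetch` is a stub standing in for one HTTP call, `COMMON_FIELDS` is a two-field stand-in for the 10 common columns, and the `source_api` tag is a hypothetical extra field that would let records from all 6 APIs share a single destination.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the 10 common columns mentioned in the thread (hypothetical names).
COMMON_FIELDS = ["id", "name"]
API_NAMES = ["api1", "api2", "api3", "api4", "api5", "api6"]


def fetch(api_name):
    """Stub for one API call; a real version would issue an HTTP request."""
    return [{"id": 1, "name": f"{api_name}-row", "extra": "not needed"}]


def keep_common(record, source):
    """Keep only the common fields and tag the record with its source API."""
    trimmed = {field: record[field] for field in COMMON_FIELDS}
    trimmed["source_api"] = source
    return trimmed


def call_serially(api_names):
    """Call the APIs one after another."""
    records = []
    for name in api_names:
        records.extend(keep_common(r, name) for r in fetch(name))
    return records


def call_in_parallel(api_names):
    """Call the APIs concurrently; pool.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=len(api_names)) as pool:
        batches = list(pool.map(fetch, api_names))
    return [
        keep_common(r, name)
        for name, batch in zip(api_names, batches)
        for r in batch
    ]


if __name__ == "__main__":
    print(call_serially(API_NAMES) == call_in_parallel(API_NAMES))
```

Both variants produce the same merged set of records; the difference is only whether one slow API delays the others.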

@saleempothiwala  I will try and let you know. Thanks a lot!