
Hi Team,

I need to perform a join operation in StreamSets Control Hub using SDC (Data Collector). As far as I can tell, SDC has no join processor or stage.

 

Is there a workaround for this in SDC? My datasets are below:

Dataset 1 - CSV

Dataset 2 - Snowflake table

 

Is there any way to join Dataset 1 and Dataset 2 in SDC?

 

TIA

@Rishi @Bikram  @Maria Vila @XavierV  @AkshayJadhav 

@yogesh0590 

 

Please find below the screenshot for your reference.

In the WHERE condition, the left-hand value comes from the table and the right-hand value comes from the source record.

 

 

 


@Bikram 

 

I have never used JDBC Lookup. If you could share a sample configuration, it would be much easier for me to implement.


@yogesh0590 

 

SDC has no built-in join operation, although StreamSets Transformer makes joins straightforward.

Configuring more than one origin in an SDC pipeline is difficult, but in your case you don't need to. You can read the data from the CSV file and then use a JDBC Lookup processor to retrieve the matching data from Snowflake, based on the values in each CSV record.

 

For example:

Data received from the CSV: emp = 'Bikram'

Query in the JDBC Lookup:

select emp from Test

where emp = '${record:value('/emp')}'

 

This checks whether the emp value exists in the Snowflake database. You can add as many filter conditions as you need.
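
If it helps to see the idea outside the pipeline UI, here is a rough Python sketch of what a per-record lookup amounts to. It is not SDC code and not the JDBC Lookup implementation. It assumes an in-memory sqlite3 table as a stand-in for the Snowflake table Test, an inline string as a stand-in for the CSV file, and a hypothetical dept column added only so the lookup has something to return. The lookup_join function name is also made up for illustration.

```python
import csv
import io
import sqlite3

# Stand-in for the Snowflake table "Test"; sqlite3 is used only so the sketch runs locally.
# The "dept" column is hypothetical, added so the lookup returns an extra field.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (emp TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO Test VALUES (?, ?)",
    [("Bikram", "Data"), ("Maria", "Ops")],
)

# Stand-in for the CSV origin (Dataset 1).
csv_text = "emp\nBikram\nYogesh\n"


def lookup_join(rows, conn):
    """For each incoming record, run one query keyed on emp and attach the result."""
    out = []
    for rec in rows:
        match = conn.execute(
            "SELECT dept FROM Test WHERE emp = ?", (rec["emp"],)
        ).fetchone()
        enriched = dict(rec)
        # Records with no match are kept with dept = None in this sketch.
        enriched["dept"] = match[0] if match else None
        out.append(enriched)
    return out


joined = lookup_join(csv.DictReader(io.StringIO(csv_text)), conn)
for row in joined:
    print(row)
```

In this sketch unmatched records are kept with an empty value, which behaves like a left join. What happens to unmatched records in an actual pipeline depends on how the lookup stage is configured, so please check that in the stage settings.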

Please give it a try and let me know if it helps.

Thanks & Regards

Bikram_

 

 

