
My origin is a Directory origin reading a CSV source, and the pipeline writes it to an AWS S3 bucket. Whenever any data in my source is updated, I need to rerun my pipeline. How can I achieve this?

@lakshmi_narayanan_t 

 

Can you use Kafka stages in your case? If yes, that will solve your problem.

 

Pipeline 1:

Read data from the source and publish it to a Kafka topic with a Kafka producer.

Pipeline 2:

Fetch data from the Kafka topic and send it to the S3 bucket.

In this case you will get the updated data whenever there are any changes in the source (see the sketch below).
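To make the two-pipeline pattern concrete, here is a minimal Python sketch of the same idea using the kafka-python and boto3 libraries (not the StreamSets stages themselves). The broker address, the topic name csv-updates, and the bucket name my-bucket are placeholder assumptions for illustration:

```python
import pathlib

import boto3
from kafka import KafkaConsumer, KafkaProducer

# Pipeline 1: read CSV files from the source directory and publish
# each file's contents to the Kafka topic, keyed by file name.
def produce(source_dir: str) -> None:
    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    for csv_file in pathlib.Path(source_dir).glob("*.csv"):
        producer.send(
            "csv-updates",
            key=csv_file.name.encode(),
            value=csv_file.read_bytes(),
        )
    producer.flush()

# Pipeline 2: consume from the topic and write each message to S3
# under the original file name.
def consume_to_s3() -> None:
    s3 = boto3.client("s3")
    consumer = KafkaConsumer(
        "csv-updates",
        bootstrap_servers="localhost:9092",
        group_id="s3-writer",
        auto_offset_reset="earliest",
    )
    for message in consumer:
        s3.put_object(
            Bucket="my-bucket",
            Key=message.key.decode(),
            Body=message.value,
        )
```

Because the consumer runs continuously, any new file published by the first pipeline is written to S3 shortly after it arrives on the topic.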

 

Please let me know if this helps; otherwise, I will help you with a second approach to get past your issue.

 


@lakshmi_narayanan_t Depending on the configured read order, the Directory origin will automatically pick up new files as and when they arrive, as long as the pipeline is running continuously or runs on a regular schedule.
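For intuition, the behavior is roughly equivalent to a polling loop like the sketch below. This is an illustrative assumption, not the Directory origin's actual implementation; the directory path, bucket name, and poll interval are placeholders:

```python
import pathlib
import time

import boto3

def watch_and_upload(source_dir: str, bucket: str, interval: int = 30) -> None:
    s3 = boto3.client("s3")
    seen: set[str] = set()
    while True:
        # Sorting lexicographically mimics a file-name-based read order;
        # sorting by modification time would mimic a timestamp-based one.
        for csv_file in sorted(pathlib.Path(source_dir).glob("*.csv")):
            if csv_file.name not in seen:
                s3.upload_file(str(csv_file), bucket, csv_file.name)
                seen.add(csv_file.name)
        time.sleep(interval)
```

So as long as the pipeline keeps running (or is rescheduled regularly), newly arriving files are picked up and written to S3 without a manual rerun.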

