
We are ingesting data from Oracle into an S3 bucket, and from S3 into a Databricks Delta table, using StreamSets.

The S3-to-Delta pipeline runs every 15 minutes. We have observed that at times it does not pick up any records from the S3 bucket, even though records are present.

What might be causing this?

Is there any way to reset the origin at the scheduler level so that every run starts fresh, given that the jobs are scheduled every 15 minutes?

Hi @harshith,

If you are a customer, this is worth a support ticket since there are multiple things that can be going on.

Apart from that: yes, you can run a pipeline with the origin reset, but I'm not sure whether that is available as an option for scheduled jobs.

Are those files being picked up by later executions?

Anyway, check if the S3 origin is properly configured to pick up new files.


alex.sanchez thanks for your reply. To answer your questions:

No, the files are not being picked up by later executions.

Yes, the S3 origin is properly configured; it was picking up the records until a few days ago.


In the S3 origin, are you reading lexicographically or based on timestamp?
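The reason this matters: with lexicographic ordering, an origin typically remembers the last key it processed and only reads keys that sort after it, so a new file whose name sorts earlier is never seen. Here is a rough Python sketch of that idea. It is a simplified model for illustration only, not StreamSets' actual implementation, and the class, function names, and file names are all made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class S3Object:
    key: str            # object key in the bucket (hypothetical names below)
    last_modified: int  # epoch seconds, simplified


def pending_lexicographic(objects, last_key):
    """Objects whose key sorts strictly after the last processed key."""
    return sorted((o for o in objects if o.key > last_key), key=lambda o: o.key)


def pending_by_timestamp(objects, last_ts, last_key):
    """Objects modified after the last processed one (key breaks ties)."""
    return sorted(
        (o for o in objects if (o.last_modified, o.key) > (last_ts, last_key)),
        key=lambda o: (o.last_modified, o.key),
    )


# Already processed in an earlier run; its key and timestamp are the saved offset.
processed = S3Object("export/orders_2024-05-10.csv", 1000)
# Arrives later, but its key sorts BEFORE the saved offset.
late = S3Object("export/orders_2024-05-09_late.csv", 2000)
bucket = [processed, late]

lexicographic_result = pending_lexicographic(bucket, processed.key)
timestamp_result = pending_by_timestamp(bucket, processed.last_modified, processed.key)

print("lexicographic:", [o.key for o in lexicographic_result])
print("timestamp:    ", [o.key for o in timestamp_result])
```

In this model the late file is skipped under lexicographic ordering and picked up under timestamp ordering. If your Oracle-to-S3 pipeline writes file names that do not always increase, that could explain files being present in the bucket but never read.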


alex.sanchez how can I check that?


It is configured in the S3 origin stage itself; look for the Read Order property in the stage's configuration.


@harshith I moved your question to the correct category, "Got a Question?".

