StreamSets not picking up records from S3 bucket

  • 9 March 2022
  • 6 replies

We are ingesting data from Oracle into an S3 bucket, and from S3 into a Databricks Delta table, using StreamSets.

The S3-to-Delta pipeline runs every 15 minutes. We have observed that at times it does not pick up any records from the S3 bucket, even though records are present.

I just wanted to know what the issue might be.

Is there any way to reset the origin at the scheduler level, so that each run starts from scratch, given that the jobs are scheduled every 15 minutes?

6 replies

Userlevel 2

Hi @harshith,

If you are a customer, this is worth a support ticket since there are multiple things that can be going on.

Apart from that, yes, you can reset the origin before running a pipeline, but I am not sure whether that is available as an option for scheduled jobs.

Are those files being picked up by future executions?

Anyway, check if the S3 origin is properly configured to pick up new files.
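
As a rough sketch of what a scripted origin reset could look like, the snippet below only builds (and does not send) a request to a Data Collector REST endpoint. Everything specific here is an assumption to verify against your own installation: the endpoint path `/rest/v1/pipeline/{pipelineId}/resetOffset`, the `X-Requested-By` header, the host and port, and the pipeline ID `myPipelineId` (hypothetical). Authentication is omitted, and jobs run through Control Hub may handle offsets differently.

```python
import urllib.request


def build_reset_origin_request(base_url: str, pipeline_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that asks Data Collector
    to reset the origin offset of one pipeline.

    Assumption: the REST endpoint and header below match your Data
    Collector version; check its REST API documentation first.
    """
    url = f"{base_url.rstrip('/')}/rest/v1/pipeline/{pipeline_id}/resetOffset"
    return urllib.request.Request(
        url,
        data=b"",                           # empty body, POST only
        method="POST",
        headers={"X-Requested-By": "sdc"},  # assumed required for POST calls
    )


# Hypothetical host and pipeline ID, for illustration only.
req = build_reset_origin_request("http://localhost:18630", "myPipelineId")
print(req.get_method(), req.full_url)
# To actually send it (after adding credentials): urllib.request.urlopen(req)
```

Keep in mind that resetting the origin on every run makes the pipeline reread every object in the bucket, so the Delta table load would need to tolerate duplicates.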

alex.sanchez thanks for your reply. To answer your questions:

No, the files are not being picked up by future executions.

Yes, the S3 origin is properly configured; it was picking up records a few days ago.

Userlevel 2

In the S3 origin, are you reading lexicographically or based on timestamp?
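
The read order matters because of how the saved offset is compared against new objects. The following is a simplified model, not the origin's actual implementation: the `S3Object` class, the helper functions, and the object keys are all made up for illustration. It shows how an object that arrives later but whose key sorts before the last processed key is never picked up when reading lexicographically, while reading by timestamp still finds it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class S3Object:
    key: str
    last_modified: int  # illustrative epoch seconds


def pending_lexicographic(objects, last_key):
    """Objects whose key sorts strictly after the last processed key."""
    return sorted((o for o in objects if o.key > last_key), key=lambda o: o.key)


def pending_by_timestamp(objects, last_ts):
    """Objects modified strictly after the last processed timestamp."""
    return sorted(
        (o for o in objects if o.last_modified > last_ts),
        key=lambda o: (o.last_modified, o.key),
    )


already_read = S3Object("export/orders_20220308.csv", 1000)
late_file = S3Object("export/orders_20220301_rerun.csv", 2000)  # newer, but sorts earlier
new_file = S3Object("export/orders_20220309.csv", 2100)
bucket = [already_read, late_file, new_file]

lex = pending_lexicographic(bucket, already_read.key)
ts = pending_by_timestamp(bucket, already_read.last_modified)

print("lexicographic:", [o.key for o in lex])
print("timestamp:    ", [o.key for o in ts])
```

In this model the lexicographic read returns only `orders_20220309.csv`, and the rerun file is skipped on every later run as well. That would match files that are never picked up by future executions, if the upstream pipeline writes keys that are not strictly increasing.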

alex.sanchez how can I check that?


Userlevel 2

It can be configured in the stage: open the S3 origin's properties in the pipeline and look at its read order setting.

Userlevel 5

@harshith I moved your question to the correct category, "Got a Question?".