Question

How to read files from an S3 bucket using Groovy?

  • 19 June 2023
  • 7 replies
  • 141 views

Can someone provide working Groovy code to read files from an S3 bucket?


7 replies


@himanshu1234567 is there any reason you can’t use the S3 origin?

@Sanjeev the requirement is to use groovy_scripting only.


@himanshu1234567 could you share more details on the use case, so we can understand why custom code is needed? I’m advocating for the built-in origin because if you go the custom code route, you’ll need to write the code for the necessary processing, event generation, error record handling, offset tracking, etc. The S3 origin handles all of this automatically.
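
To give a sense of what that involves, here is a minimal Groovy sketch of just the offset tracking and error handling part. It is an illustration under assumptions, not a StreamSets API: `readNewObjects`, `listKeys`, and `readObject` are hypothetical names, and the S3 calls are passed in as closures so the logic can be tried with in-memory stand-ins. With the AWS SDK for Java v1 on the classpath, `listKeys` could wrap `AmazonS3.listObjectsV2(bucket, prefix)` and `readObject` could wrap `AmazonS3.getObjectAsString(bucket, key)`.

```groovy
// Hypothetical helper, not part of StreamSets or the AWS SDK.
// listKeys:   closure returning the object keys to consider
// readObject: closure returning the text content of one key
// lastKey:    the offset, i.e. the last key already handled (null on the first run)
def readNewObjects(Closure listKeys, Closure readObject, String lastKey) {
    // Process keys in sorted order and skip everything at or before the offset.
    def pending = listKeys().toSorted().findAll { lastKey == null || it > lastKey }
    def records = []
    def errors = []
    def offset = lastKey
    for (key in pending) {
        try {
            // One record per line of the object.
            readObject(key).eachLine { line -> records << [key: key, line: line] }
        } catch (Exception e) {
            // Error record handling is up to the script author.
            errors << [key: key, error: e.message]
        }
        // Assumption: the offset moves past failed keys so they are not retried forever.
        offset = key
    }
    return [records: records, errors: errors, offset: offset]
}

// In-memory stand-ins for the bucket listing and object reads.
def fake = ['b.csv': 'x\ny', 'a.csv': '1\n2', 'c.csv': 'z']
def result = readNewObjects({ fake.keySet() }, { key -> fake[key] }, 'a.csv')
println result.offset   // prints c.csv
```

A real script would also need to page through listings (a single `listObjectsV2` call returns at most 1,000 keys), persist the offset between batches, and generate events, which is the work the S3 origin already does.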

 

@Sanjeev can we also apply a SQL query in the S3 origin? The team wants to filter the data with a SQL query.


@himanshu1234567 

To query data stored in S3 with SQL, you will need to set up Athena in your AWS environment.

 

@Bikram OK, that’s why they are forcing us to use groovy_scripting as the origin.


@himanshu1234567 if the requirement is to query S3 data using SQL, then you can use Athena as Bikram suggested. I’m not quite clear on why you want to use Groovy to do that from within StreamSets.
