Question

Need to capture the file name in an Expression Evaluator after an S3 event

  • 5 December 2022

Hi All,

We are new to StreamSets.

In our use case we upload files to S3 with a predefined name format, for example abcd_sggshs_31-04-2018.csv.

We need to capture the parameters contained in the file name. We generate events after a file is uploaded to S3, and we are trying to use an Expression Evaluator to capture these parameters and write them to a JSON file in S3.

However, we are not getting the desired output.

Below are the expressions we are using. Please suggest the correct expression to get the file name.

 

 

 


8 replies

Hi,

Try ${record:id()}

 

 


@Priya151997 

 

Can you please try this:

 

${str:split(record:value('/fileInfo/filename'),'_')}

If that doesn’t help, please share the preview output of the origin so I can check the file naming format and prepare the logic to get the file name based on it.
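To make the intent of the split concrete, here is a minimal sketch in plain Python (not StreamSets EL) of breaking the sample file name into its parts. The helper name `split_file_name` is hypothetical, and the sketch assumes the parts are always separated by underscores and that the extension should be dropped first. The date part is left as text, since the sample value 31-04-2018 would not parse as a calendar date.

```python
def split_file_name(file_name, sep="_"):
    """Split a name like 'abcd_sggshs_31-04-2018.csv' into (parts, extension)."""
    # Separate the extension (text after the last dot), if there is one.
    stem, dot, ext = file_name.rpartition(".")
    if not dot:
        # No dot in the name: the whole name is the stem.
        stem, ext = file_name, ""
    # Split the remaining stem on the separator.
    return stem.split(sep), ext


parts, ext = split_file_name("abcd_sggshs_31-04-2018.csv")
print(parts)  # ['abcd', 'sggshs', '31-04-2018']
print(ext)    # csv
```

Note that splitting the full name on '_' without removing the extension first would leave '.csv' attached to the last part.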

 Hello @Bikram ,

I am working on a similar scenario. I need to capture the file info at the beginning of the pipeline (i.e., at the Amazon S3 origin), and at the end of the pipeline the processed file is written to an Amazon S3 destination under a different file name. So I am using the events at the Amazon S3 origin and the S3 destination to capture the file info.

Do you know how to refer to the Amazon S3 origin file info in the events generated at the Amazon S3 destination?


@hg508 

Can you please check if this helps?

 

 

@Bikram, I really appreciate your quick response.

Let’s say I have a source file File_101.txt in an S3 bucket, which is read through the S3 origin stage in the pipeline. The file is processed and written to an S3 bucket under a new name, ‘File_101_Processed.csv’, through the S3 destination stage.

How can I refer to the source file name ‘File_101.txt’ at the S3 destination event in the expression EXP_DESTN?

 

 

Hi hg508, not sure if you have an answer to this yet, but you could try an Expression Evaluator after your origin. (I’m using Directory; take a look at the record header in preview to see the name of the attribute that holds the file name for an S3 bucket.)

Then the fields could be:

 

This picks up the ‘filename’ attribute from the record header set by the origin.

It produces a new field in your record (taken from the origin’s record header).

 

@Russ Webb, actually, I want to refer to the S3 origin file name after the file has been successfully written to the S3 destination. The events produced (by enabling ‘Produce Events’ on the ‘General’ tab of the S3 destination) are passed to an Expression Evaluator (named ‘EXP_DESTN’ in my example above). Please note that EXP_DESTN is connected to the event output of the S3 destination.

So, the sdc.event.type at the S3 destination is “S3 Object Written”, and the event shows the new file info but has no reference to the S3 source file info.

I want to refer to the S3 source file info (‘File_101.txt’ in my example) in the Expression Evaluator EXP_DESTN linked to the events of the S3 destination.

Please let me know if there is a way to refer to the source file info at S3 destination -> events -> expression.

Hi hg508,

So, you can send your filename down the pipeline using:

 

from your expression evaluator.

See the new field I created above called Filename.

But I’m told that record data is not shared with the event lane (of ‘Amazon S3 2’ in my case). So my ‘Expression Evaluator 2’ cannot see that record data; it only sees event data.

I’m also told that a Jython or Groovy evaluator might be able to pick it up, but I’d have to research that.

Russ.
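
Since the destination event only carries the new file's info, one possible workaround is to derive the source name from the destination name. This only works if the destination name always follows a fixed convention, as in the File_101.txt -> File_101_Processed.csv example above. The sketch below is plain Python, not StreamSets code; the helper name, the `_Processed` suffix, and the `.txt` source extension are assumptions taken from that example.

```python
import posixpath


def source_name_from_destination(object_key, suffix="_Processed", source_ext=".txt"):
    """Recover the assumed source file name from a destination object key.

    Returns None when the key does not follow the assumed naming convention.
    """
    base = posixpath.basename(object_key)   # drop any folder-style prefix
    stem, _ext = posixpath.splitext(base)   # e.g. 'File_101_Processed', '.csv'
    if not stem.endswith(suffix):
        return None
    # Remove the suffix and attach the assumed source extension.
    return stem[: len(stem) - len(suffix)] + source_ext


print(source_name_from_destination("File_101_Processed.csv"))  # File_101.txt
```

If the source extension or suffix can vary between files, this reverse mapping is not reliable and the source name would have to be carried some other way.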

 
