Solved

How to pass Table Name as an attribute while producing events from Hadoop FS?

  • 19 January 2022
  • 4 replies
  • 85 views

My pipeline reads data from a JDBC Multitable Consumer origin and writes it to a Hadoop FS destination. I need to rename each output file at the destination to TableName_TimeStamp.

I can already generate the timestamp with an Expression Evaluator. How do I get the table name into the event that Hadoop FS produces, so that the HDFS File Metadata executor can use it?


Best answer by Abhishek Sarda 27 January 2022, 09:15




You might want to look at the record header attribute ‘jdbc.tables’, which holds the list of table names referenced in the JDBC stage query.

 

${record:attribute('jdbc.tables')} 

https://docs.streamsets.com/portal/datacollector/latest/help/datacollector/UserGuide/Origins/MultiTableJDBCConsumer.html#concept_ofs_p54_rkb

Thanks Pradeep. ${record:attribute('jdbc.tables')} holds the table name until the record reaches the Hadoop FS destination. However, it isn’t available in the events that Hadoop FS produces.


@Abhishek Sarda Since I don’t have the pipeline, I can only guess at what you could try. Could you rename the files in the ‘Hadoop FS’ destination itself? Try using the jdbc.tables record attribute and the timestamp in ‘Files Prefix’ or ‘Files Suffix’ as needed, so that the output files written by Hadoop FS carry the table name and timestamp. Since the filename is passed in the Hadoop FS event record, you could then update the file properties as needed in the HDFS File Metadata executor.

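Thanks Pradeep. I achieved it with the method below.

My file path was /<DBName>/<SchemaName>/<TableName>/, so the table name can be read from the /filepath field of the event record as path element 2: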

${file:pathElement(record:value('/filepath'),2)}
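To make the indexing concrete, here is a rough plain-Python imitation of what the expression above does. It is only a sketch and not StreamSets code: it assumes that path elements are counted from 0 starting at the left, that /filepath holds the full path of the written file, and the example path, the `path_element` and `renamed_file` helpers, and the timestamp format are all hypothetical.

```python
from datetime import datetime


def path_element(filepath: str, index: int) -> str:
    """Return one element of a slash-separated path.

    Assumption: elements are counted from 0 starting at the left,
    and a negative index counts from the right.
    """
    parts = [p for p in filepath.split("/") if p]  # drop empty segments
    return parts[index]


def renamed_file(filepath: str, when: datetime) -> str:
    """Build a TableName_TimeStamp name from an event's file path."""
    table = path_element(filepath, 2)        # /<DBName>/<SchemaName>/<TableName>/...
    stamp = when.strftime("%Y%m%d%H%M%S")    # hypothetical timestamp format
    return f"{table}_{stamp}"


# Hypothetical path following the /<DBName>/<SchemaName>/<TableName>/ layout.
example = "/salesdb/public/orders/sdc-0001"

print(path_element(example, 2))   # the <TableName> element
print(renamed_file(example, datetime(2022, 1, 27, 9, 15, 0)))
```

With this layout, element 0 is the database name, element 1 the schema name, and element 2 the table name, which is why index 2 is used in the expression.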
