
How to name output files with dynamic timestamp?

  • December 21, 2021

AkshayJadhav
StreamSets Employee

QUESTION:

Can I use dynamic timestamp to name the files when writing data with Local FS destination?

 

SOLUTION:

By design, SDC currently does not allow using a timestamp in the file names when writing data with the Local FS and Hadoop FS destinations.

 

There are two options for including a timestamp when writing the files:

1. Using datetime variables in Directory Template

You could use datetime variables, such as ${YYYY()} or ${DD()}, in Local FS destination --> Directory Template. The destination creates directories as needed, based on the smallest datetime variable that you use. For example, if the smallest variable is hours, then a directory is created for every hour of the day that receives output records.

/tmp/out/${YY()}-${MM()}-${DD()}-${hh()}

For more information, please see the Local FS destination documentation on directory templates.
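As an illustration, the Directory Template above uses hours as its smallest datetime variable, so the destination creates one directory per hour that receives output records (the /tmp/out prefix and the timestamps below are just examples):

```
Directory Template:  /tmp/out/${YY()}-${MM()}-${DD()}-${hh()}

Directories created as records arrive:
/tmp/out/21-12-21-13/    <- records written between 13:00 and 13:59
/tmp/out/21-12-21-14/    <- records written between 14:00 and 14:59
```

The file names inside each directory are still generated by SDC, but the timestamp is captured in the directory path.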

 

2. Using HDFS File Metadata executor to rename the files

You could also use the HDFS File Metadata executor in the pipeline along with the Local FS destination. This allows you to rename the files that are written by the Local FS destination.

Attached is a sample pipeline which uses event generation in the Local FS destination, an Expression Evaluator processor to generate the timestamp, and an HDFS File Metadata executor that renames the file each time it receives a file-closure event.
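As a rough sketch of the approach (the field name /newFileName, the "out-" prefix, and the format string below are illustrative assumptions, not taken from the attached pipeline), the Expression Evaluator can add a timestamped name to the file-closure event record, and the HDFS File Metadata executor can then use it in a rename task:

```
# Expression Evaluator -- add a field to the file-closure event record:
Output Field:  /newFileName
Expression:    out-${time:extractStringFromDate(time:now(), 'yyyyMMdd-HHmmss-SSS')}.json

# HDFS File Metadata executor -- rename the file that was just closed:
File Path:     ${record:value('/filepath')}
Task:          Change metadata on file -> Rename
New Name:      ${record:value('/newFileName')}
```

Including milliseconds (SSS) in the format string reduces the chance of two files closed in the same second receiving the same name.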

Please always make sure that you use a unique name for each file, so that you do not overwrite existing files and lose data.
