Splitting a fixed width array

  • January 31, 2022
  • 1 reply
  • 177 views

mblahay
Discovered Fame

I’ve been working with some fixed-width records recently, and one of the tasks was to split a string field into an array of equal-sized strings. This seemed impossible with the field splitter stage, because it requires a delimiter, so I ended up using the Jython evaluator to accomplish the task. Later on, I came across an article discussing a similar problem in the context of the Java regular expression engine. The article provided a solution using the \G boundary matcher combined with a lookbehind.

For example:  (?<=\G.{4})

When used in the field splitter stage, this regex splits the string into equal-length parts of 4 characters each. It acts as a zero-length delimiter: \G anchors at the end of the previous match (or at the start of the string for the first one), and the lookbehind requires exactly 4 characters between that anchor and the current position.
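To try the regex outside a pipeline, here is a minimal sketch using plain Java's String.split. The class name FixedWidthSplit, the helper splitFixed, and the sample values are made up for illustration, and the sketch assumes the field splitter stage treats the pattern the same way the standard Java regex engine does.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of StreamSets.
class FixedWidthSplit {
    // Zero-width "delimiter": matches each position that lies exactly
    // 4 characters after the end of the previous match (or after the
    // start of the input for the first match).
    static final String EVERY_FOUR = "(?<=\\G.{4})";

    // Splits the value into 4-character chunks using the standard Java regex engine.
    static String[] splitFixed(String value) {
        return value.split(EVERY_FOUR);
    }

    public static void main(String[] args) {
        // Length is a multiple of 4: three full chunks.
        System.out.println(Arrays.toString(splitFixed("ABCD1234WXYZ"))); // [ABCD, 1234, WXYZ]
        // Length is not a multiple of 4: the last chunk is shorter.
        System.out.println(Arrays.toString(splitFixed("ABCD1234WX")));   // [ABCD, 1234, WX]
    }
}
```

Note that when the field length is not a multiple of 4, the final element simply holds the leftover characters, so a downstream stage may want to validate the length of the last part.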

Shame on me for losing the URL for the original article. If I come across it again, I will be sure to cite it here.

1 reply

Drew Kreiger
Rock star
  • Senior Community Builder at StreamSets
  • 95 replies
  • February 7, 2022

Thanks @mblahay for sharing. Fingers crossed you find the article 🤞!