Replacing non-ASCII characters in a file

  • 25 August 2023


Source files in a document mapping often contain special characters that cause problems further down the line, notably in EDI mappings, where non-printable characters are not allowed.

 

Arc treats strings as UTF-8 when reading from files, but sometimes you need a bit of code to strip the non-printable characters from a string. The following script:

 

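<!-- Read the file at FilePath -->
<arc:set attr="file.file" value="[FilePath]" />
<arc:call op="fileRead" in="file" out="fileo">
  <!-- Remove every character outside the space-to-tilde range; the brackets of the character class are backslash-escaped so ArcScript does not evaluate them -->
  <arc:set attr="output.data" value="[fileo.file:data | regexreplace('\[^ -~\]','')]" />
</arc:call>
<!-- Push the cleaned data out under the original filename -->
<arc:set attr="output.filename" value="[filename]" />
<arc:push item="output" />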

 

This script can be used to replace any character outside the printable ASCII range (defined here as anything not between the space and tilde characters in the ASCII set) with an empty string. Tabs and line breaks also fall outside that range, so they are removed as well. The script replaces characters across a whole file, but you can also apply the regexreplace formatter to a single element (for example, in the expression editor of an XML Map).
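To see what the character class matches before running it against real files, the same pattern can be tried outside Arc. The sketch below uses Python's re module rather than ArcScript, so the brackets are not escaped; the function name and the sample string are made up for illustration only.

```python
import re

# Same character class as the ArcScript pattern, minus the bracket escaping
# that ArcScript needs: any character NOT between space (0x20) and tilde (0x7E).
NON_PRINTABLE_ASCII = re.compile(r"[^ -~]")


def strip_non_printable(text: str) -> str:
    """Remove every character outside the printable ASCII range."""
    return NON_PRINTABLE_ASCII.sub("", text)


# Hypothetical sample: accented letters, a non-breaking space, a tab
# and a trailing CRLF mixed into otherwise plain ASCII text.
sample = "NM1*Jos\u00e9\u00a0Garc\u00eda\t*01~\r\n"
print(strip_non_printable(sample))  # prints NM1*JosGarca*01~
```

In the sample, the tab and the trailing CRLF disappear along with the accented letters and the non-breaking space. If those control characters need to survive, the class would have to be widened; in Python that would look like `[^ -~\t\r\n]`, and whether the same escapes behave identically inside regexreplace is something to check in Arc itself.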

