universal-data-tool

Respecting formatting when importing text data

Open mroyce1 opened this issue 5 years ago • 1 comment

Hi,

I was wondering whether there is a way to preserve text formatting when importing text data. I'm currently working on an NER problem where it would be helpful to preserve the structure of the text. For example, '\n' is simply displayed as '\ n' in UDT, even though I'd like it to be interpreted as a line break. Is there a way to change that?

Thanks!

mroyce1 avatar Nov 19 '20 13:11 mroyce1

Yes, with wordSplitRegex! More info is on the format page.

I think to fix it, you'll want to set "wordSplitRegex" to [a-zA-ZÀ-ÿ\\]+
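
As a rough illustration of what that pattern matches, here is a small Python sketch. It is not taken from UDT's code and does not show how UDT applies wordSplitRegex internally; it only compares the suggested character class with a hypothetical letters-only pattern (an assumption for contrast, not necessarily UDT's default) on text that contains a literal backslash followed by "n".

```python
import re

# Pattern from the reply above: Latin letters (including accented ones)
# plus a literal backslash inside the character class.
WITH_BACKSLASH = r"[a-zA-ZÀ-ÿ\\]+"

# Hypothetical letters-only pattern, used here only for comparison.
LETTERS_ONLY = r"[a-zA-ZÀ-ÿ]+"

# Raw string: the text holds a backslash and the letter "n" as two
# separate characters, not an actual newline.
sample = r"first line\nsecond line"

# With the backslash in the class, "\n" stays inside a single token.
tokens = re.findall(WITH_BACKSLASH, sample)

# Without it, the backslash is dropped and the "n" is glued to the next word.
tokens_letters_only = re.findall(LETTERS_ONLY, sample)

print(tokens)
print(tokens_letters_only)
```

With the backslash included, the escape sequence is kept together with its surrounding letters instead of being split apart, which is presumably why the suggestion helps with the '\ n' display.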

Related to #374

seveibar avatar Nov 23 '20 13:11 seveibar