TextSnatcher

Invisible Unicode character at end of all text

Open matt-laird opened this issue 2 years ago • 3 comments

There seems to be an invisible U+000C (form feed) character at the end of all generated text. This causes problems in some applications when pasting the resulting text. See the example below, where the problem shows up on line 2: image

matt-laird avatar Aug 28 '23 15:08 matt-laird

I have also faced this issue for a long time. Is a simple string trim fine, or is there something in Tesseract that I should configure? Any ideas?

RajSolai avatar Aug 28 '23 15:08 RajSolai

I had a brief look, and it does seem to be an artifact of Tesseract's process. Maybe give the Tesseract FAQ a read and see if any of the different options help. Unfortunately, I can't test these myself right now.

matt-laird avatar Aug 29 '23 11:08 matt-laird
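
For anyone following up on the configuration route: the sketch below only builds a Tesseract command line, it does not run it. It assumes Tesseract 4 or later, where the `page_separator` config variable (form feed by default) can be overridden with `-c`. The helper name `build_tesseract_command` and the file name are hypothetical, and this has not been tested against TextSnatcher itself.

```python
from typing import List


def build_tesseract_command(image_path: str) -> List[str]:
    """Build a Tesseract CLI invocation that writes OCR text to stdout.

    Assumption: Tesseract >= 4, where `page_separator` defaults to a form
    feed (U+000C). Setting it to an empty value should suppress that
    trailing character.
    """
    return [
        "tesseract",
        image_path,
        "stdout",              # write recognized text to standard output
        "-c",
        "page_separator=",     # empty value: no page separator emitted
    ]


if __name__ == "__main__":
    # Hypothetical screenshot path, for illustration only.
    print(" ".join(build_tesseract_command("screenshot.png")))
```

If this works as expected, the form feed would never reach the clipboard, so no trimming would be needed afterwards.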

I think we can just trim the string for now. Thanks, now I also have the exact Unicode code point to find and remove.

RajSolai avatar Aug 29 '23 11:08 RajSolai
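
A minimal sketch of the trim approach discussed above, written in Python purely for illustration (it is not TextSnatcher's actual code, and `strip_trailing_form_feed` is a hypothetical name). It assumes the OCR output arrives as a single string and that only trailing form feeds and surrounding whitespace should go.

```python
FORM_FEED = "\u000c"  # the invisible character reported in this issue


def strip_trailing_form_feed(text: str) -> str:
    """Remove form feeds and whitespace from the end of OCR output.

    Only the tail of the string is touched; a form feed in the middle of
    the text is left alone.
    """
    return text.rstrip(FORM_FEED + " \t\r\n")


if __name__ == "__main__":
    raw = "first line\nsecond line\n\u000c"
    cleaned = strip_trailing_form_feed(raw)
    print(repr(cleaned))
```

Stripping only from the end keeps the change small; it also drops the final newline, which may or may not be wanted when pasting.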