python-ftfy

Wrong mojibake fix for: âš´

Open MicroJackson opened this issue 4 years ago • 1 comments

Hi, ftfy fixes this mojibake incorrectly.

U+00E2  â       [Ll] LATIN SMALL LETTER A WITH CIRCUMFLEX
U+0161  š       [Ll] LATIN SMALL LETTER S WITH CARON
U+00B4  ´       [Sk] ACUTE ACCENT
SHOULD BE:
U+00EB  ë       [Ll] LATIN SMALL LETTER E WITH DIAERESIS
But FTFY fixes to: 
U+26B4  ⚴       [So] PALLAS

Example:

"Officiâš´le gecreâš´erde patatten" becomes "Offici⚴le gecre⚴erde patatten":

    print(ftfy.fix_text("Officiâš´le gecreâš´erde patatten", uncurl_quotes=False))
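For reference, here is a minimal standard-library sketch of why the three characters can legitimately decode to U+26B4. It assumes the usual mojibake pattern (UTF-8 bytes misread as Windows-1252), which may not be how this text was actually damaged, and it is not ftfy's own implementation. The variable names are just for illustration.

```python
import unicodedata

# The three characters from the report: U+00E2, U+0161, U+00B4.
mojibake = "\u00e2\u0161\u00b4"

# Assumption: the text was UTF-8 that got decoded as Windows-1252.
# Reversing that step recovers the original bytes.
raw = mojibake.encode("windows-1252")
print(" ".join(f"{b:02x}" for b in raw))  # e2 9a b4

# Those three bytes form one valid UTF-8 sequence.
decoded = raw.decode("utf-8")
print(f"U+{ord(decoded):04X} {unicodedata.name(decoded)}")  # U+26B4 PALLAS

# For comparison, the expected character is only two bytes in UTF-8.
expected = "\u00eb".encode("utf-8")
print(" ".join(f"{b:02x}" for b in expected))  # c3 ab
```

So under that assumption the bytes are E2 9A B4, which is a valid encoding of PALLAS and shares nothing with C3 AB, the UTF-8 encoding of ë. That suggests the text went through some other kind of corruption than a single Windows-1252 misdecoding.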

Is it possible to fix this?

MicroJackson, Mar 12 '20 12:03

What in the world.

This is definitely one to leave open and try to understand. Thanks.

rspeer, Mar 12 '20 16:03