node-diacritics

Use Unicode normalization?

Open pnorman opened this issue 8 years ago • 0 comments

Unicode defines normalization forms (NFC, NFD, NFKC, NFKD) for strings and assigns every character a general category.

It might work to normalize strings to NFKD, which splits accented characters into a base character plus combining marks, and then remove any characters of general category Mn (Nonspacing_Mark) (see table 12).

It might be necessary to specially handle conversions like ß to ss, since ß has no Unicode decomposition and NFKD leaves it unchanged.
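A rough sketch of what this could look like in Node, to make the idea concrete. The names `removeDiacritics` and `SPECIAL_CASES` are made up for illustration and are not part of the node-diacritics API, and the special-case map is only a small sample, not a complete list. It assumes a runtime that supports `String.prototype.normalize` and Unicode property escapes in regular expressions (`\p{Mn}` with the `u` flag).

```javascript
// Hypothetical helper, not part of the node-diacritics API.
// Characters that have no Unicode decomposition, so NFKD leaves them
// untouched. Illustrative sample only, not an exhaustive list.
const SPECIAL_CASES = { "ß": "ss", "ø": "o", "Ø": "O" };

function removeDiacritics(input) {
  return input
    // Decompose: "é" becomes "e" + U+0301, "ﬁ" (ligature) becomes "fi".
    .normalize("NFKD")
    // Drop every character in general category Mn (Nonspacing_Mark).
    .replace(/\p{Mn}/gu, "")
    // Handle the characters that normalization cannot decompose.
    .replace(/[ßøØ]/g, (ch) => SPECIAL_CASES[ch]);
}

console.log(removeDiacritics("Crème brûlée")); // Creme brulee
console.log(removeDiacritics("Straße"));       // Strasse
```

Note that NFKD also folds compatibility characters (ligatures, full-width forms, superscripts), which may or may not be wanted here; NFD would limit the change to canonical decompositions.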

See also this Python Stack Overflow answer.

pnorman, Jul 01 '16 05:07