
List of diacritics

Open matteocontrini opened this issue 9 years ago • 4 comments

Do you have a list of the diacritics that get converted?

In Italy, the fiscal code ("codice fiscale") has recently changed so that all diacritics are converted to ASCII characters. This table has been provided for the conversion.

How can I tell whether those characters are actually supported by your module, given that the source code contains only a list of Unicode code points?

Thanks

matteocontrini avatar Nov 22 '15 15:11 matteocontrini

Do you need programmatic access to the list of diacritics or just want to evaluate the module?

andrewrk avatar Nov 22 '15 20:11 andrewrk

I'll try creating a test. I wanted to know whether the module correctly handles all those cases, and that's not easy to tell since there's no list of supported diacritics. But that's fine, I'll try parsing that PDF. I'll let you know.
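A test of this kind could be sketched roughly as follows. This is only an illustration: `stripMarks` is a hypothetical stand-in (Unicode NFD decomposition followed by removal of combining marks), not the module's own function, and `findMismatches` and the sample rows are assumptions made for the example. The `È` row is an assumed illustrative entry, while the other rows come from cases reported later in this thread.

```javascript
// Hypothetical stand-in for the function under test: decompose to NFD and
// drop combining marks (U+0300..U+036F). Swap in the real function to test it.
function stripMarks(text) {
  return text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Sample rows of an expected table (input -> expected uppercase ASCII).
// 'È' is an assumed illustrative row; the others are cases named in this thread.
const expectedTable = new Map([
  ['È', 'E'],
  ['Ä', 'AE'],
  ['Å', 'AA'],
  ['Ö', 'OE'],
  ['Ü', 'UE'],
]);

// Run every table row through `convert` and collect the rows that differ.
// Results are uppercased before comparing, because the table is uppercase.
function findMismatches(convert, table) {
  const mismatches = [];
  for (const [input, want] of table) {
    const got = convert(input).toUpperCase();
    if (got !== want) {
      mismatches.push({ input, got, want });
    }
  }
  return mismatches;
}

for (const { input, got, want } of findMismatches(stripMarks, expectedTable)) {
  console.log(`${input} gets converted to ${got}, document says ${want}`);
}
```

Passing the conversion function in as a parameter keeps the comparison independent of any particular library, so the same table can be checked against several implementations.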

matteocontrini avatar Nov 22 '15 20:11 matteocontrini

Ok, first of all, congratulations, because the module was able to convert almost every character in that table. But there are some that differ:

Ä gets converted to  A, document says AE
ä gets converted to  A, document says AE
Å gets converted to  A, document says AA
å gets converted to  A, document says AA
Ð gets converted to  DH, document says D
IJ gets converted to  IJ, document says IJ <-- 
ij gets converted to  IJ, document says IJ <-- these 2 are not converted
Ö gets converted to  O, document says OE
ö gets converted to  O, document says OE
Ø gets converted to  O, document says OE
ø gets converted to  O, document says OE
Ü gets converted to  U, document says UE
ü gets converted to  U, document says UE

Note that I uppercased the results because that's what the table gives me. The code.

I don't know which variant in the test results is the right one. I can tell you that the PDF table linked above is almost the same as the one from here, which talks about some ISO standards.
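If the table's variants are the ones needed, one possible approach is to apply table-specific overrides first and fall back to generic stripping for everything else. The sketch below is an assumption-laden illustration: `toTableAscii` and `fallbackStrip` are hypothetical names, and the fallback uses NFD decomposition rather than the module itself. The override pairs are the differences listed above.

```javascript
// Overrides copied from the differences reported in this thread.
// Only the characters listed there are covered.
const overridePairs = [
  ['Ä', 'AE'], ['ä', 'AE'],
  ['Å', 'AA'], ['å', 'AA'],
  ['Ð', 'D'],
  ['Ö', 'OE'], ['ö', 'OE'],
  ['Ø', 'OE'], ['ø', 'OE'],
  ['Ü', 'UE'], ['ü', 'UE'],
];

// Normalize keys to NFC so lookups match precomposed input characters.
const overrides = new Map(
  overridePairs.map(([from, to]) => [from.normalize('NFC'), to])
);

// Hypothetical generic fallback: decompose and drop combining marks.
function fallbackStrip(ch) {
  return ch.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Convert text character by character: override first, fallback otherwise,
// then uppercase the result to match the table's convention.
function toTableAscii(text) {
  let out = '';
  for (const ch of text.normalize('NFC')) {
    out += overrides.has(ch) ? overrides.get(ch) : fallbackStrip(ch);
  }
  return out.toUpperCase();
}

console.log(toTableAscii('Müller')); // → 'MUELLER'
console.log(toTableAscii('Søren'));  // → 'SOEREN'
```

Keeping the overrides in a separate map means the generic conversion stays untouched and the table-specific rules can be swapped for a different standard.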

matteocontrini avatar Nov 22 '15 22:11 matteocontrini

I think your diacritics 'translations' are based on the Italian sound, while this implementation deals with visual appearance.
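The visual view can be illustrated with Unicode canonical decomposition, where a letter such as Ä is a base letter plus a combining mark. This is a sketch of that general idea, not a description of how the module is implemented; `decompose` is a hypothetical helper.

```javascript
// List the code points of a character after canonical decomposition (NFD).
function decompose(ch) {
  return Array.from(ch.normalize('NFD'), (c) =>
    'U+' + c.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

console.log(decompose('Ä')); // → [ 'U+0041', 'U+0308' ]  (A + combining diaeresis)
console.log(decompose('Ø')); // → [ 'U+00D8' ]            (no canonical decomposition)
```

Dropping the combining mark from the first result leaves a plain A, which is the visual answer. A sound-based or standard-based mapping such as AE has to come from a separate table, and letters like Ø need a table entry either way.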

homersimpsons avatar Jul 01 '16 05:07 homersimpsons