unicode-security-guide
Normalization: Identify Ruby's default Normalization form
See: http://websec.github.io/unicode-security-guide/character-transformations/#normalization
Identify which normalization form, if any, Ruby applies when handling Unicode strings. Is this documented? If so, skip the following tests.
If not documented, test major versions to identify:
- normalization behavior - what normalization form do the core Encoding APIs use by default?
One way to test this might be to use a few specific code points which have known transformations in certain normalization forms. These include (from http://www.unicode.org/reports/tr15/):
- U+212B in NFC becomes U+00C5
- U+212B in NFD becomes U+0041 U+030A
- The sequence U+1E9B U+0323 in NFKC becomes U+1E69
- The sequence U+1E9B U+0323 in NFKD becomes U+0073 U+0323 U+0307
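The check described above could be sketched roughly as follows. This is a minimal sketch, not a definitive test suite: it assumes Ruby 2.2 or later, where `String#unicode_normalize` is available. The constant `NORMALIZATION_VECTORS` and the helpers `code_points` and `implicit_forms` are hypothetical names introduced only for this illustration. It runs the four TR15 vectors through each explicit form, then compares the no-argument call against all four forms to see which one it matches.

```ruby
# Test vectors taken from UAX #15 (http://www.unicode.org/reports/tr15/).
# NORMALIZATION_VECTORS is a hypothetical name used only in this sketch.
NORMALIZATION_VECTORS = [
  { input: "\u212B",       form: :nfc,  expected: "\u00C5" },
  { input: "\u212B",       form: :nfd,  expected: "\u0041\u030A" },
  { input: "\u1E9B\u0323", form: :nfkc, expected: "\u1E69" },
  { input: "\u1E9B\u0323", form: :nfkd, expected: "\u0073\u0323\u0307" }
].freeze

# Hypothetical helper: render a string as "U+XXXX U+XXXX" for readable output.
def code_points(str)
  str.codepoints.map { |cp| format("U+%04X", cp) }.join(" ")
end

# Hypothetical helper: list the forms whose result for `input` equals the
# result of calling String#unicode_normalize with no argument.
def implicit_forms(input)
  default = input.unicode_normalize
  %i[nfc nfd nfkc nfkd].select { |form| input.unicode_normalize(form) == default }
end

# Run each vector through the explicitly requested form and report the result.
NORMALIZATION_VECTORS.each do |v|
  actual = v[:input].unicode_normalize(v[:form])
  status = actual == v[:expected] ? "ok" : "MISMATCH"
  puts "#{v[:form].to_s.upcase}: #{code_points(v[:input])} -> #{code_points(actual)} (#{status})"
end

# U+1E9B U+0323 yields a different result in each of the four forms, so it
# can distinguish which form the no-argument call uses.
probe = "\u1E9B\u0323"
puts "Forms matching the no-argument call: #{implicit_forms(probe).inspect}"

# Check whether a string literal is normalized implicitly on creation/comparison.
angstrom = "\u212B"
puts "Literal equal to its NFC form without normalizing? #{angstrom == "\u00C5"}"
```

Running this under each major Ruby version of interest would show whether the behavior differs between releases. The sequence U+1E9B U+0323 is used as the probe because U+212B alone cannot separate NFC from NFKC (or NFD from NFKD).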