svelte-jsoneditor
Unicode and invisible characters
When working with Unicode (that is, with almost any text), there are many things to keep in mind. Here are a few of them:
- Invisible characters
  Invisible characters can behave differently across devices, browsers, and fonts. They are usually invisible, but they are still part of the string and may take up space.
"឴" != "";
"_឴_" != "__";
This is how VS Code highlights them:
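For the editor itself, here is a rough sketch of how such characters could be found and made visible in plain JavaScript. The code point list is a hypothetical, non-exhaustive selection (for example U+200B zero width space and U+17B4 Khmer vowel inherent AQ), and the function names are made up for illustration; none of this is existing svelte-jsoneditor API.

```javascript
// Hypothetical, non-exhaustive list of code points that usually render as nothing.
const INVISIBLE_CODE_POINTS = new Set([
  0x00ad, // soft hyphen
  0x17b4, // Khmer vowel inherent AQ
  0x17b5, // Khmer vowel inherent AA
  0x200b, // zero width space
  0x200c, // zero width non-joiner
  0x200d, // zero width joiner
  0x2060, // word joiner
  0xfeff, // zero width no-break space (byte order mark)
]);

// Format a code point as "U+XXXX".
function formatCodePoint(codePoint) {
  return "U+" + codePoint.toString(16).toUpperCase().padStart(4, "0");
}

// Return the UTF-16 index and label of every listed invisible character.
function findInvisibleCharacters(text) {
  const found = [];
  let index = 0;
  for (const char of text) {
    const codePoint = char.codePointAt(0);
    if (INVISIBLE_CODE_POINTS.has(codePoint)) {
      found.push({ index, label: formatCodePoint(codePoint) });
    }
    index += char.length; // advance by UTF-16 code units
  }
  return found;
}

// Replace every listed invisible character with a visible marker such as "<U+200B>".
function revealInvisibleCharacters(text) {
  let result = "";
  for (const char of text) {
    const codePoint = char.codePointAt(0);
    result += INVISIBLE_CODE_POINTS.has(codePoint)
      ? `<${formatCodePoint(codePoint)}>`
      : char;
  }
  return result;
}

console.log(findInvisibleCharacters("_\u200b_")); // one entry: index 1, label "U+200B"
console.log(revealInvisibleCharacters("_\u17b4_")); // "_<U+17B4>_"
```

An explicit list is used here instead of a Unicode category match because which characters count as "invisible" is a judgment call; the list would need to be tuned for real use.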
- Combining characters and cursed strings
  How combining characters are displayed depends on many factors. They often render strangely and can break the interface and styles.
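As an illustration only, here is a minimal sketch of how long runs of combining marks could be detected and capped. It assumes a runtime that supports Unicode property escapes (`\p{M}` with the `u` flag); the function names and the default limit of 2 are hypothetical.

```javascript
// Count combining marks (Unicode general category M) in a string.
function countCombiningMarks(text) {
  const marks = text.match(/\p{M}/gu);
  return marks === null ? 0 : marks.length;
}

// Keep at most `maxMarks` consecutive combining marks and drop the rest of each run.
function limitCombiningMarks(text, maxMarks = 2) {
  const longRun = new RegExp(`(\\p{M}{${maxMarks}})\\p{M}+`, "gu");
  return text.replace(longRun, "$1");
}

const cursed = "e\u0301\u0302\u0303\u0304x"; // "e" followed by four stacked accents
console.log(countCombiningMarks(cursed)); // 4
console.log(limitCombiningMarks(cursed) === "e\u0301\u0302x"); // true
```

Capping marks changes the data, and some scripts legitimately stack several marks, so something like this is probably better suited to a display-only layer than to the stored value.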
This is how combining characters are currently displayed in the editor:
This is how they are displayed in VS Code:
- Surrogate pairs and normalization
  https://en.wikipedia.org/wiki/Unicode_equivalence
  https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/normalize
"_._".normalize(); // "_._"
"_._".normalize("NFC"); // "_._"
"_._".normalize("NFD"); // "_._"
"_._".normalize("NFKC"); // "_._"
"_._".normalize("NFKD"); // "_._"

const name1 = "\u0041\u006d\u00e9\u006c\u0069\u0065";
const name2 = "\u0041\u006d\u0065\u0301\u006c\u0069\u0065";
name1 != name2; // "Amélie" != "Amélie"
name1.length != name2.length; // 6 != 7

const name1NFC = name1.normalize("NFC");
const name2NFC = name2.normalize("NFC");
name1NFC == name2NFC; // "Amélie" == "Amélie"
name1NFC.length == name2NFC.length; // 6 == 6
Before and after formatting:
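On the surrogate pair side, here is a small sketch of what can go wrong: a character outside the Basic Multilingual Plane takes two UTF-16 code units, so `length` and `slice` can disagree with what the user sees. The helper names are hypothetical, and the lone-surrogate check assumes a runtime with regex lookbehind support.

```javascript
// Count code points instead of UTF-16 code units.
function countCodePoints(text) {
  return [...text].length;
}

// Detect a surrogate code unit that is not part of a valid high+low pair.
function hasLoneSurrogate(text) {
  return /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/.test(text);
}

const emoji = "\u{1F600}"; // one code point, stored as two UTF-16 code units
console.log(emoji.length); // 2
console.log(countCodePoints(emoji)); // 1
console.log(emoji.charCodeAt(0).toString(16)); // "d83d" (high surrogate)
console.log(emoji.charCodeAt(1).toString(16)); // "de00" (low surrogate)
console.log(hasLoneSurrogate(emoji)); // false
console.log(hasLoneSurrogate(emoji.slice(0, 1))); // true: slicing split the pair
```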
The editor currently seems to handle surrogate pairs and normalization fine. I suggest:
- highlight invisible characters
- automatically normalize and decode all strings when pasting or formatting
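To make the second suggestion more concrete, here is a minimal sketch of normalizing every string in a parsed JSON value. It assumes NFC as the target form and that object keys should be normalized too; `normalizeJsonStrings` is a hypothetical helper, not part of svelte-jsoneditor. Note that `JSON.parse` already decodes `\uXXXX` escapes, so only normalization is done by hand here.

```javascript
// Recursively normalize all strings (values and object keys) in a parsed JSON value.
function normalizeJsonStrings(value, form = "NFC") {
  if (typeof value === "string") {
    return value.normalize(form);
  }
  if (Array.isArray(value)) {
    return value.map((item) => normalizeJsonStrings(item, form));
  }
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [
        key.normalize(form),
        normalizeJsonStrings(item, form),
      ])
    );
  }
  return value; // numbers, booleans, and null are returned unchanged
}

// Pasted text with a decomposed "é" written as an escape sequence.
const pasted = '{"Ame\\u0301lie": ["Ame\\u0301lie", 1, null]}';
const normalized = normalizeJsonStrings(JSON.parse(pasted));
console.log(Object.keys(normalized)[0].length); // 6
```

One caveat with this approach: two keys that differ only in normalization form would collapse into one after normalizing, so an editor would probably want to warn instead of silently merging them.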