deno_lint
Rule suggestion: Warn for Unicode bidirectional overrides
Unicode has a set of bidirectional (bidi) control characters. They are meant to help display words written in one direction correctly inside text that runs in the opposite direction (e.g. Hebrew words in an otherwise English text, since Hebrew is RTL and English is LTR). As a result, bidi-aware tools display text in an order that doesn't match the order of the underlying bytes.
Bidi control characters can, however, be abused to manipulate how source code is displayed in bidi-aware editors and code review tools, so that the code a reviewer sees differs from the code that is actually compiled. For example, if the Unicode escapes in the following code snippet were replaced by the actual Unicode characters:
if (access_level !== "user\u202E \u2066// Check if admin\u2069 \u2066") {
  grant_access();
}
it would be rendered by bidi-aware tools as:
if (access_level !== "user") { // Check if admin
  grant_access();
}
There should be a rule that warns whenever one of these characters appears in a comment, string literal, template literal, or regular expression literal, and suggests replacing it with the corresponding Unicode escape.
The code points in question are U+202A, U+202B, U+202C, U+202D, U+202E, U+2066, U+2067, U+2068, U+2069.
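To make the proposal concrete, here is a minimal sketch of the check in TypeScript. It is not deno_lint code: the names `isBidiControl`, `findBidiControls`, `escapeBidiControls`, and `BidiFinding` are hypothetical, and it scans a raw string rather than AST nodes. A real rule would presumably run this only over the text of comments and the literal kinds listed above.

```typescript
// Hypothetical shape of a single diagnostic; not part of any deno_lint API.
interface BidiFinding {
  index: number; // UTF-16 offset of the character in the scanned text
  codePoint: string; // e.g. "U+202E"
  escape: string; // suggested replacement, e.g. "\\u202E"
}

// True for the nine code points listed in the issue:
// U+202A..U+202E and U+2066..U+2069.
function isBidiControl(code: number): boolean {
  return (code >= 0x202a && code <= 0x202e) ||
    (code >= 0x2066 && code <= 0x2069);
}

// Collects every bidi control character in `text` together with the
// escape sequence that could replace it.
function findBidiControls(text: string): BidiFinding[] {
  const findings: BidiFinding[] = [];
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (isBidiControl(code)) {
      // All nine code points are four hex digits, so no padding is needed.
      const hex = code.toString(16).toUpperCase();
      findings.push({ index: i, codePoint: "U+" + hex, escape: "\\u" + hex });
    }
  }
  return findings;
}

// Returns `text` with every bidi control character replaced by its escape.
function escapeBidiControls(text: string): string {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    out += isBidiControl(code)
      ? "\\u" + code.toString(16).toUpperCase()
      : text.charAt(i);
  }
  return out;
}
```

Run against the first line of the snippet above (with the real characters in place of the escapes), `findBidiControls` would report four findings: one U+202E, two U+2066, and one U+2069.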
Ref: https://blog.rust-lang.org/2021/11/01/cve-2021-42574.html
This lint as proposed would make life significantly harder for people who legitimately use bidirectional text, so in the long run it would be better to try to detect bidi logical blocks instead, although that would be much more complex.
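One possible reading of "detecting bidi logical blocks" is to accept a literal or comment only if every embedding, override, or isolate it opens is also closed inside it. The sketch below is an assumption about what such a check could look like, not an agreed design: `hasBalancedBidiControls` is a hypothetical name, and the strict-nesting rule is a simplification of the Unicode Bidirectional Algorithm, which tolerates unmatched terminators. A balanced result also would not, on its own, prove that the text renders as expected.

```typescript
// Checks that bidi controls in `text` are strictly nested and all closed.
// Openers U+202A, U+202B, U+202D, U+202E are closed by U+202C (PDF);
// openers U+2066, U+2067, U+2068 are closed by U+2069 (PDI).
function hasBalancedBidiControls(text: string): boolean {
  // Stack of expected terminators for the currently open blocks.
  const expected: number[] = [];
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (
      code === 0x202a || code === 0x202b ||
      code === 0x202d || code === 0x202e
    ) {
      expected.push(0x202c); // embedding or override: expect PDF
    } else if (code >= 0x2066 && code <= 0x2068) {
      expected.push(0x2069); // isolate: expect PDI
    } else if (code === 0x202c || code === 0x2069) {
      // A terminator must match the innermost open block.
      if (expected.pop() !== code) return false;
    }
  }
  // Anything still open leaks past the end of the literal or comment.
  return expected.length === 0;
}
```

Under this check, the string literal from the example above is rejected, because its U+202E override and its final U+2066 isolate are never closed, while a right-to-left phrase wrapped in U+2067 and U+2069 passes.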