audit and protect Qubes source code from malicious unicode - CVE-2021-42574

Open adrelanos opened this issue 2 years ago • 0 comments

Quote from https://trojansource.codes/:

The trick is to use Unicode control characters to reorder tokens in source code at the encoding level. These visually reordered tokens can be used to display logic that, while semantically correct, diverges from the logic presented by the logical ordering of source code tokens. Compilers and interpreters adhere to the logical ordering of source code, not the visual order.
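The gap between visual and logical order can be made visible by printing a line with its non-ASCII bytes escaped. The following is a minimal sketch, assuming a POSIX `sed` run in the C locale; the helper name `show_logical` and the sample line are made up for illustration and do not come from any Qubes code.

```shell
# show_logical: print stdin with every non-ASCII byte as an octal escape,
# so invisible bidi controls show up at their logical (stored) position.
# Hypothetical helper name, for illustration only.
show_logical() {
    LC_ALL=C sed -n l
}

# Illustrative input: a string literal that hides U+202E RIGHT-TO-LEFT
# OVERRIDE (UTF-8 bytes \342\200\256) between "user" and "admin".
printf 'role = "user\342\200\256admin"\n' | show_logical
```

In the output the control character is no longer invisible: it appears as `\342\200\256` exactly where the compiler or interpreter would read it, and the `l` command marks the end of the line with `$`.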

tasks:

  • [ ] check for potential existing compromises: scan all Qubes source code for existing unicode
  • [ ] educate existing and future Qubes source code reviewers: add a Qubes source code reviewer policy to a GitHub repository or to the Qubes website, which existing and future reviewers need to acknowledge to confirm that they understand the issue. More of a reminder, a conversation starter.
  • [ ] remove as much unicode from Qubes source code as possible: reducing the amount of unicode in Qubes source code makes audits for malicious unicode with automated tools simpler. Where unicode is considered essential, it should if possible be written in an ASCII-encoded form, for example &reg; instead of a literal ®.
  • [ ] local check by reviewer: document tools that Qubes source code reviewers could/should use to scan future contributions for malicious unicode
  • [ ] remote cursory check: add a GitHub pull request hook that notifies when unicode is included in a pull request. (This is just an additional, handy layer of protection. Since infrastructure should be distrusted, this alone is not a full solution.)
  • [ ] build scripts / CI scripts: should check if there is unicode in any files except in opt-in expected files. If there is unexpected unicode, the build should error out.
  • [ ] scan upstream projects source code: check if these are compromised by malicious unicode
  • [ ] notify upstream projects: these might not be aware of this issue and might already be compromised by malicious unicode.
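
The build scripts / CI task above could look roughly like the following. This is only a sketch under stated assumptions: GNU grep built with `-P` support, a hypothetical function name `check_unicode`, and a hypothetical allowlist format (one path per line, relative to the scanned directory). None of this is existing Qubes tooling.

```shell
# check_unicode DIR ALLOWLIST
# Hypothetical CI helper: returns non-zero if a file under DIR contains a
# non-ASCII byte and is not named in ALLOWLIST (assumed format: one path
# per line, relative to DIR).
check_unicode() {
    [ -d "$1" ] || return 2
    allowed=$(cat "$2") || return 2
    # List files with non-ASCII bytes, then drop the opted-in ones.
    # "|| true": an empty result makes the last grep exit 1, which is fine.
    unexpected=$(
        cd "$1" &&
            LC_ALL=C grep -r -l -P --exclude-dir=.git \
                --binary-files=without-match '[^\x00-\x7F]' . |
            sed 's|^\./||' |
            LC_ALL=C grep -v -x -F -e "$allowed"
    ) || true
    if [ -n "$unexpected" ]; then
        printf 'unexpected non-ASCII content in:\n%s\n' "$unexpected" >&2
        return 1
    fi
}
```

A build script might call it as `check_unicode . unicode-allowlist.txt || exit 1`, where `unicode-allowlist.txt` is a made-up file name. Matching whole paths with `-x -F` keeps the opt-in list exact, so allowing one file does not silently allow others with similar names.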

references:

  • https://tech.michaelaltfield.net/2021/11/22/bidi-unicode-github-defense/
  • https://forums.whonix.org/t/detecting-malicious-unicode-in-github-prs/13754

example of how to check:

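# Run from the root of the source tree: with --recursive and no path, GNU grep searches the current directory.
grep_args="--exclude=changelog.upstream --exclude-dir=.git --binary-files=without-match --recursive --color=auto -P -n"
# $grep_args is left unquoted on purpose so the shell splits it into separate options.
# Either command prints every line that contains a byte outside the ASCII range.
LC_ALL=C grep $grep_args '[^\x00-\x7F]'
LC_ALL=C grep $grep_args "[^[:ascii:]]"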

A few other tools might be desirable.
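
One such tool could be a narrower scan. The grep commands above flag every non-ASCII byte, while the sketch below looks only for the bidirectional control characters U+202A to U+202E and U+2066 to U+2069. It assumes GNU grep; the function name `scan_bidi` is hypothetical, and the octal escapes are the UTF-8 encodings of those nine code points.

```shell
# scan_bidi DIR: print file:line:content for every line under DIR that
# contains a Unicode bidi control character. Hypothetical helper.
scan_bidi() {
    # UTF-8 byte sequences as octal escapes, one fixed string per line:
    # U+202A..U+202E (embeddings and overrides), U+2066..U+2069 (isolates).
    bidi=$(printf '\342\200\252\n\342\200\253\n\342\200\254\n\342\200\255\n\342\200\256\n\342\201\246\n\342\201\247\n\342\201\250\n\342\201\251')
    # LC_ALL=C makes grep compare raw bytes; -F treats each line of $bidi
    # as a literal pattern. Exit status 0 means at least one match.
    LC_ALL=C grep -r -n -F --exclude-dir=.git \
        --binary-files=without-match -e "$bidi" -- "$1"
}
```

Calling `scan_bidi .` from the root of a source tree exits with status 0 and lists the offending lines if any of these characters is present, and exits with status 1 otherwise. It does not replace the broader non-ASCII check, since it covers only these nine code points.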

adrelanos, Jun 10 '22 11:06