mocha
🐛 Bug: xunit reporter does not strip ansi escape sequences, leading to invalid XML
Prerequisites
- [x] Checked that your issue hasn't already been filed by cross-referencing issues with the faq label
- [x] Checked next-gen ES issues and syntax problems by using the same environment and/or transpiler configuration without Mocha to ensure it isn't just a feature that actually isn't supported in the environment in question or a bug in your code.
- [x] 'Smoke tested' the code to be tested by running it outside the real test suite to get a better sense of whether the problem is in the code under test, your usage of Mocha, or Mocha itself
- [x] Ensured that there is no discrepancy between the locally and globally installed versions of Mocha. You can find them with: node node_modules/.bin/mocha --version (Local) and mocha --version (Global). We recommend that you not install Mocha globally.
Description
When the code under test throws exceptions whose messages are formatted with chalk, or otherwise contain ANSI escape codes, the resulting XML contains the character reference &#x1B;, which refers to a character that is not allowed in XML.
Steps to Reproduce
Prepare minimal test file
const assert = require('assert');
const chalk = require('chalk');
it('Test contains ANSI escape sequences', () => {
  assert.ok(false, chalk.red('this is not ok'));
});
Run tests using mocha and xunit reporter
$ /code/.npm-global/bin/mocha -R xunit -O output=xunit.xml
Exception: /code/.npm-global/bin/mocha exited with 1
[tty 12], line 1: /code/.npm-global/bin/mocha -R xunit -O output=xunit.xml
Test fails as expected and creates xunit.xml
<testsuite name="Mocha Tests" tests="1" failures="0" errors="1" skipped="0" timestamp="Wed, 02 Dec 2020 17:59:42 GMT" time="0.003">
<testcase classname="" name="Test contains ANSI escape sequences" time="0"><failure>&#x1B;[31mthis is not ok&#x1B;[39m
+ expected - actual
-false
+true
AssertionError [ERR_ASSERTION]: &#x1B;[31mthis is not ok&#x1B;[39m
at Context.it (test.js:5:9)
at callFn (/code/mocha/lib/runnable.js:366:21)
at Test.Runnable.run (/code/mocha/lib/runnable.js:354:5)
at Runner.runTest (/code/mocha/lib/runner.js:677:10)
at /code/mocha/lib/runner.js:801:12
at next (/code/mocha/lib/runner.js:594:14)
at /code/mocha/lib/runner.js:604:7
at next (/code/mocha/lib/runner.js:486:14)
at Immediate._onImmediate (/code/mocha/lib/runner.js:572:5)</failure></testcase>
</testsuite>
Validate XML
$ xmllint xunit.xml
xunit.xml:2: parser error : xmlParseCharRef: invalid xmlChar value 27
classname="" name="Test contains ANSI escape sequences" time="0"><failure>
^
xunit.xml:2: parser error : xmlParseCharRef: invalid xmlChar value 27
contains ANSI escape sequences" time="0"><failure>[31mthis is not ok
^
xunit.xml:9: parser error : xmlParseCharRef: invalid xmlChar value 27
AssertionError [ERR_ASSERTION]: [31mthis is not ok[39m
^
xunit.xml:9: parser error : xmlParseCharRef: invalid xmlChar value 27
AssertionError [ERR_ASSERTION]: [31mthis is not ok[39m
^
Exception: xmllint exited with 1
[tty 13], line 1: xmllint xunit.xml
Expected behavior: xunit.xml contains valid XML
Actual behavior: the XML contains the character reference &#x1B;, which refers to an invalid character
Reproduces how often: 100%
Versions
- The output of mocha --version and node node_modules/.bin/mocha --version: 8.2.1
- The output of node --version: v10.22.0
- Your operating system
  - name and version: Linux b973b6e6ac1a 4.19.76-linuxkit #1 SMP Tue May 26 11:42:35 UTC 2020 x86_64 Linux
- Your shell (e.g., bash, zsh, PowerShell, cmd): elvish
Some additional info: https://www.w3.org/TR/xml/#charsets
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
#x1B is not in any of these ranges and needs to be removed
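To make the Char production concrete, here is a minimal sketch of a check against it. The names isXmlChar and removeInvalidXmlChars are hypothetical helpers for illustration only; they are not part of Mocha or of the reporter code.

```javascript
// Hypothetical helper (not part of Mocha): returns true if the code point
// matches the XML 1.0 Char production quoted above.
function isXmlChar(cp) {
  return (
    cp === 0x9 ||
    cp === 0xa ||
    cp === 0xd ||
    (cp >= 0x20 && cp <= 0xd7ff) ||
    (cp >= 0xe000 && cp <= 0xfffd) ||
    (cp >= 0x10000 && cp <= 0x10ffff)
  );
}

// Hypothetical helper: drops every code point that the production excludes.
// Array.from iterates by code point, so surrogate pairs stay together and
// lone surrogates are filtered out.
function removeInvalidXmlChars(s) {
  return Array.from(String(s))
    .filter(function (ch) {
      return isXmlChar(ch.codePointAt(0));
    })
    .join('');
}
```

Note that this only removes the ESC character itself; the rest of the colour code (for example [31m) stays in the text.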
Should we remove #x1B or can we escape #x1B?
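For the "remove" option, a minimal sketch of stripping whole escape sequences rather than just the #x1B character. It assumes that only CSI sequences (ESC, then [, then parameter, intermediate and final bytes) need handling, which covers the colour codes in the example above. stripAnsiCsi is a hypothetical name, not an existing Mocha function.

```javascript
// Assumption: only CSI sequences need handling. A CSI sequence is ESC '['
// followed by parameter bytes (0x30-0x3F), intermediate bytes (0x20-0x2F)
// and one final byte (0x40-0x7E).
var ANSI_CSI = /\u001b\[[0-?]*[ -\/]*[@-~]/g;

// Hypothetical helper (not part of Mocha): removes CSI escape sequences
// and leaves all other text untouched.
function stripAnsiCsi(s) {
  return String(s).replace(ANSI_CSI, '');
}
```

With this sketch, the failure message from the test above would be reduced to the plain text "this is not ok". Other kinds of escape sequences, or a bare ESC character, would still pass through and need separate handling.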
I just found your PR. I will check it.
How about replacing he.encode(String(html), {useNamedReferences: false}) with the following as escXML(String(html))?
var XML_REMAP = {
  // The five characters that have predefined XML entities
  '<': '&lt;', '>': '&gt;', '&': '&amp;', '"': '&quot;', "'": '&apos;',
  // Control characters, replaced by readable caret mnemonics
  "\u0000": "^nul", "\u0001": "^soh", "\u0002": "^stx", "\u0003": "^etx",
  "\u0004": "^eot", "\u0005": "^enq", "\u0006": "^ack", "\u0007": "^bel",
  "\u0008": "^bs", "\u000B": "^vt", "\u000C": "^np", "\u000E": "^so",
  "\u000F": "^si", "\u0010": "^dle", "\u0011": "^dc1", "\u0012": "^dc2",
  "\u0013": "^dc3", "\u0014": "^dc4", "\u0015": "^nak", "\u0016": "^syn",
  "\u0017": "^etb", "\u0018": "^can", "\u0019": "^em", "\u001A": "^sub",
  "\u001B": "^esc", "\u001C": "^fs", "\u001D": "^gs", "\u001E": "^rs",
  "\u001F": "^us", "\u007F": "^del"
};
// Matches XML markup characters, forbidden or discouraged control characters, and noncharacters
var XML_BAD_CHAR = /[&"<>'\u0000-\u0008\u000B\u000C\u000E-\u001F\u007f-\u0084\u0086-\u009f\uFDD0-\uFDEF\uFFFE\uFFFF]/g;
function escXML (s) {
  return s.replace(XML_BAD_CHAR, function (c) {
    // Matched characters without a map entry (e.g. \u0080) become "^bad"
    return (XML_REMAP[c] || "^bad");
  });
}
This would preserve the original data (except for bad unicode sequences) and be a more general solution.
I'm happy to incorporate this into #4527 (which has been auto-closed, unfortunately)