mocha 🐛 Bug: xunit reporter does not strip ansi escape sequences, leading to invalid XML

Prerequisites

[x] Checked that your issue hasn't already been filed by cross-referencing issues with the faq label
[x] Checked next-gen ES issues and syntax problems by using the same environment and/or transpiler configuration without Mocha to ensure it isn't just a feature that actually isn't supported in the environment in question or a bug in your code.
[x] 'Smoke tested' the code to be tested by running it outside the real test suite to get a better sense of whether the problem is in the code under test, your usage of Mocha, or Mocha itself
[x] Ensured that there is no discrepancy between the locally and globally installed versions of Mocha. You can find them with: node node_modules/.bin/mocha --version (Local) and mocha --version (Global). We recommend that you not install Mocha globally.

Description

When the code under test throws exceptions whose messages are formatted with chalk, or otherwise contain ANSI escape codes, the resulting XML contains the character reference &#x1B;, which refers to a character that is not allowed in XML.

Steps to Reproduce

Prepare minimal test file

const assert = require('assert');
const chalk = require('chalk');

it('Test contains ANSI escape sequences', () => {
 assert.ok(false, chalk.red('this is not ok'));
});

Run tests using mocha and xunit reporter

$  /code/.npm-global/bin/mocha -R xunit -O output=xunit.xml
Exception: /code/.npm-global/bin/mocha exited with 1
[tty 12], line 1: /code/.npm-global/bin/mocha -R xunit -O output=xunit.xml

Test fails as expected and creates xunit.xml

<testsuite name="Mocha Tests" tests="1" failures="0" errors="1" skipped="0" timestamp="Wed, 02 Dec 2020 17:59:42 GMT" time="0.003">
<testcase classname="" name="Test contains ANSI escape sequences" time="0"><failure>&#x1B;[31mthis is not ok&#x1B;[39m

      + expected - actual

      -false
      +true

AssertionError [ERR_ASSERTION]: &#x1B;[31mthis is not ok&#x1B;[39m
    at Context.it (test.js:5:9)
    at callFn (/code/mocha/lib/runnable.js:366:21)
    at Test.Runnable.run (/code/mocha/lib/runnable.js:354:5)
    at Runner.runTest (/code/mocha/lib/runner.js:677:10)
    at /code/mocha/lib/runner.js:801:12
    at next (/code/mocha/lib/runner.js:594:14)
    at /code/mocha/lib/runner.js:604:7
    at next (/code/mocha/lib/runner.js:486:14)
    at Immediate._onImmediate (/code/mocha/lib/runner.js:572:5)</failure></testcase>
</testsuite>

Validate XML

$ xmllint xunit.xml
xunit.xml:2: parser error : xmlParseCharRef: invalid xmlChar value 27
classname="" name="Test contains ANSI escape sequences" time="0"><failure>&#x1B;
                                                                               ^
xunit.xml:2: parser error : xmlParseCharRef: invalid xmlChar value 27
contains ANSI escape sequences" time="0"><failure>&#x1B;[31mthis is not ok&#x1B;
                                                                               ^
xunit.xml:9: parser error : xmlParseCharRef: invalid xmlChar value 27
AssertionError [ERR_ASSERTION]: &#x1B;[31mthis is not ok&#x1B;[39m
                                      ^
xunit.xml:9: parser error : xmlParseCharRef: invalid xmlChar value 27
AssertionError [ERR_ASSERTION]: &#x1B;[31mthis is not ok&#x1B;[39m
                                                              ^
Exception: xmllint exited with 1
[tty 13], line 1: xmllint xunit.xml

Expected behavior: xunit.xml contains valid XML

Actual behavior: the XML contains the invalid character reference &#x1B;

Reproduces how often: 100%

Versions

The output of mocha --version and node node_modules/.bin/mocha --version: 8.2.1
The output of node --version: v10.22.0
Your operating system
- name and version: Linux b973b6e6ac1a 4.19.76-linuxkit #1 SMP Tue May 26 11:42:35 UTC 2020 x86_64 Linux
Your shell (e.g., bash, zsh, PowerShell, cmd): elvish

Dec 02 '20 18:12 trieloff

Some additional info: https://www.w3.org/TR/xml/#charsets

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

#x1B is not in that range and needs to be removed
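
To make the Char production above concrete, here is a minimal sketch of a filter that keeps only the code points XML 1.0 allows. The names isXmlChar and stripInvalidXmlChars are hypothetical and are not part of Mocha; this only illustrates the "remove" option and is not a proposed patch.

```javascript
// Sketch only: isXmlChar and stripInvalidXmlChars are hypothetical names,
// not part of Mocha.

// True if the code point matches the XML 1.0 Char production:
// #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
function isXmlChar(cp) {
  return (
    cp === 0x9 || cp === 0xa || cp === 0xd ||
    (cp >= 0x20 && cp <= 0xd7ff) ||
    (cp >= 0xe000 && cp <= 0xfffd) ||
    (cp >= 0x10000 && cp <= 0x10ffff)
  );
}

// Drop every code point that XML 1.0 does not allow.
function stripInvalidXmlChars(s) {
  // Array.from iterates by code point, so surrogate pairs stay together
  // and lone surrogates (0xD800-0xDFFF) are filtered out.
  return Array.from(s)
    .filter(function (ch) { return isXmlChar(ch.codePointAt(0)); })
    .join('');
}

console.log(stripInvalidXmlChars('\u001b[31mthis is not ok\u001b[39m'));
// prints "[31mthis is not ok[39m"
```

Note that removing only #x1B leaves the rest of each escape sequence ("[31m", "[39m") in the message, so the output is valid XML but still carries visible residue.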

Dec 02 '20 19:12 trieloff

Should we remove #x1B or can we escape #x1B?

Jan 03 '21 18:01 outsideris

I just found your PR. I will check it.

Jan 03 '21 18:01 outsideris

How about replacing he.encode(String(html), {useNamedReferences: false}) with escXML(String(html)), defined as follows?

var XML_REMAP = {
    '<': '&lt;', '>': '&gt;', '&': '&amp;', '"': '&quot;', "'": '&#39;',
    "\u0000": "^nul", "\u0001": "^soh", "\u0002": "^stx", "\u0003": "^etx",
    "\u0004": "^eot", "\u0005": "^enq", "\u0006": "^ack", "\u0007": "^bel",
    "\u0008": "^bs",  "\u000B": "^vt",  "\u000C": "^np",  "\u000E": "^so",
    "\u000F": "^si",  "\u0010": "^dle", "\u0011": "^dc1", "\u0012": "^dc2",
    "\u0013": "^dc3", "\u0014": "^dc4", "\u0015": "^nak", "\u0016": "^syn",
    "\u0017": "^etb", "\u0018": "^can", "\u0019": "^em",  "\u001A": "^sub",
    "\u001B": "^esc", "\u001C": "^fs",  "\u001D": "^gs",  "\u001E": "^rs",
    "\u001F": "^us",  "\u007F": "^del"
};

var XML_BAD_CHAR = /[&"<>'\u0000-\u0008\u000B\u000C\u000E-\u001F\u007f-\u0084\u0086-\u009f\uFDD0-\uFDEF\uFFFE\uFFFF]/g;

function escXML (s) {
    return s.replace(XML_BAD_CHAR, function (c) {
        return (XML_REMAP[c] || "^bad");
    });
}

This would preserve the original data (except for bad Unicode sequences) and would be a more general solution.
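
For comparison, the narrower approach named in the issue title, stripping the ANSI sequences themselves, could be sketched as below. The helper name stripAnsiCsi is hypothetical, and the sketch assumes only CSI sequences (ESC, "[", parameter bytes, a final byte) occur in the message, which covers the color codes shown in this report but not every escape sequence a terminal understands.

```javascript
// Sketch only: stripAnsiCsi is a hypothetical helper, not Mocha's code.
// Assumption: only CSI sequences need handling, i.e.
//   ESC [  parameter bytes (0x30-0x3F)*  intermediate bytes (0x20-0x2F)*  final byte (0x40-0x7E)
var ANSI_CSI = /\u001b\[[0-?]*[ -\/]*[@-~]/g;

// Remove whole CSI sequences so no "[31m" residue is left behind.
function stripAnsiCsi(s) {
  return s.replace(ANSI_CSI, '');
}

console.log(stripAnsiCsi('\u001b[31mthis is not ok\u001b[39m'));
// prints "this is not ok"
```

Unlike a general character remap, this discards the color information entirely and does nothing about other control characters that XML also forbids.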

Oct 20 '21 19:10 royfielding

I'm happy to incorporate this into #4527 (which has been auto-closed, unfortunately)

Oct 21 '21 15:10 trieloff
