go2xunit

Invalid xml characters

Open eloo opened this issue 6 years ago • 6 comments

Hi, I was just integrating your tool into my build pipeline, and sadly it's not working for my integration tests: the console output is not escaped, so the CDATA section holds invalid characters. Maybe they can easily be stripped?

Here is the error thrown by Jenkins:

org.dom4j.DocumentException: Error on line 7 of document  : An invalid XML character (Unicode: 0x1b) was found in the CDATA section. Nested exception: An invalid XML character (Unicode: 0x1b) was found in the CDATA section.

And here is a snippet of my XML with the error:

<?xml version="1.0" encoding="UTF-8"?>

  <testsuite name="my-test-suite-name" tests="6" errors="0" failures="4" skip="0">
    <testcase classname="my-test-class-name" name="my-test-name" time="11.07">

      <failure type="go.error" message="error">
        <![CDATA[[36m[2019-09-09 10:53:30][0m [32m INFO[0m Try to connect gelf logger to: host= port= token=              

If you paste this into https://www.xmlvalidation.com you will see the error.

I guess the invalid characters come from the ANSI output, but the generated XML should be parsable nevertheless.

Thanks

eloo avatar Sep 09 '19 09:09 eloo

Thanks @eloo. Can you provide the input to go2xunit?

tebeka avatar Sep 12 '19 06:09 tebeka

Here is a snippet of the .out file produced by go test:

integration-tests.out.txt

eloo avatar Sep 12 '19 09:09 eloo

After some investigation, I don't really know what to do here. According to this article, you can't have just anything in a CDATA section. However, the string itself is a valid UTF-8 string.

I've tried using strings.Map to leave only valid characters and still got nowhere.

I will gladly hear some new ideas.

tebeka avatar Sep 25 '19 03:09 tebeka
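
For reference, the strings.Map approach mentioned above could look roughly like the sketch below. It assumes the XML 1.0 Char production as the validity rule. The names isXMLChar and stripInvalidXML are hypothetical helpers made up for illustration; this is not go2xunit's code and not necessarily what was tried.

```go
package main

import (
	"fmt"
	"strings"
)

// isXMLChar reports whether r is allowed by the XML 1.0 Char production:
// #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
func isXMLChar(r rune) bool {
	return r == 0x9 || r == 0xA || r == 0xD ||
		(r >= 0x20 && r <= 0xD7FF) ||
		(r >= 0xE000 && r <= 0xFFFD) ||
		(r >= 0x10000 && r <= 0x10FFFF)
}

// stripInvalidXML (hypothetical helper) drops every rune that XML 1.0 forbids.
func stripInvalidXML(s string) string {
	return strings.Map(func(r rune) rune {
		if isXMLChar(r) {
			return r
		}
		return -1 // a negative value tells strings.Map to drop the rune
	}, s)
}

func main() {
	in := "\x1b[36m[2019-09-09 10:53:30]\x1b[0m INFO"
	fmt.Printf("%q\n", stripInvalidXML(in))
}
```

Note that this only removes the ESC byte (0x1b) itself. The rest of each escape sequence, such as [36m, is made of ordinary printable characters and stays in the output.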

@tebeka as far as I know, the problem comes from the ANSI sequences. Maybe you can try something like this? https://github.com/acarl005/stripansi

eloo avatar Sep 25 '19 06:09 eloo
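
To illustrate the idea of stripping ANSI sequences, here is a minimal sketch using only the standard library. It is not the stripansi package's implementation: the pattern below is an assumption that covers only CSI sequences (ESC followed by "[", parameters, and a final byte), which is what color codes such as ESC[36m are. The name stripCSI is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// csi matches a CSI escape sequence: ESC '[', parameter bytes (0x30-0x3F),
// intermediate bytes (0x20-0x2F), and one final byte (0x40-0x7E).
var csi = regexp.MustCompile(`\x1b\[[0-?]*[ -/]*[@-~]`)

// stripCSI (hypothetical helper) removes CSI sequences such as color codes.
func stripCSI(s string) string {
	return csi.ReplaceAllString(s, "")
}

func main() {
	in := "\x1b[36m[2019-09-09 10:53:30]\x1b[0m \x1b[32m INFO\x1b[0m Try"
	fmt.Printf("%q\n", stripCSI(in))
}
```

Other escape types (for example OSC sequences) are not handled by this pattern, so any remaining control characters would still need to be filtered before writing the XML.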

I'm not sure I want to do this in the general case, since it might delete data people use. Maybe a flag to clean it up? I need to think about it.

tebeka avatar Sep 25 '19 15:09 tebeka
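
An opt-in flag along those lines might be sketched as follows. The flag name -strip-ansi and the function cleanOutput are hypothetical and chosen only for illustration; nothing in this thread says go2xunit has such a flag. The pattern is an assumed one that covers CSI color codes only.

```go
package main

import (
	"flag"
	"fmt"
	"regexp"
)

// csi matches CSI escape sequences such as ESC[32m (assumed pattern).
var csi = regexp.MustCompile(`\x1b\[[0-?]*[ -/]*[@-~]`)

// cleanOutput (hypothetical helper) leaves s untouched unless strip is set,
// so the default behavior never deletes data.
func cleanOutput(s string, strip bool) string {
	if !strip {
		return s
	}
	return csi.ReplaceAllString(s, "")
}

func main() {
	// Hypothetical flag; off by default so existing users see no change.
	stripANSI := flag.Bool("strip-ansi", false, "remove ANSI escape sequences from test output")
	flag.Parse()

	fmt.Printf("%q\n", cleanOutput("\x1b[32m INFO\x1b[0m", *stripANSI))
}
```

Keeping the default off addresses the concern about deleting data people rely on, while users with colored logs can enable the cleanup explicitly.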

But this should only strip ANSI characters, which are useless in XML anyway :D

You can also check this project for reference, because it seems to work with ANSI output: https://github.com/jstemmer/go-junit-report

eloo avatar Sep 26 '19 08:09 eloo