node-xmlsplit
Splitting on tags more than 1 level deep confuses xmlsplit
XML File
I have the following small XML file, saved as Test.xml:
<?xml version="1.0" encoding="utf-8"?>
<Outer>
<Inner attr="xxx">
<A>1</A>
</Inner>
<Inner otherattr="yyy">
<A>2-0</A>
<A>2-1</A>
<A>2-2</A>
<A>2-3</A>
</Inner>
<Inner>
<A>
<B attr="AA"/>
<C>
<D Dattr="Value"/>
</C>
</A>
</Inner>
</Outer>
Program
And the following program, which splits that file on the <A> tag:
import fs from 'fs';
const XmlSplit = require('xmlsplit');

const xmlsplit = new XmlSplit(1, 'A'); // Splitting on Tag <A>
const CHUNK_SIZE = 200; // bytes read from the file per stream chunk
const xmlfile = 'Test.xml';

async function start() {
  const stream = fs.createReadStream(xmlfile, { highWaterMark: CHUNK_SIZE });
  // Print every document emitted by the splitter, followed by a separator line
  stream.pipe(xmlsplit).on('data', function(data: any) {
    const xmlDocument = data.toString();
    console.log(xmlDocument);
    console.log('--------------------------------------');
  });
}

start();
Expected output
I would expect one well-formed XML document per <A> element, either wrapped in its own ancestors,
<Outer>
<Inner>
<A>
...
</A>
</Inner>
</Outer>
or the same document without the Inner tag.
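To make the first expected form concrete, here is a minimal sketch of the behaviour I have in mind. It is not how xmlsplit works internally: `splitWithAncestors` is a hypothetical helper that works on an in-memory string rather than a stream, ignores comments and CDATA, and drops the XML declaration. It only illustrates tracking the stack of open ancestors so that each <A> is wrapped in the tags that actually enclose it.

```typescript
// Hypothetical helper, not part of xmlsplit: returns one document per <tag>
// element, wrapped in the opening and closing tags of its real ancestors.
function splitWithAncestors(xml: string, tag: string): string[] {
  const documents: string[] = [];
  const ancestors: { name: string; openTag: string }[] = [];
  // Matches opening, closing and self-closing tags; skips <?xml ...?>.
  const tagPattern = /<(\/?)([A-Za-z_][\w.-]*)([^<>]*?)(\/?)>/g;
  let start = -1; // offset of the target element being collected, or -1
  let depth = 0;  // nesting depth of <tag> inside the current target element
  let match: RegExpExecArray | null;

  while ((match = tagPattern.exec(xml)) !== null) {
    const [text, closing, name, , selfClosing] = match;
    const isTarget = name === tag;

    if (start === -1) {
      if (closing) {
        ancestors.pop(); // an ancestor ended
      } else if (isTarget) {
        start = match.index; // a target element begins here
        depth = selfClosing ? 0 : 1;
      } else if (!selfClosing) {
        ancestors.push({ name, openTag: text }); // remember attributes too
      }
    } else if (isTarget && !selfClosing) {
      depth += closing ? -1 : 1; // nested <tag> inside the target element
    }

    if (start !== -1 && depth === 0) {
      // The target element is complete: wrap it in its current ancestors.
      const element = xml.slice(start, match.index + text.length);
      const head = ancestors.map(a => a.openTag);
      const tail = ancestors.map(a => `</${a.name}>`).reverse();
      documents.push([...head, element, ...tail].join('\n'));
      start = -1;
    }
  }
  return documents;
}

// Usage: a shortened version of the file above.
const sample =
  '<Outer><Inner attr="xxx"><A>1</A></Inner>' +
  '<Inner otherattr="yyy"><A>2-0</A><A>2-1</A></Inner></Outer>';
for (const doc of splitWithAncestors(sample, 'A')) {
  console.log(doc);
  console.log('--------------------------------------');
}
```

With this approach the element 2-1 ends up under <Inner otherattr="yyy">, and every Inner is closed before </Outer>.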
Realized output
But XmlSplit returns the following:
<?xml version="1.0" encoding="utf-8"?>
<Outer>
<Inner attr="xxx">
<A>1</A></Outer>
--------------------------------------
<?xml version="1.0" encoding="utf-8"?>
<Outer>
<Inner attr="xxx">
</Inner>
<Inner otherattr="yyy">
<A>2-0</A></Outer>
--------------------------------------
<?xml version="1.0" encoding="utf-8"?>
<Outer>
<Inner attr="xxx">
<A>2-1</A></Outer>
--------------------------------------
<?xml version="1.0" encoding="utf-8"?>
<Outer>
<Inner attr="xxx">
<A>2-2</A></Outer>
--------------------------------------
<?xml version="1.0" encoding="utf-8"?>
<Outer>
<Inner attr="xxx">
<A>2-3</A></Outer>
--------------------------------------
<?xml version="1.0" encoding="utf-8"?>
<Outer>
<Inner attr="xxx">
</Inner>
<Inner>
<A>
<B attr="AA"/>
<C>
<D Dattr="Value"/>
</C>
</A></Outer>
--------------------------------------
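The returned output shows that the splitter gets confused in several ways. No chunk is well-formed, because the open Inner tag is never closed before </Outer>. Every chunk starts with <Inner attr="xxx">, so the elements 2-1, 2-2 and 2-3 are placed under the first Inner instead of under <Inner otherattr="yyy">. The second and the last chunk additionally contain an empty <Inner attr="xxx"></Inner> pair in front of the Inner that actually holds the A element.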
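The missing closing tags can be confirmed with a small tag-balance check. The following is only a sketch: `findUnclosedTags` is a hypothetical helper written for this report, not part of xmlsplit, and it is not a full XML parser (it ignores comments, CDATA and namespaces).

```typescript
// Hypothetical helper: lists elements that are opened but not properly closed.
function findUnclosedTags(xml: string): string[] {
  const problems: string[] = [];
  const openTags: string[] = [];
  // Matches opening, closing and self-closing tags; skips <?xml ...?>.
  const tagPattern = /<(\/?)([A-Za-z_][\w.-]*)([^<>]*?)(\/?)>/g;
  let match: RegExpExecArray | null;

  while ((match = tagPattern.exec(xml)) !== null) {
    const [, closing, name, , selfClosing] = match;
    if (selfClosing) continue; // <B/> opens and closes itself
    if (!closing) {
      openTags.push(name);
      continue;
    }
    // Closing tag: everything opened after the matching element is unclosed.
    const index = openTags.lastIndexOf(name);
    if (index === -1) {
      problems.push(`</${name}> has no matching opening tag`);
      continue;
    }
    for (const unclosed of openTags.slice(index + 1).reverse()) {
      problems.push(`<${unclosed}> is not closed before </${name}>`);
    }
    openTags.length = index; // drop the matched element and all above it
  }
  for (const unclosed of openTags.reverse()) {
    problems.push(`<${unclosed}> is never closed`);
  }
  return problems;
}

// Usage: the first chunk returned by XmlSplit, copied from the output above.
const firstChunk =
  '<?xml version="1.0" encoding="utf-8"?>\n' +
  '<Outer>\n<Inner attr="xxx">\n<A>1</A></Outer>';
// Reports that <Inner> is not closed before </Outer>.
console.log(findUnclosedTags(firstChunk));
```

Running the same check on each chunk emitted in the 'data' handler would flag every document in the output above.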