dom: Entities consisting of whitespace do not capture their contents

Open GooberMan opened this issue 6 years ago • 2 comments

<Text> </Text>

There is a single space between the tags, yet the <Text> DOMEntity's children are null.

I tried providing a config that defined SplitEmpty as yes. No dice.

This whitespace is important in a schema I'm attempting to parse. And even if it weren't, I'd expect it to be up to the user to determine how to handle the text contents inside any given entity.

GooberMan, Aug 09 '18 22:08

It is very much by design that there are no EntityType.text entities which contain only whitespace, since it becomes an utter disaster to parse formatted XML if whitespace by itself is treated as data. I don't know what it would take to add an option to leave such whitespace in. I will have to look into it.
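
To make the trade-off concrete, here is a minimal sketch of what a parser that keeps all whitespace hands to the caller. It uses Python's standard `xml.etree.ElementTree` purely as a stand-in, not dxml, and `text_pieces` is a hypothetical helper written for this illustration. With indented XML most of the reported strings are formatting noise, while for `<Text> </Text>` the single space is the only data there is.

```python
import xml.etree.ElementTree as ET


def text_pieces(xml_source):
    """Collect every text and tail string that ElementTree reports.

    Hypothetical helper for illustration only; it is not part of dxml.
    """
    root = ET.fromstring(xml_source)
    pieces = []
    for elem in root.iter():
        if elem.text is not None:
            pieces.append(elem.text)   # text directly after the start tag
        if elem.tail is not None:
            pieces.append(elem.tail)   # text directly after the end tag
    return pieces


# Indented ("formatted") XML: the newlines and indentation show up as text.
formatted = "<root>\n  <a>1</a>\n  <b>2</b>\n</root>"
formatted_pieces = text_pieces(formatted)

# The case from this issue: the whitespace itself is the content.
significant_pieces = text_pieces("<Text> </Text>")

# Pieces that a caller would have to classify as noise or data.
whitespace_only = [p for p in formatted_pieces if p.strip() == ""]
```

A parser that drops whitespace-only text removes the noise in the first case but also the data in the second, which is why an opt-in setting or schema information would be needed to tell them apart.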

jmdavis, Aug 09 '18 22:08

Sticking to standards, I'd imagine the best option would be schema support, so that I can use xsd:whiteSpace.
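
For reference, the three values of the XML Schema `whiteSpace` facet (`preserve`, `replace`, `collapse`) can be sketched as follows. This is an illustration of the facet's normalization rules in Python, not anything dxml provides; `apply_whitespace_facet` is a hypothetical name.

```python
import re


def apply_whitespace_facet(value, mode):
    """Normalize `value` as the xsd:whiteSpace facet describes.

    Hypothetical helper for illustration; not part of dxml or any schema library.
    """
    if mode == "preserve":
        # No normalization at all: the value is kept exactly as written.
        return value
    # "replace": every tab, line feed, and carriage return becomes a space.
    replaced = re.sub(r"[\t\n\r]", " ", value)
    if mode == "replace":
        return replaced
    if mode == "collapse":
        # "collapse": after replacing, runs of spaces shrink to one space
        # and leading and trailing spaces are removed.
        return re.sub(r" +", " ", replaced).strip(" ")
    raise ValueError("unknown whiteSpace mode: " + repr(mode))
```

Under `preserve`, the single space in `<Text> </Text>` survives unchanged, whereas `collapse` reduces it to an empty string, so the schema rather than the parser decides whether the space is data.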

GooberMan, Aug 09 '18 23:08