dxml
dom: Entities consisting of whitespace do not capture their contents
<Text> </Text>
There is a single space between the tags. The <Text> DOMEntity's children are null.
I tried providing a config with SplitEmpty set to yes. No dice.
This whitespace is significant in a schema I'm attempting to parse. And even if it weren't, I'd expect it to be up to the user to decide how to handle the text contents of any given entity.
It is very much by design that there are no EntityType.text entities which contain only whitespace, since it becomes an utter disaster to parse formatted XML if whitespace by itself is treated as data. I don't know what it would take to add an option to leave such whitespace in. I will have to look into it.
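To make the trade-off concrete, here is a minimal sketch of what "whitespace treated as data" looks like in practice. It does not use dxml; it assumes Python's standard xml.dom.minidom, which keeps whitespace-only text, and the <root> and <Name> elements are made up for illustration. The space inside <Text> survives, but so does every run of indentation between elements.

```python
from xml.dom import minidom

# Hypothetical formatted document; only <Text> </Text> comes from the report.
FORMATTED = """<root>
    <Text> </Text>
    <Name>dxml</Name>
</root>"""

doc = minidom.parseString(FORMATTED)
root = doc.documentElement

# minidom keeps whitespace-only runs, so the indentation and newlines
# between elements show up as "#text" children of <root>.
kinds = [child.nodeName for child in root.childNodes]
print(kinds)

# The single space inside <Text> is preserved as data.
text_elem = root.getElementsByTagName("Text")[0]
inner = text_elem.firstChild.data
print(repr(inner))

# Callers who only want the elements must filter the text nodes out themselves.
elements_only = [
    child.nodeName
    for child in root.childNodes
    if child.nodeType == child.ELEMENT_NODE
]
print(elements_only)
```

Both positions in the thread are visible here: the reporter gets the significant space, and every other caller has to filter out indentation nodes, which is the cost the design avoids by default.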
Sticking to standards, I'd imagine the best option would be schema support, so that I could use xsd:whiteSpace.