
GetText() on element containing only whitespace characters returns null

Open PetrDobes opened this issue 12 years ago • 5 comments

Not sure if it's a bug or a feature, but it can be a little tricky when you really need to work with the exact text within an element.

Code:

std::string str = "<textData>example</textData>";

tinyxml2::XMLDocument doc;
doc.Parse(str.c_str());
const char* text = doc.FirstChildElement()->GetText();

Works as expected with:

str = "<textData>example</textData>";
 // GetText() gets: "example"
str = "<textData>   example   </textData>";
 // GetText() gets: "   example   "

But returns NULL when there are only whitespace characters inside the element:

str = "<textData>       </textData>";
 // GetText() returns NULL
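
Since constructing a std::string from a null pointer is undefined behavior, calling code has to check the result before using it. A minimal sketch of such a guard follows; TextOrEmpty is a hypothetical helper name for illustration, not part of tinyxml2, and it takes a plain const char* so it does not depend on the library headers.

```cpp
#include <string>

// Hypothetical helper (not a tinyxml2 API): turns the possibly-null
// pointer returned by XMLElement::GetText() into a std::string.
std::string TextOrEmpty(const char* text) {
    // A null pointer becomes an empty string; anything else is copied verbatim.
    return text ? std::string(text) : std::string();
}
```

It would be called as TextOrEmpty(doc.FirstChildElement()->GetText()). This only avoids the null dereference; the whitespace inside the element is still lost.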

PetrDobes avatar Apr 27 '13 20:04 PetrDobes

Feature, in the broad sense, in that TinyXML-2 skips all whitespace elements. In this case, it's obvious that the whitespace is intended (to a human reading the file). On the other hand:

<textData>  
</textData>

Isn't clear - should that newline be preserved? Extra trailing spaces?

I think eliminating all whitespace areas is usually the correct choice, but I'd be open to an algorithm that tried to be a little smarter, if it doesn't make the rules too weird.

leethomason avatar Apr 28 '13 21:04 leethomason

Closing for now. May reopen if a good strategy to handle whitespace nodes is proposed.

leethomason avatar Apr 30 '13 22:04 leethomason

An option over whether leaf nodes with all whitespace are preserved or collapsed seems like a reasonable compromise approach. That's the essential difference between the different kinds of whitespace in fragments like the following.

<body>
  <text>   </text>
</body>

It also lets the user choose how to treat

<textData>
</textData>
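
To make the proposed option concrete, here is a rough sketch of the rule for a leaf element, written as standalone C++ rather than against the tinyxml2 source. The names LeafWhitespace and LeafText are hypothetical and chosen only for illustration; std::nullopt stands in for the NULL that GetText() returns today.

```cpp
#include <optional>
#include <string>

// Hypothetical policy names for illustration; not tinyxml2 identifiers.
enum class LeafWhitespace { Collapse, Preserve };

// Returns the text a leaf element would report under the given policy.
// std::nullopt plays the role of the NULL returned by GetText().
std::optional<std::string> LeafText(const std::string& raw, LeafWhitespace policy) {
    // An element with no content at all has no text under either policy.
    if (raw.empty()) {
        return std::nullopt;
    }
    // XML whitespace characters: space, tab, carriage return, line feed.
    const bool allWhitespace = raw.find_first_not_of(" \t\r\n") == std::string::npos;
    if (allWhitespace && policy == LeafWhitespace::Collapse) {
        return std::nullopt;  // current behavior: whitespace-only text is dropped
    }
    return raw;  // mixed content, or whitespace kept on request
}
```

Under this sketch the newline-only element reports no text with Collapse and a single "\n" with Preserve, so the choice stays with the user instead of being guessed by the parser.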

mabraham avatar Mar 31 '16 01:03 mabraham

Re-opening. I don't know if we'll ever make everyone happy on this issue, but this is a nice summary.

leethomason avatar Mar 31 '16 22:03 leethomason

@leethomason is there any plan to fix this issue? I think in both cases, we should preserve the whitespace if the PRESERVE_WHITESPACE flag is set.

yaozongyou avatar Jul 01 '19 12:07 yaozongyou