
XML_ERROR_PARSING when parsing files that were written for the old TinyXML

Open Xottab-DUTY opened this issue 7 years ago • 3 comments

When I read a string like this with the old TinyXML, everything is OK:

	<string id="mp_gp_unique_nick_must_contain_only">
		<text>The unique nickname may only contain numbers, Latin letters and the following special symbols: "#$&'()*+-./:;<=>?@[]^_`{|}~</text>
	</string>

But when I try to read the same string with TinyXML2, it fails to parse. It fails because of these characters: "#$&'()*+-./:;<=>?@[]^_`{|}~ What should be changed or switched in TinyXML2 so that it accepts this string? Or should I move back to the old TinyXML?

The code page of the XML files varies (Win-1250, Win-1252, and others). I also can't modify the XML files, only read them, so I need a solution on the parser side.

Xottab-DUTY avatar Dec 26 '18 18:12 Xottab-DUTY

TL;DR

- The < must be escaped with a &lt; entity, since it is assumed to be the beginning of a tag.
- The & must be escaped with a &amp; entity, since it is assumed to be the beginning of an entity reference.
- The > should be escaped with a &gt; entity. It is not mandatory (it depends on the context), but it is strongly advised to escape it.
- The ' should be escaped with a &apos; entity. This is mandatory in attributes delimited by single quotes, but it is strongly advised to always escape it.
- The " should be escaped with a &quot; entity. This is mandatory in attributes delimited by double quotes, but it is strongly advised to always escape it.
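The escaping rules above can be sketched as a small helper for whoever produces the XML text. This is only an illustration: `escapeXmlText` is a hypothetical name, not a TinyXML2 function, and it follows the "always escape" advice for all five characters.

```cpp
#include <string>

// Hypothetical helper (not part of TinyXML2): replaces the five characters
// that have predefined XML entities with their entity references.
std::string escapeXmlText(const std::string& raw) {
    std::string out;
    out.reserve(raw.size());
    for (char c : raw) {
        switch (c) {
            case '<':  out += "&lt;";   break;  // would start a tag
            case '&':  out += "&amp;";  break;  // would start a reference
            case '>':  out += "&gt;";   break;  // optional, but advised
            case '\'': out += "&apos;"; break;  // needed in '...' attributes
            case '"':  out += "&quot;"; break;  // needed in "..." attributes
            default:   out += c;        break;
        }
    }
    return out;
}
```

The function is meant for raw text that has not been escaped yet; running it on already escaped text would escape the & of each existing reference a second time.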

Long version of TL;DR: XML 1.0 allowed characters list: https://www.w3.org/TR/xml/#charsets https://www.w3.org/TR/xml/#dt-escape

XML 1.1 allowed and restricted characters list: https://www.w3.org/TR/xml11/#charsets https://www.w3.org/TR/xml11/#dt-escape

XakepSDK avatar Dec 26 '18 18:12 XakepSDK

This one should work fine:

	<string id="mp_gp_unique_nick_must_contain_only">
		<text>The unique nickname may only contain numbers, Latin letters and the following special symbols: "#$&amp;'()*+-./:;&lt;=>?@[]^_`{|}~</text>
	</string>

XakepSDK avatar Dec 26 '18 19:12 XakepSDK

I already said that we should not change the XML files.

Xottab-DUTY avatar Dec 26 '18 19:12 Xottab-DUTY
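
If the files on disk must stay untouched, one possible workaround is to repair the text in memory after loading it and before handing it to the parser. The sketch below is a heuristic, not a TinyXML2 feature: `repairStrayMarkup`, `looksLikeReference` and `looksLikeTagStart` are hypothetical names. It escapes an & that does not start something shaped like a reference and a < that does not look like the start of a tag. It assumes ASCII tag names and does not treat comments or CDATA sections specially, so it can guess wrong on such input.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: true if the '&' at pos starts something shaped like
// a reference: &name;  &#123;  &#x1F;
static bool looksLikeReference(const std::string& text, std::size_t pos) {
    std::size_t i = pos + 1;
    if (i >= text.size()) return false;
    if (text[i] == '#') {
        ++i;
        const bool hex = i < text.size() && text[i] == 'x';
        if (hex) ++i;
        const std::size_t digitsStart = i;
        while (i < text.size()) {
            const unsigned char ch = static_cast<unsigned char>(text[i]);
            if (!(hex ? std::isxdigit(ch) : std::isdigit(ch))) break;
            ++i;
        }
        return i > digitsStart && i < text.size() && text[i] == ';';
    }
    if (!std::isalpha(static_cast<unsigned char>(text[i]))) return false;
    while (i < text.size() &&
           std::isalnum(static_cast<unsigned char>(text[i]))) {
        ++i;
    }
    return i < text.size() && text[i] == ';';
}

// Hypothetical helper: true if the '<' at pos is followed by a character
// that can begin a tag, closing tag, comment/declaration or instruction.
static bool looksLikeTagStart(const std::string& text, std::size_t pos) {
    if (pos + 1 >= text.size()) return false;
    const unsigned char next = static_cast<unsigned char>(text[pos + 1]);
    return std::isalpha(next) || next == '_' || next == ':' ||
           next == '/' || next == '!' || next == '?';
}

// Escapes stray '&' and '<' in an in-memory copy of the document.
std::string repairStrayMarkup(const std::string& xml) {
    std::string out;
    out.reserve(xml.size());
    for (std::size_t i = 0; i < xml.size(); ++i) {
        const char c = xml[i];
        if (c == '&' && !looksLikeReference(xml, i)) {
            out += "&amp;";
        } else if (c == '<' && !looksLikeTagStart(xml, i)) {
            out += "&lt;";
        } else {
            out += c;
        }
    }
    return out;
}
```

The repaired string could then be passed to `tinyxml2::XMLDocument::Parse` instead of loading the file directly. Whether this is safe for a given set of files depends on their content, so it is worth checking the output against a few known-problematic strings first.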