xml_parser_create(_ns)() encoding param description is incorrect

Open Girgias opened this issue 3 years ago • 3 comments

The input encoding is only detected if the value passed is an empty string. If the encoding is null then the default encoding is used.

This behaviour dates from at least PHP 5.3. See source code:

  • PHP 5.3: https://heap.space/xref/PHP-5.3/ext/xml/xml.c?r=7d163e8a#1265
  • master: https://heap.space/xref/php-src/ext/xml/xml.c?r=94ee4f98#966

Girgias avatar Sep 13 '22 12:09 Girgias

I'm afraid this is more complex than that. For libxml2 builds (I don't know whether anybody still builds against libexpat; probably not, since libexpat 2 is not supported), the encoding parameter of XML_ParserCreate_MM() is ignored.

However, libxml2 has built-in encoding detection for some encodings, and this works regardless of the $encoding parameter (which is only used for output with libxml2).
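For illustration, here is a minimal sketch of the three cases discussed here: `null`, an empty string, and an explicit encoding. The sample document and the variable names are assumptions made up for this example. What the character data handler receives depends on the XML library PHP was built against, so the script only prints what it observes and makes no claim about the result.

```php
<?php
// Hypothetical sample: an ISO-8859-1 document that declares its own encoding.
$doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><p>caf\xE9</p>";

$targets = [];
foreach ([null, '', 'ISO-8859-1'] as $encoding) {
    // null: default encoding; '': detection requested; otherwise an explicit name.
    $parser = xml_parser_create($encoding);

    // Collect the character data passed to the handler.
    $text = '';
    xml_set_character_data_handler(
        $parser,
        function ($p, string $data) use (&$text) {
            $text .= $data;
        }
    );

    $ok = xml_parse($parser, $doc, true);

    // The target (output) encoding is what $encoding controls on libxml2 builds.
    $target = xml_parser_get_option($parser, XML_OPTION_TARGET_ENCODING);
    $targets[] = $target;

    printf(
        "encoding=%s parsed=%s target=%s bytes=%s\n",
        var_export($encoding, true),
        var_export($ok === 1, true),
        $target,
        bin2hex($text)
    );
}
```

Comparing the `bytes` column across the three runs shows whether the `$encoding` argument changed how the input was read or only how the output was encoded on a given build.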

cmb69 avatar Sep 13 '22 13:09 cmb69

*sigh* Okay, so it might make sense to deprecate building against anything other than libxml2 as the XML library? But this is a bit of a thorny issue.

Girgias avatar Sep 14 '22 09:09 Girgias

Yeah, but anyway, we need to document what we have so far. :)

cmb69 avatar Sep 14 '22 10:09 cmb69