fluidxml
fluidxml copied to clipboard
XML special chars break the document
Hello. I got "Invalid Character Error" exception when i had tried to make XML document from db data because one of records contained "<" symbol. Then i have been required to check every xml node in my code for special chars and wrap it in CDATA block manually. I think the data should be escaped or CDATA-wrapped automatically by default. Thanks!
\FluidXml\fluidxml()->add('root', '12345')->xml(); // ok \FluidXml\fluidxml()->add('root', '<12345>')->xml(); // error
I encountered a similar problem when trying to ->add()
and the string contained the character &
:
DOMElement::__construct(): unterminated entity reference ...
I was able get around it (kind of) by wrapping the string in htmlspecialchars()
.
I haven't tested it but it might be possible to use the XML Fragment method instead -- @EligiusSantori this might be easier than wrapping with CDATA blocks.
@daniele-orlando , if the XML Fragment works would it be out of scope to ask that fluidxml catches this error and automatically attempts to use the fragment method?
Guys, the ampersand &
like <
, >
and other characters must be escaped in XML. Nothing surprising here.
You can use CDATA blocks (they exist for this) or convert the special characters to a valid XML representation.
XML Fragments are a different thing, they are valid XML, not just a trick to escape invalid characters.
I'll evaluate for the next release (2.0) to catch the exception and convert any special character to a valid XML representation.
Hi, sorry about being offtopic but the usage of ->cdata()
is not clear to me.
How would one go with adding CDATA to multiple elements using chaining methods?
$xml
->add('property', true)
->add('id', $property->id)
->add('url')->cdata($property->url_web) // NOPE
->add('title', true)->cdata($property->title) // NOPE NOPE
Thanks in advance.
If you want to stay fluent:
$xml->add('property', true)
->add('id', $property->id)
->add('url', true)
->cdata($property->url_web)
->query('..')
->add('title', true)
->cdata($property->title);
Otherwise:
$prop = $xml->add('property', true);
$prop->add('id', $property->id);
$prop->add('url', true)->cdata($property->url_web);
$prop->add('title', true)->cdata($property->title);
Hi, any update on this? Please escape characters correctly:
// Breaks
$fluid_xml->add([
'Property' => [
'ID' => 1234,
'Test' => 'Hello & World'
]
]);
As I'm working from an array like this, I can't easily change this to use your ->cdata() method (the real code is more complicated).
This is quite a basic feature, I'm surprised it's missing. This potential bug could easily be missed and then break in production the first time a user types an &
into your system somewhere.
I don't see the point of not escaping by default. What use cases does not escaping cover?
Not that what you laid out here will not work, but I think it about the features of the library. I found this library search for a way to convert an array to XML, which the library does well except for CDATA objects. I see this function...
public function addChild($child, ...$optionals) { return $this->chooseContext(function($cx) use ($child, &$optionals) { return $cx->addChild($child, ...$optionals); }); }
...and it seems there is no way to add a CDATA object adding an array to a node. It's not deal breaker, but it would be nice to have.
So this is still an issue and doesn't seem like it will ever be fixed. I've forked it to fix it (messily)
on addChild
for a simple string you can optionally use htmlentities() to encode your string
Usage
<?php
$options = [
'htmlentities' => $args, // use to optimistically encode (always run value through htmlentities())
'exceptionHtmlentities' => $args // use to encode and recreate element only if an exception with a message containing 'unterminated entity reference' is thrown
];
// only one option can be used at a time -- if you use htmlentities then exceptionHtmlentities will be ignored
$args = true; // use htmlentities() without args
$args = array(); // spread the array as arguments EX htmlentities($value, ...$args)
// use as defaults
$fluidDoc = new FluidXml(null, ['htmlentities' => $args]);
// use on addChild()
$fluidDoc->addChild('MyNode', 'someValue', ['htmlentities' => $args]);
//EX using as defaults, only on exception, with some htmlentities arguments
$fluidDoc = FluidXml(null, ['exceptionHtmlentities' => [(ENT_NOQUOTES | ENT_SUBSTITUTE)]]);
fork is at https://github.com/Fulfillment-dot-com/fluidxml/tree/specialchars
ADDITIONALLY: my branch moved composer.json
to the root of the directory because cloning from git on windows may not respect symlinks and honestly the only reason it isn't already like this is because the author has some vim issue or something -- its a personal preference that's been foisted on anyone trying to develop on this repo.
So to use this fork change your composer.json
like so:
{
"require": {
...
"servo/fluidxml": "dev-specialchars",
},
"repositories": [
...
{
"type": "git",
"url": "https://github.com/Fulfillment-dot-com/fluidxml.git"
}
],
}
P.S. It's a big bummer that so many members of FluidXml class are all private
instead of protected
. This could easily have been solved by extending FluidXml
to use a modified FluidInsertionHandler
but instead I had to do this whole fork just to manage it :/
Use htmlspecialchars()