CFPropertyList
CFPropertyList copied to clipboard
`CF$UID` cannot be squashed to integer
Describe the bug
Parsing plists produced by NSKeyedArchiver
produces a tree of dependencies where references are marked with CF$UID
pointing to the main tree. The library squashes these into integers making it impossible to distinguish references from real values.
To Reproduce Attempt to parse the following plist:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>test1</key>
<dict>
<key>CF$UID</key>
<integer>123</integer>
</dict>
<key>test2</key>
<integer>321</integer>
</dict>
</plist>
As an array it will result in:
This doesn't seem like a problem until you encounter a structure like:
<dict>
<key>$class</key>
<dict>
<key>CF$UID</key>
<integer>89</integer>
</dict>
<key>accumulatedDataSize</key>
<integer>7723067</integer>
<key>allURLs</key>
<dict>
<key>CF$UID</key>
<integer>8</integer>
</dict>
</dict>
This will parse to:
As you can see it's impossible to distinguish what's is a reference and what is an integer. Obviously these are two different things. I traced the origin of this to https://github.com/TECLIB/CFPropertyList/commit/2ea0483806c989eb0518a767fa29a111bb29cb67 and no further changes has been made.
Expected behavior
UID type should be decoded to a simple DTO which can have a toInteger()
method.
@btry: can you maybe shed some light on this one? In the current state the data is being unavoidably lost while a plist is parsed and without a custom fork some plists representations are unusable. I can prepare a PR but I'm not sure what's the solution you guys at TECLIB will see as acceptable.
As a stop-gap a dirty workaround can be implemented without editing the library, which essentially reverts CFUid
to CFDictionary
while the plist is parsed. It's a dirty hack but it works. It has the downside of adding an extra ghost key to every CFUid dictionary (since there's no sane way to prevent import() => case 'dict'
behavior);
/**
* Fixes CFUid squashing to int, which causes data loss
*
* This exists solely as a workaround for https://github.com/TECLIB/CFPropertyList/issues/71
*/
class StrictCFPropertyList extends CFPropertyList
{
private const CFUID_REF_KEY = 'CF$UID';
private const CFUID_BUG_KEY = 'TL$BUG#71';
protected function import(\DOMNode $node, $parent)
{
parent::import($node, $parent);
//Essentially prevents magical conversion of CFDictionary to CFUid (which silently casts to int) for plain XMLs
if ($parent instanceof CFDictionary) {
$pv = $parent->getValue();
if (isset($pv[self::CFUID_REF_KEY]) && \count($pv) === 1) {
$parent->add(self::CFUID_BUG_KEY, new CFBoolean(true));
}
}
}
public function readBinaryObject()
{
$ret = parent::readBinaryObject();
//Generates the CFDictionary back after it was squashed to CFUid for binary property lists
if ($ret instanceof CFUid) {
$ret = new CFDictionary([
self::CFUID_REF_KEY => new CFNumber($ret->getValue()),
self::CFUID_BUG_KEY => new CFBoolean(true)
]);
}
return $ret;
}
}
Alternatively, a dirty reflection hack can be also implemented before the library is fixed, which doesn't cause any new keys to be added and ensures proper order is kept:
use CFPropertyList\CFUid;
class StrictCFUid extends CFUid
{
public function toArray()
{
return ['CF$UID' => $this->getValue()];
}
}
class StrictCFPropertyList extends CFPropertyList
{
public function parseBinary($content = null)
{
parent::parseBinary($content);
$this->fixSquashedCFUid($this);
}
protected function import(\DOMNode $node, $parent)
{
parent::import($node, $parent);
if ($parent === $this) {
$this->fixSquashedCFUid($this);
}
}
private function fixSquashedCFUid(iterable $iter): void
{
$propRefs = [];
foreach ($iter as $key => $item) {
if (is_iterable($item)) {
$this->fixSquashedCFUid($item);
} elseif ($item instanceof CFUid) {
$newCFUid = new StrictCFUid($item->getValue());
if ($iter === $this) {
$this->value[$key] = $newCFUid;
} elseif ($iter instanceof CFDictionary || $item instanceof CFArray) {
//CFDictionary has del() and add() but the setValue() is noop-ed, so there's no other way to swap it
// without altering the order
//CFArray has absolutely no way to replace a key and there order is crucial
$propRefs[$iter::class] ??= (new \ReflectionClass($iter::class))->getProperty('value');
$propRefs[$iter::class]->setAccessible(true);
$val = $propRefs[$iter::class]->getValue($iter);
$val[$key] = $newCFUid;
$propRefs[$iter::class]->setValue($iter, $val);
} else {
throw new \RuntimeException('Unknown type ' . $iter::class);
}
}
}
}
}
Obviously this is not a long-term solution, but it's better than a custom fork.
Hi
I'm busy for a while; I'll try to find some time to investigate on it.