CFPropertyList icon indicating copy to clipboard operation
CFPropertyList copied to clipboard

`CF$UID` cannot be squashed to integer

Open kiler129 opened this issue 3 years ago • 2 comments

Describe the bug Parsing plists produced by NSKeyedArchiver produces a tree of dependencies where references are marked with CF$UID pointing to the main tree. The library squashes these into integers making it impossible to distinguish references from real values.

To Reproduce Attempt to parse the following plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
    <dict>
        <key>test1</key>
        <dict>
            <key>CF$UID</key>
            <integer>123</integer>
        </dict>
        <key>test2</key>
        <integer>321</integer>
    </dict>
</plist>

As an array it will result in: image

This doesn't seem like a problem until you encounter a structure like:

<dict>
    <key>$class</key>
    <dict>
        <key>CF$UID</key>
        <integer>89</integer>
    </dict>

    <key>accumulatedDataSize</key>
    <integer>7723067</integer>

    <key>allURLs</key>
    <dict>
        <key>CF$UID</key>
        <integer>8</integer>
    </dict>
</dict>

This will parse to: image

As you can see it's impossible to distinguish what's is a reference and what is an integer. Obviously these are two different things. I traced the origin of this to https://github.com/TECLIB/CFPropertyList/commit/2ea0483806c989eb0518a767fa29a111bb29cb67 and no further changes has been made.

Expected behavior UID type should be decoded to a simple DTO which can have a toInteger() method.

kiler129 avatar Feb 15 '22 19:02 kiler129

@btry: can you maybe shed some light on this one? In the current state the data is being unavoidably lost while a plist is parsed and without a custom fork some plists representations are unusable. I can prepare a PR but I'm not sure what's the solution you guys at TECLIB will see as acceptable.


As a stop-gap a dirty workaround can be implemented without editing the library, which essentially reverts CFUid to CFDictionary while the plist is parsed. It's a dirty hack but it works. It has the downside of adding an extra ghost key to every CFUid dictionary (since there's no sane way to prevent import() => case 'dict' behavior);

/**
 * Fixes CFUid squashing to int, which causes data loss
 *
 * This exists solely as a workaround for https://github.com/TECLIB/CFPropertyList/issues/71
 */
class StrictCFPropertyList extends CFPropertyList
{
    private const CFUID_REF_KEY = 'CF$UID';
    private const CFUID_BUG_KEY = 'TL$BUG#71';

    protected function import(\DOMNode $node, $parent)
    {
        parent::import($node, $parent);

        //Essentially prevents magical conversion of CFDictionary to CFUid (which silently casts to int) for plain XMLs
        if ($parent instanceof CFDictionary) {
            $pv = $parent->getValue();
            if (isset($pv[self::CFUID_REF_KEY]) && \count($pv) === 1) {
                $parent->add(self::CFUID_BUG_KEY, new CFBoolean(true));
            }
        }
    }

    public function readBinaryObject()
    {
        $ret = parent::readBinaryObject();

        //Generates the CFDictionary back after it was squashed to CFUid for binary property lists
        if ($ret instanceof CFUid) {
            $ret = new CFDictionary([
                self::CFUID_REF_KEY => new CFNumber($ret->getValue()),
                self::CFUID_BUG_KEY => new CFBoolean(true)
            ]);
        }

        return $ret;
    }
}

Alternatively, a dirty reflection hack can be also implemented before the library is fixed, which doesn't cause any new keys to be added and ensures proper order is kept:

use CFPropertyList\CFUid;

class StrictCFUid extends CFUid
{
    public function toArray()
    {
        return ['CF$UID' => $this->getValue()];
    }
}
class StrictCFPropertyList extends CFPropertyList
{
    public function parseBinary($content = null)
    {
        parent::parseBinary($content);
        $this->fixSquashedCFUid($this);
    }

    protected function import(\DOMNode $node, $parent)
    {
        parent::import($node, $parent);
        if ($parent === $this) {
            $this->fixSquashedCFUid($this);
        }
    }

    private function fixSquashedCFUid(iterable $iter): void
    {
        $propRefs = [];

        foreach ($iter as $key => $item) {
            if (is_iterable($item)) {
                $this->fixSquashedCFUid($item);
            } elseif ($item instanceof CFUid) {
                $newCFUid = new StrictCFUid($item->getValue());
                if ($iter === $this) {
                    $this->value[$key] = $newCFUid;
                } elseif ($iter instanceof CFDictionary || $item instanceof CFArray) {
                    //CFDictionary has del() and add() but the setValue() is noop-ed, so there's no other way to swap it
                    // without altering the order
                    //CFArray has absolutely no way to replace a key and there order is crucial
                    $propRefs[$iter::class] ??= (new \ReflectionClass($iter::class))->getProperty('value');
                    $propRefs[$iter::class]->setAccessible(true);
                    $val = $propRefs[$iter::class]->getValue($iter);
                    $val[$key] = $newCFUid;
                    $propRefs[$iter::class]->setValue($iter, $val);
                } else {
                    throw new \RuntimeException('Unknown type ' . $iter::class);
                }
            }
        }
    }
}

Obviously this is not a long-term solution, but it's better than a custom fork.

kiler129 avatar Mar 03 '22 19:03 kiler129

Hi

I'm busy for a while; I'll try to find some time to investigate on it.

btry avatar Mar 07 '22 10:03 btry