cockpit icon indicating copy to clipboard operation
cockpit copied to clipboard

collection query populate depth is off by 1

Open beasteers opened this issue 3 years ago • 4 comments

tl;dr: change here:

if (is_numeric($maxlevel) && $maxlevel > -1 && $level > ($maxlevel+1)) {
// to
if (is_numeric($maxlevel) && ($maxlevel == 0 || $maxlevel > -1 && $level > $maxlevel)) {

What is this about?

This is in reference to the populate=1 query argument that is available when querying collection items that contain a CollectionLink field.

When looking into the source code, I noticed that the value of populate=1 also doubles as a max depth parameter, meaning that you can pass populate=2 which will populate 1 depth deeper (populate=3 etc.).

Here's the function signature that shows that max depth is used: https://github.com/agentejo/cockpit/blob/568b0124352f6d27df359e8c19a70d2dd1961e87/modules/Collections/bootstrap.php#L628

And where it is called and gets passed the populate parameter.: https://github.com/agentejo/cockpit/blob/568b0124352f6d27df359e8c19a70d2dd1961e87/modules/Collections/bootstrap.php#L269

So what's the issue?

The problem is that the code is currently structured so that the max depth parameter uses 0 to signify a depth of 1 and 1 to signify a depth of 2 and on top of being a bit confusing, it's problematic when you're also using it as a boolean.

used as a boolean to determine if we should populate here: https://github.com/agentejo/cockpit/blob/568b0124352f6d27df359e8c19a70d2dd1961e87/modules/Collections/bootstrap.php#L268

So what that means is that if you pass populate=0, nothing gets populated, and if you pass populate=1, two depths get populated (populate=2 is 3 depths).

Here's an example of the response format where I have a cyclical relationship between two collections:

# populate=0
{'A': {'_id': '60d0888d5e5e99226a588871', 'link': 'A'}}

# populate=1  <<< this is annoying and is the reason I started digging into this
{'A': {'B': {'A': {'_id': '60d0888d5e5e99226a588871', 'link': 'A'}}}}

# populate=2
{'A': {'B': {'A': {'B': {'_id': '6006307242da5d4b7f5e35c1', 'link': 'B'}}}}}

# populate=3
{'A': {'B': {'A': {'B': {'A': {'_id': '60d0888d5e5e99226a588871', 'link': 'A'}}}}}}

What's the solution

Well the most intuitive thing I can think of would be to make $maxlevel=0 mean don't populate, $maxdepth=1 be depth of 1, etc., and keep $maxlevel=-1 to mean populate everything.

Here's a draft solution

function cockpit_populate_collection(&$items, $maxlevel = -1, $level = 0, $fieldsFilter = []) {
    if (!is_array($items)) {
        return $items;
    }

    if (is_numeric($maxlevel) && ($maxlevel == 0 || $maxlevel > -1 && $level > $maxlevel)) { // this
        return $items;
    }
    ...
}

Compared to the original: https://github.com/agentejo/cockpit/blob/568b0124352f6d27df359e8c19a70d2dd1961e87/modules/Collections/bootstrap.php#L628-L636

So in the updated version that means:

  • $items (array of associative arrays): maxlevel=1, level=0, level > maxlevel == false, keys = [0, 1, 2...]
    • no links are resolved because level > 0 == false
  • $item (single associative array): maxlevel=1, level=1, level > maxlevel == false, keys = [val1, linked, ...]
    • the key linked is resolved because it is a collectionlink field
  • $item[linked] (single associative array): maxlevel=1, level=2, level > maxlevel == true, exit early because of maxlevel

beasteers avatar Jun 21 '21 23:06 beasteers

And actually, I was playing around with things and ended up in infinite recursion when populate == true (which equates to maxlevel == -1) for that cyclical case above

So maybe something like this as well could help prevent recursion.

function cockpit_populate_collection(&$items, $maxlevel = -1, $level = 0, $fieldsFilter = [], &$stack = []) {
    if (!is_array($items)) {
        return $items;
    }

    if (is_numeric($maxlevel) && ($maxlevel == 0 || $maxlevel > -1 && $level > $maxlevel)) {
        return $items;
    }

    foreach ($items as $k => &$v) {
        if ($level > 0 && is_array($v) && isset($v['_id'], $v['link'])) {
            $id = $v['_id'];
            $link = $v['link'];
            if (in_array([$id, $link], $stack)) {  // if _id is globally unique, then we can drop $link
                continue
            }
            $items[$k] = cockpit('collections')->_resolveLinkedItem($v['link'], (string)$v['_id'], $fieldsFilter);
            $items[$k]['_link'] = $link;
            $stack[] = [$id, $link];
        }
        $items[$k] = cockpit_populate_collection($items[$k], $maxlevel, ($level + 1), $fieldsFilter, $stack);
    }
    return $items;
}

beasteers avatar Jun 22 '21 02:06 beasteers

Thanks for your effort! Could you please create a PR?

aheinze avatar Jun 22 '21 11:06 aheinze

I can try maybe late this week or next, but I'll need to find an extended window for testing and whatnot.

Quick question - is '_id' unique across all collections? Just cuz I'd prefer in_array($id, $stack) rather than in_array([$id, $link], $stack) because the former is probs faster

beasteers avatar Jun 22 '21 18:06 beasteers

@aheinze K all set. So this issue can be resolved with #1459 which is pretty simple and doesn't touch much of the codebase (basically a single function and its single invocation place). I know how overburdened you are so I understand if this takes a little time to get pushed through.

beasteers avatar Jul 27 '21 13:07 beasteers