DoctrineBatchUtils icon indicating copy to clipboard operation
DoctrineBatchUtils copied to clipboard

Add utility to iterate over a fetch-joined query

Open Ocramius opened this issue 5 years ago • 2 comments

As it currently stands, iteration over a DQL query like following will fail due to Query#iterate() not supporting it in ORM (https://github.com/doctrine/orm/issues/5868):

SELECT
    foo,
    bar
FROM
    MyEntity foo
LEFT JOIN
    foo.bars bar

In order to allow that, the ORM (luckily) comes with a relatively smart Paginator component.

I hacked together a prototype of how this could work in following (un-tested) snippet:

    public static function fromPaginator(
        Paginator $paginator,
        EntityManagerInterface $entityManager,
        int $batchSize
    ) : self {
        $currentBatchPosition = (int) $paginator->getQuery()->getFirstResult();
        $maxBatchPosition = min(((int) $paginator->getQuery()->getMaxResults()) + $currentBatchPosition, (int) $paginator->count());

        $nextPage = static function (
            Paginator $paginator,
            int $firstResult,
            int $maxResult
        ) : Paginator {
            $paginatorCopy = clone $paginator;

            $paginatorCopy->getQuery()->setFirstResult($firstResult);
            $paginatorCopy->getQuery()->setMaxResults($maxResult);

            return $paginatorCopy;
        };

        $iterator = (static function () use ($paginator, $currentBatchPosition, $maxBatchPosition, $batchSize, $nextPage) : \Generator {
            $cursorPosition = $currentBatchPosition;

            do {
                $pageWasEmpty = true;

                foreach ($nextPage($paginator, $cursorPosition, $batchSize) as $row) {
                    $pageWasEmpty = false;

                    yield $row;
                }

                if ($pageWasEmpty) {
                    return;
                }
                
                $cursorPosition += $batchSize;
            } while ($maxBatchPosition === 0 || $cursorPosition <= $maxBatchPosition);
        })();

        return self::fromTraversableResult($iterator, $entityManager, $batchSize);
    }

Two problems in the above:

  1. I hate how it is written: it's a mess
  2. unsure if it works for paginating over a query that already has well defined first/max results

We need accurate testing before considering it for inclusion.

Ocramius avatar Apr 22 '20 07:04 Ocramius

Thanks for the snippet, already pushed it in our 1.400 pods kubernetes cluster :+1:

Slamdunk avatar Apr 22 '20 07:04 Slamdunk

@Slamdunk anxiously waiting for a helm chart for it: no need for tests for that, just an MVP :P

Ocramius avatar Apr 22 '20 07:04 Ocramius