lithium icon indicating copy to clipboard operation
lithium copied to clipboard

Use Affected Rows in Collection for counting

Open shopblocks opened this issue 9 years ago • 4 comments

Performing a count on a RecordSet is very memory intensive.

Performing a count on 20,000 rows using

<?php count($result_set);

I can see that it uses the util/Collection class, which implements count like this:

<?php

public function count() {
    $count = iterator_count($this);
    $this->rewind();
    return $count;
}

I believe the slowness is caused by the iterator_count call, which loops over all of the records.

How come something like this can't be done:

<?php
public function count() {
    return count($this->_data);
}

A quick way to recreate the memory issues is to do the following:

<?php 
public function test_record_set()
    {
        $memory_start = memory_get_usage();

        $records = [];

        $num = 20000;

                foreach (range(1, $num) as $key => $value) {
                    $records[] = new Record(['data' => ['id' => $value]]);
                }

                $record_set = new RecordSet([
                    'data' => $records,
                ]);

        \app\libraries\utils\Debug::dd('Memory Used: ' . ( memory_get_usage() - $memory_start ));
    }

shopblocks avatar Jun 14 '16 14:06 shopblocks

Trying to count 20k objects in memory is a bad idea. It's better to run a separate query with the same conditions explicitly to get the count. RecordSet fetches records lazily and stores them in $this->_data, so it has to iterate through the entire set before $this->_data would be any good for counting anyway.

nateabele avatar Jun 14 '16 18:06 nateabele

Should $this->_data not be keeping an internal counter perhaps, so that if that is available, count($object) returns the already-calculated value?

shopblocks avatar Jun 17 '16 09:06 shopblocks

As @nateabele said, for your use case it's best to use a count query. You never want that many records in a result set at once. At one point you'll start processing them in some way.

I agree with @shopblocks that Collection could be optimized in a way: create a FixedCollection that allows us to pass a count manually, which will be returned by it instead of looking at it's items. The count would come from http://php.net/manual/de/pdostatement.rowcount.php and the MongoDB equivalent.

mariuswilms avatar Jun 17 '16 12:06 mariuswilms

Thats fine, we have changed our code to have a Model::count() query too, but this enhancement would still be useful.

shopblocks avatar Jun 17 '16 12:06 shopblocks