Internal query iterator's batch size limit is undocumented (Firestore/Datastore)
This library returns an iterator for queries. If no limit is specified, the iterator continues until no results remain, e.g.:
use Google\Cloud\Datastore\DatastoreClient;

$datastore = new DatastoreClient();
$query = $datastore->query()
    ->kind('Task')
    ->filter('done', '=', false);
$iterator = $datastore->runQuery($query);
foreach ($iterator as $entity) {
    // ... process each entity
}
What is the default value of limit if none is set? In other words, what is the max size of batches (pages) used by the internal iterator?
This is not documented in the Datastore limits documentation or in the Query reference.
Why does this matter?
We need to know the max value of limit so we can maximize the throughput of batch-processing query results. There are some mentions of setBatchSize() in the Python library, but this is not documented anywhere either. Additionally, we often need to insertBatch() n entities using results from the iterator, and need to know the max size of each page.
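For illustration, the pattern we have in mind looks roughly like this. This is a minimal sketch: the 'TaskArchive' kind and its properties are hypothetical, and the chunk size of 500 reflects the documented Datastore limit of 500 mutations per commit.

use Google\Cloud\Datastore\DatastoreClient;

$datastore = new DatastoreClient();
$query = $datastore->query()
    ->kind('Task')
    ->filter('done', '=', false);

$buffer = [];
foreach ($datastore->runQuery($query) as $entity) {
    // Derive a new entity from each query result ('TaskArchive' is a hypothetical kind).
    $buffer[] = $datastore->entity('TaskArchive', ['name' => $entity['name']]);

    // Flush in chunks of 500, the documented per-commit mutation limit.
    if (count($buffer) === 500) {
        $datastore->insertBatch($buffer);
        $buffer = [];
    }
}
if (!empty($buffer)) {
    $datastore->insertBatch($buffer);
}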
Thank you
Hi @calsmith, thank you for your question!
The default for resultLimit is 0, i.e. no limit; this can be seen here. So the iterator will iterate until there are no more results.
Looking at the Datastore Operation class for Operation::runQuery, it does not appear that resultLimit is configurable. We can look into adding that, so that you'd be able to do something like this:
$transaction->runQuery($query, ['resultLimit' => 100]);
Is this something that you would find useful?
As far as I can tell, there is no way to configure results per page in Datastore.
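One client-side workaround may be to page manually with Query::limit() and query cursors. This is a minimal sketch, assuming Query::limit(), Query::start(), and Entity::cursor() behave as documented; the page size of 100 is arbitrary:

use Google\Cloud\Datastore\DatastoreClient;

$datastore = new DatastoreClient();
$pageSize = 100; // arbitrary client-chosen page size

$query = $datastore->query()
    ->kind('Task')
    ->filter('done', '=', false)
    ->limit($pageSize);

do {
    $count = 0;
    $cursor = null;
    foreach ($datastore->runQuery($query) as $entity) {
        $count++;
        $cursor = $entity->cursor(); // cursor of the last entity in this page
        // ... process $entity
    }
    if ($cursor !== null) {
        $query->start($cursor); // resume the next page after the last result
    }
} while ($count === $pageSize);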
@Hectorhammett What do you think? I'm adding you to this issue so we can make sure this is configurable in Firestore V2 (and also so we can add it to Datastore V2).