mongodb-odm Performance improvements when counting inverse collections

Open alcaeus opened this issue 7 years ago • 3 comments

This was previously handled in #1086 for 1.x, but can't be done in 2.0 for repository methods (they are expected to return a cursor, which no longer supports triggering a count query).

While this performance improvement can lead to situations where counting gives one result and a subsequent iteration over the collection yields more results than previously counted, it is definitely not a good idea to read (and hydrate) all results just to count them.

A workaround may be to load data into memory but not hydrate it until iteration starts. This would eliminate the time penalty of hydration when counting a collection, but may still cause large memory consumption when counting a large collection.
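
A minimal sketch of this workaround, written in Python as a language-neutral illustration rather than the ODM's PHP API. The class `LazyHydratedCollection`, the `hydrate` callback, and the `hydrations` counter are all hypothetical names introduced for this example.

```python
from typing import Any, Callable, Dict, Iterator, List


class LazyHydratedCollection:
    """Hypothetical collection: keeps raw documents, hydrates only on iteration."""

    def __init__(
        self,
        raw_docs: List[Dict[str, Any]],
        hydrate: Callable[[Dict[str, Any]], Any],
    ) -> None:
        self._raw_docs = raw_docs   # raw data is already fully loaded into memory
        self._hydrate = hydrate     # assumed callback turning a raw document into an object
        self.hydrations = 0         # counter kept only to make the behaviour observable

    def __len__(self) -> int:
        # Counting looks at the raw data only, so no object is hydrated.
        return len(self._raw_docs)

    def __iter__(self) -> Iterator[Any]:
        # Hydration is deferred until a consumer actually iterates.
        for doc in self._raw_docs:
            self.hydrations += 1
            yield self._hydrate(doc)
```

Calling `len()` on such a collection leaves `hydrations` at zero, while the raw documents still occupy memory, which is the remaining cost described above.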

Dec 21 '17 06:12 alcaeus

Given that a repository method is expected to return an array of objects, it might be hard to defer hydration. I'm inclined to move this to 2.x for now.

Mar 14 '18 09:03 malarzm

In 1.x, a repository method is expected to return a cursor, not an array of objects. The problem in 2.0 arises because a cursor can no longer be counted, so we'd have to take the cursor, get its query information and run the appropriate count query. Personally, I see the memory issue as a big problem, as the only other option to solve this is to manually keep track of the number of associated referenced items.

Mar 14 '18 16:03 alcaeus

That should actually be doable; PersistentCollection would need to have its count() method tweaked.
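
A rough sketch of what such a tweak could look like, again in Python as a language-neutral illustration and not the actual PersistentCollection code. `FakeStore` and `InverseCollection` are hypothetical stand-ins; the assumption is that an uninitialized collection knows the criteria of its query and can issue a separate count query instead of loading its items.

```python
from typing import Any, Dict, Iterator, List, Optional


class FakeStore:
    """Hypothetical in-memory stand-in for a database collection."""

    def __init__(self, docs: List[Dict[str, Any]]) -> None:
        self.docs = docs
        self.count_queries = 0
        self.find_queries = 0

    def _matches(self, doc: Dict[str, Any], criteria: Dict[str, Any]) -> bool:
        return all(doc.get(key) == value for key, value in criteria.items())

    def count(self, criteria: Dict[str, Any]) -> int:
        # Returns only a number; no documents are handed to the caller.
        self.count_queries += 1
        return sum(1 for doc in self.docs if self._matches(doc, criteria))

    def find(self, criteria: Dict[str, Any]) -> List[Dict[str, Any]]:
        self.find_queries += 1
        return [doc for doc in self.docs if self._matches(doc, criteria)]


class InverseCollection:
    """Hypothetical collection that counts via a query until it is initialized."""

    def __init__(self, store: FakeStore, criteria: Dict[str, Any]) -> None:
        self._store = store
        self._criteria = criteria
        self._items: Optional[List[Dict[str, Any]]] = None  # None means "not loaded yet"

    def count(self) -> int:
        if self._items is not None:
            # Already loaded: counting the in-memory items is free.
            return len(self._items)
        # Not loaded: run a count query instead of fetching everything.
        return self._store.count(self._criteria)

    def __iter__(self) -> Iterator[Dict[str, Any]]:
        if self._items is None:
            self._items = self._store.find(self._criteria)
        return iter(self._items)
```

With this shape, a count taken before iteration can differ from the number of items seen during a later iteration if documents are added in between, which is the trade-off mentioned in the issue description.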

Mar 14 '18 16:03 malarzm
