fortune-mongodb icon indicating copy to clipboard operation
fortune-mongodb copied to clipboard

TODOs

Open gr0uch opened this issue 7 years ago • 15 comments

Mostly new features:

  • [x] Implement beginTransaction and endTransaction ~~using the $isolated operator. It doesn't provide rollbacks but it's better than nothing.~~ Using native implementation for MongoDB 4.0+

gr0uch avatar Jul 13 '17 08:07 gr0uch

MongoDB 4 now supports transactions. Does fortune mongodb support transactions and aggregations? Is it possible to get mongodb driver instance? How fortune find method works? Does it use multiple mongodb find commands or mongodb aggregation pipeline to get related documents?

ansarizafar avatar Oct 10 '18 10:10 ansarizafar

Ah yes, as you can see this was written before MongoDB 4. I'll look into implementing transactions. It does not use the aggregation pipeline. It seems to be useful for optimizing include option though.

You can get a reference to the underlying client by using this.client: https://github.com/fortunejs/fortune-mongodb/blob/deda6b72a9f8a512c533c8349ed23e3050e6d87a/lib/index.js#L38

gr0uch avatar Oct 10 '18 17:10 gr0uch

Mongodb aggregation is the new find, especially after the introduction of $lookup operator. Multiple find statements === multiple database roundtrips which means less performance.

ansarizafar avatar Oct 10 '18 17:10 ansarizafar

@ansarizafar multiple calls to find are not made unless used in conjunction with include option, that is why I mentioned it. Calling find once === one database roundtrip.

gr0uch avatar Oct 10 '18 17:10 gr0uch

I am talking about related documents which at the moment means multiple finds and multiple database roundtrips. I actually wanted to use https://github.com/genie-team/graphql-genie with Sapper https://sapper.svelte.technology/ for my new project. Graphql-genie uses fortunejs for data access. Graphql already has N+1 request problem and with nested queries, it becomes a huge performance issue. I request you to consider using mongodb aggregation pipeline with $lookup operator for all related document (include) queries. Please also add support for data aggregation.

ansarizafar avatar Oct 10 '18 19:10 ansarizafar

I'm not sure if the concern is warranted, I hope I'll try to explain it better. Fortune.js makes one roundtrip to the database (one call to find) per request, unless include is specified. Then it makes subsequent requests based on the include depth multiplied by number of included fields. For a typical use case of including 1-2 relationships, this would mean up to 3 roundtrips, independent of how many records are returned. The n+1 problem pertains to making separate database requests per record, which Fortune.js does not do, it actually batches all of the data at each nested level of include.

I agree that 1 aggregate request is faster than a few separate requests, I just don't think it would be that much faster since there is a fixed upper bound on the number of requests.

gr0uch avatar Oct 10 '18 20:10 gr0uch

Transaction support has been added: https://github.com/fortunejs/fortune-mongodb/commit/2d059cd6796f2bc440550d890440f79604d9e3a0

Published as 1.3.0.

Please note that this is an opt-in feature, since by default, MongoDB does not run in replica set mode, which is required for transactions.

gr0uch avatar Oct 10 '18 21:10 gr0uch

Thanks for adding transaction support. As mentioned above I am only talking about related records, multiple relationships mean multiple database roundtrips which is bad for a production-ready app/website with heavy traffic. MongoDB now has a better way of getting related documents with aggregation pipeline and $lookup operator is the recommended method

Aggregation is new find

I have read this quote somewhere on Mongodb website. My question is why multiple database round trips when we can fetch related documents with a single database request. I know graphql N+1 problem is a different issue but If fortunejs is used to fetch related records with a graphql https://github.com/genie-team/graphql-genie then it can become a huge performance bottleneck as graphql supports multi level nested queries. I would request you to consider using aggregation pipeline with $lookup operator to featch related records with one database request.

ansarizafar avatar Oct 11 '18 04:10 ansarizafar

I don't think you read what I wrote. Even with nested includes, the number of requests is O(1) since the nesting level is constant. You might be mistaken in thinking that it makes a seperate call to find for each related record, which is actually not the case. Related records are batched into as few finds as possible (specifically, the depth of included records). There is no N+1 problem, so I don't think your concern is legitimate.

gr0uch avatar Oct 11 '18 04:10 gr0uch

I think I have used the wrong terminology. Fortunejs performs one database roundtrip for each related collection. Multiple related collections mean multiple round trips. Is that correct? With $lookup we need a single round trip.

ansarizafar avatar Oct 11 '18 05:10 ansarizafar

I think there is no real problem, please let me know if you have actually analyzed database queries and found that there were excessive requests being made. I will wait.

In practical terms, you might have one or two nested includes for the vast majority of requests, which results in at most three round-trips, not an unbounded number. The complexity of adding such an optimization vastly outweighs the performance overhead of a constant number of additional queries. You seem to only understand that 3 is greater than 1, but you don't get that constant time is much better than linear time, so much that optimizing this will only see marginal improvements.

gr0uch avatar Oct 11 '18 05:10 gr0uch

Fortunejs approach was fine in old days but since Mongodb 3.6 $lookup is recommended by mongodb to fetch related collection. I was raising this issue in the context of Graphql specially https://github.com/genie-team/graphql-genie which is using fortunejs for data access. A multi-level nested graphql query with fortunejs means tens of database requests. Anyways If you don't like this idea then its ok.

ansarizafar avatar Oct 11 '18 05:10 ansarizafar

Do you understand that the version of MongoDB is not relevant to the topic at hand?

Please specify what "tens of database requests" means. It isn't even worth the effort arguing about this.

gr0uch avatar Oct 11 '18 05:10 gr0uch

Mongodb version is relevant as Mongodb introduced this new recommended method of fetching the related collections with $lookup with version 3.6. Anyone who has used graphql can understand what I mean. As I said its ok If you don't want to implement this feature. It's your project and you have every right to decide what you want.

ansarizafar avatar Oct 11 '18 05:10 ansarizafar

Whether one is using MongoDB less than 3.6 or greater than 3.6, the performance gains from using $lookup are minimal, and may even be slower! How is it possible that 2 requests are faster than 1?

Let's say that hypothetically, there are 10,000 records, each of them related to 1 record. By performing a $lookup on these 10,000 records, it will try to find this 1 record 10,000 times, and then it will embed this 1 record 10,000 times in the response. I am not even kidding, look at the examples in the official documentation for this pattern. Performance characteristics are very complex and can be hard to predict. You might have some theoretical concerns if the nesting level is unbounded, but in real world applications it is usually bounded.

gr0uch avatar Oct 11 '18 05:10 gr0uch