mongo-cursor-pagination icon indicating copy to clipboard operation
mongo-cursor-pagination copied to clipboard

[Feature Proposal] Compute cursor for each document

Open simararneja opened this issue 4 years ago • 18 comments

Feature request & proposal

Highlighting the relay specification for the cursor based pagination which specifies a cursor for each document. As things are today, this library returns pagination metadata as separate fields (next, previous, etc..) which points to the last document of the list.

I propose to add an additional field _cursor for each document which will allow the user to move in forward/backward direction from any node in a list.

Example -


{
 "results":
   [ 
      { 
        " _id" : "580fd16aca2a6b271562d8bb", 
        "counter": 4, 
        "_cursor": "eyIkb2lkIjoiNTgwZmQxNmFjYTJhNmIyNzE1NjJkOGJiIn0" },
      {
       " _id": "580fd16aca2a6b271562d8ba",
       "counter": 3, 
        "_cursor": "eyIkb2lkIjoiNTgwZmQxNmFjYTJhNmIyNzE1NjJkOGJhIn0" 
      } 
  ],
  "next": "eyIkb2lkIjoiNTgwZmQxNmFjYTJhNmIyNzE1NjJkOGJhIn0",
  "hasNext": "true" 
}

simararneja avatar Apr 22 '20 07:04 simararneja

Let me know your thoughts on this. If this seems ok, don't mind opening a PR for this.

simararneja avatar Apr 22 '20 07:04 simararneja

Interesting and could be useful. Would this be a replacement for the current API? What are the advantages of having _cursors on each result?

bradvogel avatar Apr 22 '20 12:04 bradvogel

What are the advantages of having _cursor on each result

  1. A step closer to relay specification for connection based pagination for backend services ;- )

  2. Allows to paginate forwards and backwards from any node - also virtually assists in slicing and add more to the result

  3. Allows to cache smaller lists or section of a result for faster loading of User interfaces. Example -For scrolling pagination it helps to fetch the rest of the list (or more) without having to cache the whole result and metadata.

Would this be a replacement for the current API?

I don't foresee it to be a replacement, more like an addition to the existing. Can be optionally enabled by adding another param say computeCursorOnEveryNode( or a better name) of type boolean (default can be false since it does additional computation post retrieving the results from db).

simararneja avatar Apr 23 '20 05:04 simararneja

Yes, sounds useful. I'd accept a PR. Perhaps for now make it an option?

bradvogel avatar Apr 23 '20 13:04 bradvogel

Yes, sounds useful. I'd accept a PR. Perhaps for now make it an option?

bradvogel avatar Apr 23 '20 13:04 bradvogel

Also, interest in, relay is very opinionated in terms of data structure that's why @simararneja is talking about cursor per document.

Relay is driving pagination spec for Graphql, a lot of services take it as "best practice". Having the ability to use document cursor would help to implement relay compliant webservers

ScreamZ avatar May 09 '20 11:05 ScreamZ

Also is there any way to get the total count instead of HasNext ? I can mimic that by running countDocuments with the query before the Mongo query but maybe you have another suggestion ? :)

ScreamZ avatar May 09 '20 12:05 ScreamZ

Also is there any way to get the total count instead of HasNext ? I can mimic that by running countDocuments with the query before the Mongo query but maybe you have another suggestion ? :)

This library doesn't provide a way to get a total count. I don't think we'd want that - even if it was opt-in-only - given the performance cost of running such query.

bradvogel avatar May 09 '20 18:05 bradvogel

Any updates on this?

cerinoligutom avatar Aug 19 '20 10:08 cerinoligutom

Forked the project to create a PR for this, but npm install is showing me this error: image @bradvogel any tips on how to make it work?

ayrtonvwf avatar Nov 28 '20 16:11 ayrtonvwf

For a project of mine we're sorting by the createdAt field, so the document cursor encoding is being done like that:

image

@bradvogel any tips on how to re-use some internal functions to implement it in a generic way?

ayrtonvwf avatar Nov 28 '20 16:11 ayrtonvwf

I'm implementing a POC based on this logic: image

First I implemented a generic cursor builder function: image

My guess is that in future we could replace that first logic with this function: image

Or maybe removing these if's at all, using the cursors already computed for the first and last result: image

What do you think about this approach @bradvogel?

I still got no chance to test it because of the npm install error though.

ayrtonvwf avatar Nov 28 '20 17:11 ayrtonvwf

@ayrtonvwf

Thanks for letting us know about the issue npm installing. That should now be fixed.

Can you share more about what you're trying to do? Are you putting a cursor on each document returned?

bradvogel avatar Dec 02 '20 17:12 bradvogel

@bradvogel thanks, npm install now is working fine!

I don't think putting a cursor on each document returned would be the best approach, that could introduce some hard to debug issues, but I think it's a good starting point as a prototype to show it working.

The optimal approach would be instead of returning an array of documents, return an array of results which each result would include the document and some metadata, including the cursor.

[
    {
        "cursor": "string",
        "document": {
            "_id": "string"
        }
    }, {
        "cursor": "string",
        "document": {
            "_id": "string"
        }
    }
]

That's the approach seen in Relay Connections specification, so a good slice of the community would be already familiar with it. The cursor of course would be the first metadata, but I can imagine it growing in the future with other requirements like a score property (like Elasticsearch's). This approach wouldn't be backwards compatible though, and I feel like maintaining two similar set of functions just to keep compatibility would lead to some headaches.

Another valid approach would be returning a separated array of cursors, where the relationship would be maintained with the indexes of the documents on the result:

{
    "results": [
        { "_id": "string" },
        { "_id": "string" }
    },
    "cursors": [
        "string",
        "string"
    ]
}

I see some friction with the developer experience here, but this approach would be backwards compatible and wouldn't mix the cursor with the document fields.

I'm looking forward to discuss the options here, I would love to see other's considerations about each approach:

  • document cursor injection: inject the cursor inside the document
  • result object: create one result object per document returned, containing the document itself and the cursor as metadata
  • cursor list: return a list of cursors maintaining the relationship with the list of documents based on the index of the arrays

ayrtonvwf avatar Dec 08 '20 18:12 ayrtonvwf

I'm trying to run the tests, but when I run npm run test I get the following output where it seems to be downloading MongoDB:

image

I do have a freshly intalled MongoDB instance runing on my machine available at localhost on the default port, as pointed in the readme.

After 5 seconds the tests seem to timeout: image

ayrtonvwf avatar Dec 08 '20 19:12 ayrtonvwf

Added the POC of this feature, waiting for guidance on which approach to take on this feature.

ayrtonvwf avatar Dec 08 '20 20:12 ayrtonvwf

Re: mongodb installing when running npm test. Strange. I'm not seeing that. My repro steps:

git clone [email protected]:mixmaxhq/mongo-cursor-pagination.git
cd mongo-cursor-pagination
npm install
npm test

Perhaps mongodb-memory-server (https://www.npmjs.com/package/mongodb-memory-server) isn't able to cache its version of Mongo.

Re: adding _cursor to each document. Looks like a useful feature. Perhaps make it opt-in via a parameter so we don't need to change the API?

bradvogel avatar Dec 10 '20 00:12 bradvogel

Pushed some updates in my PR to include the _cursor property on the results based on a new includeCursor param. It should be working but I still couldn't test it. Tried tweaking the version property of the MongoMemoryServer, it Didn't try to download it again but the results were the same when I:

  • Set MongoMemoryServer to the version of Mongo I was using (4.4.2)
  • Installed Mongo 3.6.21 on my machine (latests 3.6 available)

image

ayrtonvwf avatar Dec 10 '20 02:12 ayrtonvwf