Performance of "skip"
Description
I have a large CouchDB database of about 2 GB, and I also use document attachments.
Steps to Reproduce
time curl -X GET "URL/_all_docs?include_docs=true&attachments=true&limit=1&skip=1" => almost no time
time curl -X GET "URL/_all_docs?include_docs=true&attachments=true&limit=1&skip=500" => 16 seconds
time curl -X GET "URL/_all_docs?include_docs=true&attachments=true&limit=1&skip=1000" => 32 seconds
time curl -X GET "URL/_all_docs?include_docs=true&attachments=true&limit=1&skip=2000" => 62 seconds
(The URL must be quoted, otherwise the shell interprets the & characters and backgrounds the command.)
When looking at this, please note that I use limit=1 in all cases so the results are comparable. The documents and their attachments are all roughly the same size.
Expected Behaviour
I would expect the skip parameter to cause data to be fetched only from the skip position onward, and this should be fast regardless of the document's position.
It seems, however, that skip internally performs a full scan of all preceding rows and then throws away the rows that should not be delivered (skipped). This leads to unnecessarily time-consuming queries on large databases, especially when fetching documents at a higher index.
Your Environment
{ "couchdb": "Welcome", "version": "3.1.1", "git_sha": "ce596c65d", "uuid": "a6101edd1ffd110ace6ed3e32a80417f", "features": [ "access-ready", "partitioned", "pluggable-storage-engines", "reshard", "scheduler" ], "vendor": { "name": "The Apache Software Foundation" } }
- CouchDB version used: 3.1.1
- Browser name and version: curl
- Operating system and version: linux / docker
Additional Context
You're correct about the current implementation of the skip parameter. We keep statistics at the shard level that allow for efficient skipping of rows, but there's a bit of subtlety in computing the correct number of rows to skip in each shard of a database comprised of multiple shards. I sketched a solution to the problem in JIRA several years ago:
https://issues.apache.org/jira/browse/COUCHDB-2784
Quoting here:
For a database with Q shards and a request specifying ?skip=X we know that either a) at least one of the shards will end skipping at least X / Q rows, or b) the entire response body will be empty. So, I propose the following:
- Set the per-shard skip value to X div Q
- If a shard has fewer than X div Q rows remaining it should send its last row
- If X div Q is zero we can short-circuit and just use the current algorithm.
- The coordinator sorts the first responses from each shard. It then sends the key of the row that sorts first (let's call it Foo) back to all the shards
- Each shard counts the number of rows in between the original startkey and Foo and sends that number, then starts streaming with Foo as the new startkey
- The coordinator deducts the computed per-shard skip values from the user-specified skip and then takes care of the remainder in the usual way we do it today (i.e. by consuming the rows as they come in).
[T]he algorithm could be executed recursively, e.g. if after one pass we're still left trying to consume a million rows in the coordinator we can farm that out to the shards again and identify an even better start key (Foo 2.0). Lather, rinse, repeat until the number of rows that would be consumed by the coordinator is acceptably small.
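The quoted steps can be sketched as a toy simulation: plain Python over in-memory sorted key lists standing in for shards, assuming distinct keys. This is not CouchDB internals; all names here are illustrative, following the sketch above.

```python
import heapq

def skip_with_sharding(shards, skip):
    """Simulate the COUCHDB-2784 skip sketch: `shards` is a list of
    sorted key lists (one per shard); return the merged row stream
    with the first `skip` rows dropped."""
    q = len(shards)
    per_shard = skip // q
    if per_shard == 0:
        # Short-circuit: current algorithm -- merge everything and
        # drop `skip` rows in the coordinator.
        return list(heapq.merge(*shards))[skip:]

    # Each shard skips per_shard rows and sends its next row; a shard
    # with fewer rows remaining sends its last row instead.
    firsts = []
    for rows in shards:
        if len(rows) > per_shard:
            firsts.append(rows[per_shard])
        elif rows:
            firsts.append(rows[-1])
    if not firsts:
        return []

    # The response that sorts first becomes the new startkey ("Foo").
    foo = min(firsts)

    # Each shard counts its rows before Foo; the coordinator deducts
    # those counts and consumes the remainder as it streams in.
    skipped = sum(sum(1 for r in rows if r < foo) for rows in shards)
    remaining = [[r for r in rows if r >= foo] for rows in shards]
    return list(heapq.merge(*remaining))[skip - skipped:]
```

Because Foo sorts at or before each shard's response, no shard can have more than X div Q rows ahead of Foo, so the deducted total never exceeds the requested skip.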
It's a moderately complex patch, and won't apply to 4.0. Efficient skips might be possible in 4.0 but it depends a bit on what sorts of statistics we might maintain in any index data structures we implement over FoundationDB (FDB itself doesn't provide any).
I think we have a documentation bug here: we imply that large values of skip are fast for any release after 1.2, when it's really only fast for 1.x where x > 2. But I would not call this a bug in CouchDB itself.
For completeness' sake, this is where the documentation bug is: https://docs.couchdb.org/en/stable/ddocs/views/pagination.html#paging-alternate-method
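For reference, the alternate paging method that page describes (request limit = page_size + 1 rows and reuse the extra row's key as the next startkey, instead of an ever-growing skip) can be sketched against an in-memory stand-in for _all_docs. The helper names and the stand-in are illustrative, not CouchDB API:

```python
def all_docs(db_keys, startkey=None, limit=None):
    """Stand-in for GET /db/_all_docs: sorted keys >= startkey, capped at limit."""
    rows = [k for k in sorted(db_keys) if startkey is None or k >= startkey]
    return rows[:limit] if limit is not None else rows

def iter_pages(db_keys, page_size):
    """Yield pages of up to page_size keys using the limit=page_size+1 trick."""
    startkey = None
    while True:
        rows = all_docs(db_keys, startkey=startkey, limit=page_size + 1)
        if not rows:
            return
        yield rows[:page_size]
        if len(rows) <= page_size:
            return  # no extra row means this was the last page
        startkey = rows[page_size]  # the extra row becomes the next startkey
```

Each page request starts the index scan at startkey, so the cost per page stays constant no matter how deep into the database you are.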