couchdb {db}/_changes?filter=_selector lags towards almost the end of the response

Open igorski89 opened this issue 6 years ago • 1 comments

While benchmarking the performance of Mango selectors, I stumbled upon an interesting behaviour: near the end of the response, when only about 3 to 10 more changes remain to be received, the response pauses for 0.5 seconds or more, depending on the doc_count of the db, and then finishes as normal.

Expected Behavior

No pause/lag in the _changes feed response

Current Behavior

Consistent one-time pause/lag towards the end of the response. The bigger the db – the bigger the pause.

Possible Solution

Cannot suggest much at the moment, still looking into the code.

Steps to Reproduce (for bugs)

1. Run the latest version from the official Docker container: docker run -p 5984:5984 -e COUCHDB_USER=admin -e COUCHDB_PASSWORD=password --name couchdb apache/couchdb
2. Configure it as a single node.
3. Generate a db using the following script: https://gist.github.com/evsukov89/0101dd1fcea728ae6c5807eef70a7820
4. Query the _changes feed: curl -s -u admin:password -H "Content-Type: application/json" -X POST 'http://127.0.0.1:5984/db-1000/_changes?filter=_selector' --data '{"selector":{"owner":"global"}}'
5. Observe the lag towards the end of the response; see the example video: https://www.dropbox.com/s/jc99cnyytznbmt6/couchdb2_filter_selector_lag.mov?dl=0
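
To put a number on the pause instead of eyeballing it, the query step above can be timed line by line. The following is only a rough sketch: it assumes the URL, credentials and selector from the curl command above, and the helper names `stream_line_times` and `largest_gap` are made up for illustration. Client-side buffering may blur the timestamps somewhat, so treat the result as approximate.

```python
import base64
import json
import time
import urllib.request


def stream_line_times(url, body, user, password):
    """POST `body` as JSON and return [(arrival_time, line), ...] for each non-empty line."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    arrivals = []
    with urllib.request.urlopen(req) as resp:
        for raw in resp:  # the response object yields one line at a time
            line = raw.strip()
            if line:
                arrivals.append((time.monotonic(), line))
    return arrivals


def largest_gap(times):
    """Return (gap_seconds, index), where index is the line that arrived after the longest wait."""
    if len(times) < 2:
        return (0.0, 0)
    return max((times[i] - times[i - 1], i) for i in range(1, len(times)))


if __name__ == "__main__":
    # Values taken from the curl command in the steps above.
    url = "http://127.0.0.1:5984/db-1000/_changes?filter=_selector"
    arrivals = stream_line_times(url, {"selector": {"owner": "global"}}, "admin", "password")
    gap, index = largest_gap([t for t, _ in arrivals])
    print(f"{len(arrivals)} lines, longest pause {gap:.3f}s before line {index}")
```

If the behaviour described in this issue is present, the longest pause should show up a few lines before the end of the response.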

Change the NUMGLOBAL variable in the script to create more/less documents and experiment with different database sizes.
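
In case the gist is unavailable, a stand-in generator could look roughly like the sketch below. This is not the original script: the document shape, the `user-N` owner values, the helper names and the meaning of `num_global` are all assumptions, chosen only so that the `{"owner":"global"}` selector matches a subset of the documents. It uses the standard `PUT /{db}` and `POST /{db}/_bulk_docs` endpoints.

```python
import base64
import json
import urllib.request


def make_docs(num_global, num_private):
    """Build num_global docs owned by "global" plus num_private docs with other (hypothetical) owners."""
    docs = [{"_id": f"global-{i}", "owner": "global"} for i in range(num_global)]
    docs += [{"_id": f"private-{i}", "owner": f"user-{i % 10}"} for i in range(num_private)]
    return docs


def couch_request(method, url, user, password, body=None):
    """Send a JSON request with basic auth and return the decoded JSON response."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    data = None if body is None else json.dumps(body).encode()
    req = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    base = "http://127.0.0.1:5984/db-1000"
    # Creating the database raises an HTTPError if it already exists.
    couch_request("PUT", base, "admin", "password")
    # Roughly 10% "global" docs, mirroring the replicated subset described under Context.
    docs = make_docs(100, 900)
    for start in range(0, len(docs), 500):
        batch = docs[start:start + 500]
        couch_request("POST", base + "/_bulk_docs", "admin", "password", {"docs": batch})
```

Varying the two arguments to `make_docs` gives databases of different sizes for comparing the length of the pause.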

Context

I'm investigating Mango selectors for replication as an alternative to view-filtered _changes, which appears not to be functional in 2.x (see #831). In my use case, I operate in a multi-tenant environment (think one couch database per application instance/project/customer/etc.), where a subset of documents, typically around 10%, is replicated by mobile clients. Having a performant way to get a filtered _changes feed is a must.

While it's neither really a bug nor a blocker in my case, the delay was present in 100% of my tests and is proportional to the number of docs in the db. I would simply like to understand the root cause and the potential performance implications of having hundreds of DBs on the same cluster, with tens of thousands of mobile clients replicating and a typical db size of 50-80k docs.

Your Environment

Version used:

Docker version 17.12.0-ce, build c97c6d6

{"couchdb":"Welcome","version":"2.1.1","features":["scheduler"],"vendor":{"name":"The Apache Software Foundation"}}

Browser Name and version:

curl 7.54.0 (x86_64-apple-darwin17.0) libcurl/7.54.0 LibreSSL/2.0.20 zlib/1.2.11 nghttp2/1.24.0 
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp 
Features: AsynchDNS IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz HTTP2 UnixSockets HTTPS-proxy

Operating System and version (desktop or mobile):

ProductName:	Mac OS X
ProductVersion:	10.13.3
BuildVersion:	17D47

Link to your project: n/a

Feb 07 '18 11:02 igorski89

Cleaning some old issues, I was trying to validate if this performance problem still occurs.

It seems the reproducer gist script is now deleted. @igorski89 by chance would you still have it around somewhere?

Nov 04 '21 16:11 nickva

Closing, as there was no response and the reproducer gist is missing.

Nov 13 '23 04:11 nickva
