
CouchDB instance without search node crashes when calling a search request

Open mojito317 opened this issue 4 years ago • 4 comments

Description

CouchDB crashes if I bombard it with search requests when the search node does not exist. See this gist file for the logs.

Steps to Reproduce

I could only reproduce the issue when I ran unit tests.

  1. Start a plain CouchDB instance that does not contain the _search functionality. I used the latest docker image.
  2. Run the CloudantDatabaseTests in the tests/unit/database_tests.py file from the cloudant/python-cloudant repo. I tried to skip as many tests as possible; keeping only the following tests should be enough to reproduce the crash:
  • https://github.com/cloudant/python-cloudant/blob/master/tests/unit/database_tests.py#L1081-L1121
  • https://github.com/cloudant/python-cloudant/blob/master/tests/unit/database_tests.py#L1618-L1643
  • https://github.com/cloudant/python-cloudant/blob/master/tests/unit/database_tests.py#L1645-L1677

Expected Behaviour

CouchDB should not crash.

Your Environment

  • CouchDB version used: 3.1.1
  • Browser name and version: not a browser, I tried it via python-cloudant (master)
  • Operating system and version: macOS Big Sur, 11.3.1

Additional Context

You have to set the following env vars when starting nosetests:

DB_USER={uname};DB_PASSWORD={pwd};DB_URL=http://127.0.0.1:5984;RUN_CLOUDANT_TESTS=false
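The semicolon-separated form above is how some IDE run configurations store environment variables. As a minimal sketch, assuming the tests are launched with the `nosetests` command from the root of a python-cloudant checkout, the same values can be passed from a script. The helper names `parse_env_string` and `run_tests` are hypothetical and not part of python-cloudant.

```python
import os
import subprocess


def parse_env_string(spec):
    """Turn 'A=1;B=2' into {'A': '1', 'B': '2'}, skipping empty items."""
    env = {}
    for item in spec.split(";"):
        if not item:
            continue
        # Split on the first '=' only, so values may contain '='.
        key, _, value = item.partition("=")
        env[key.strip()] = value.strip()
    return env


def run_tests(spec, test_path="tests/unit/database_tests.py"):
    """Run nosetests with the parsed variables added to the current environment.

    Assumption: nosetests is on PATH and the working directory is the
    repository root.
    """
    env = dict(os.environ)
    env.update(parse_env_string(spec))
    return subprocess.run(["nosetests", test_path], env=env).returncode
```

Replace the `{uname}` and `{pwd}` placeholders with the admin credentials of the CouchDB instance before passing the string to `run_tests`.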

mojito317 avatar Jun 04 '21 15:06 mojito317

All right, so the strange part is that I can reproduce this on the couchdb:3 Docker image with the following ddoc, but can't reproduce it when building from the repo's 3.x branch (or from tag 3.1.1) and changing dev/run's .ini configs to match Docker's.

File: alpha.json

{
    "indexes": {
        "searchindex001": {
            "index": "function(doc) { index(\"default\", doc._id); }"
        }
    }
}

Against docker:

$ curl -q -K .curlrc http://127.0.0.1:15984/koi -X PUT
{"ok":true}

$ curl -q -K .curlrc http://127.0.0.1:15984/koi -X POST -d '{"name": "Alice", "number": 42}'
{"ok":true,"id":"d9d564521ee139ec83134600f4000e0f","rev":"1-7b54ca479c9043f8cc4cd83777ce6b75"}

$ curl -q -K .curlrc http://127.0.0.1:15984/koi/_design/alpha -X PUT --data @alpha.json
{"ok":true,"id":"_design/alpha","rev":"1-b445a33cf17da10d2f2502f68f58462b"}

$ curl -q -K .curlrc http://127.0.0.1:15984/koi/_design/alpha/_search/searchindex001 -X POST -d '{"query": "name:Alice*"}'
{"error":"{badarg,[{erlang,monitor,[process,{main,'[email protected]'}],[]},\n         {ioq,submit_request,2,[{file,\"src/ioq.erl\"},{line,187}]},\n         {ioq,maybe_submit_request,1,[{file,\"src/ioq.erl\"},{line,150}]},\n         {ioq,handle_info,2,[{file,\"src/ioq.erl\"},{line,123}]},\n         {gen_server,try_dispatch,4,[{file,\"gen_server.erl\"},{line,616}]},\n         {gen_server,handle_msg,6,[{file,\"gen_server.erl\"},{line,686}]},\n         {proc_lib,init_p_do_apply,3,[{file,\"proc_lib.erl\"},{line,247}]}]}","reason":"{gen_server,call,\n            [ioq,\n             {request,<0.286.0>,{pread_iolist,4490},other,<0.435.0>,undefined},\n             infinity]}","ref":2383229562}

Against repo:

$ curl -q -K .curlrc http://127.0.0.1:15984/koi -X PUT
{"ok":true}

$ curl -q -K .curlrc http://127.0.0.1:15984/koi -X POST -d '{"name": "Alice", "number": 42}'
{"ok":true,"id":"d9d564521ee139ec83134600f4000e0f","rev":"1-7b54ca479c9043f8cc4cd83777ce6b75"}

$ curl -q -K .curlrc http://127.0.0.1:15984/koi/_design/alpha -X PUT --data @alpha.json
{"ok":true,"id":"_design/alpha","rev":"1-b445a33cf17da10d2f2502f68f58462b"}

$ curl -q -K .curlrc http://127.0.0.1:15984/koi/_design/alpha/_search/searchindex001 -X POST -d '{"query": "name:Alice*"}'
{"error":"ou_est_clouseau","reason":"Could not connect to the Clouseau Java service at [email protected]"}

Since HEAD behaves as expected and returns the ou_est_clouseau error, I suspect this is some kind of configuration issue, but I can't figure out which values in the ini config reproduce it.

If anyone manages to induce this locally in a dev environment, please leave the steps in a comment.
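For anyone who wants to repeat the sequence without retyping it, the four curl calls above can be scripted. This is only a sketch using the Python standard library: the base URL, database name and ddoc are copied from the session above, while the function names and the use of basic auth (standing in for whatever `.curlrc` supplies) are assumptions.

```python
import base64
import json
import urllib.error
import urllib.request

BASE_URL = "http://127.0.0.1:15984"  # port taken from the curl calls above
DB = "koi"

# Same design document as alpha.json.
DDOC = {
    "indexes": {
        "searchindex001": {
            "index": 'function(doc) { index("default", doc._id); }'
        }
    }
}


def build_steps(base_url=BASE_URL, db=DB):
    """Return (method, url, body) tuples mirroring the four curl calls."""
    search_url = f"{base_url}/{db}/_design/alpha/_search/searchindex001"
    return [
        ("PUT", f"{base_url}/{db}", None),
        ("POST", f"{base_url}/{db}", {"name": "Alice", "number": 42}),
        ("PUT", f"{base_url}/{db}/_design/alpha", DDOC),
        ("POST", search_url, {"query": "name:Alice*"}),
    ]


def send(method, url, body, user, password):
    """Send one request and return (status, text), including for HTTP errors."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Content-Type", "application/json")
    # Assumption: the server accepts HTTP basic auth for these credentials.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", f"Basic {token}")
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8")


def run(user, password):
    """Execute all steps in order and print each response."""
    for method, url, body in build_steps():
        status, text = send(method, url, body, user, password)
        print(method, url, status, text.strip())
```

Calling `run("<user>", "<password>")` against the Docker container and against a dev/run node should show whether the last step returns the badarg trace or the ou_est_clouseau error.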

eiri avatar Aug 26 '21 16:08 eiri

Intriguing! I think I figured it out. The difference is that the couchdb:3 container is configured without any node name and is not running in distributed mode. In that mode trying to monitor the remote Clouseau process triggers a badarg error:

# /opt/couchdb/erts-9.3.3.14/bin/erl -boot /opt/couchdb/releases/3.1.1/start_clean
Erlang/OTP 20 [erts-9.3.3.14] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V9.3.3.14  (abort with ^G)
1> erlang:monitor(process, {main, '[email protected]'}).
** exception error: bad argument
     in function  monitor/2
        called as monitor(process,{main,'[email protected]'})

whereas in dev mode we are running a named CouchDB Erlang node, and the behavior changes to deliver a 'DOWN' message to the mailbox:

# /opt/couchdb/erts-9.3.3.14/bin/erl -boot /opt/couchdb/releases/3.1.1/start_clean -name [email protected]
Erlang/OTP 20 [erts-9.3.3.14] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V9.3.3.14  (abort with ^G)
([email protected])1> erlang:monitor(process, {main, '[email protected]'}).
#Ref<0.2127152124.2865758209.248364>
([email protected])2> receive M -> M end.
{'DOWN',#Ref<0.2127152124.2865758209.248364>,process,
        {main,'[email protected]'},
        noconnection}
([email protected])3> 

Not sure offhand what the right fix is here. Seems like we ought to be able to run gracefully without the Erlang distribution, although it's news to me that the Docker image is configured that way.
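From the client side, the two outcomes can be told apart by the error field of the response body. The following is a rough sketch, not part of CouchDB or python-cloudant; the function name and the returned labels are made up for illustration, and the matching is based only on the two responses quoted earlier in this thread.

```python
import json


def classify_search_failure(body):
    """Label a _search error body as one of the two outcomes seen in this thread."""
    payload = json.loads(body)
    error = str(payload.get("error", ""))
    if error == "ou_est_clouseau":
        # Distributed node: Clouseau is simply unreachable.
        return "clouseau_unreachable"
    if "badarg" in error and "erlang,monitor" in error:
        # Non-distributed node: monitoring the remote process raised badarg.
        return "monitor_badarg"
    return "other"
```

A test harness could use this to flag the non-distributed case explicitly instead of treating every search failure the same way.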

kocolosk avatar Aug 26 '21 21:08 kocolosk

From the Docker container README:

CouchDB also uses /opt/couchdb/etc/vm.args to store Erlang runtime-specific changes. Changing these values is less common. If you need to change the epmd port, for instance, you will want to bind mount this file as well. (Note: files cannot be bind-mounted on Windows hosts.)

and

NODENAME will set the name of the CouchDB node inside the container to couchdb@${NODENAME}, in the file /opt/couchdb/etc/vm.args. This is used for clustering purposes and can be ignored for single-node setups.

Try setting NODENAME?

wohali avatar Aug 27 '21 03:08 wohali

Hi @wohali yes, I can confirm that starting the container with a NODENAME specified restores the intended behavior. It just seems to me that we'd want to be better behaved inside CouchDB itself when running in non-distributed mode, but I'm not sure about the best way to effect that change.

kocolosk avatar Aug 27 '21 14:08 kocolosk

fixed by https://github.com/apache/couchdb/pull/4404

rnewson avatar Jan 26 '23 08:01 rnewson