cht-core
Don't hard code COUCHDB_SECRET in docker compose files
**Describe the issue**
We hard-code the secret for CouchDB in the Arch v3 docker setup, as shown here:
"COUCHDB_SECRET=${COUCHDB_SECRET:-6c1953b6-e64d-4b0c-9268-2528396f2f58}"
This is insecure as it is public and will be used by default unless users override it.
**Describe the improvement you'd like**
We should dynamically generate this at install time, or mandate that users specify a unique one per install.
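A minimal sketch of both options (nothing here is the agreed fix; the `:?` substitution and the `openssl` invocation are just illustrations):

```sh
# Option A: make the variable mandatory in the compose file instead of defaulting it,
# using Compose's "${VAR:?error}" substitution:
#   "COUCHDB_SECRET=${COUCHDB_SECRET:?COUCHDB_SECRET must be set}"
#
# Option B: generate a unique secret per install and pass it in at deploy time:
export COUCHDB_SECRET="$(openssl rand -hex 16)"
docker-compose up -d
```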
**Describe alternatives you've considered**
NA
cc @garethbowen per our call today
I think we should not default the UUID either.
I had a look at this in the 7812-require-password branch but ran out of time to get the build to pass. I think there's some issue with making the entire cluster use the same secret and UUID...
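For a clustered setup, a rough sketch of what that implies, assuming the clustered compose file reads both values from the environment (the `COUCHDB_UUID` variable name is an assumption):

```sh
# Every node in the cluster has to be started with the same secret and UUID,
# so generate them once and reuse them for all nodes.
export COUCHDB_SECRET="$(openssl rand -hex 16)"
export COUCHDB_UUID="$(uuidgen)"
docker-compose -f docker-compose_cht-couchdb-clustered.yml up -d
```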
This is ready for AT on 7800-no-couch-secret.
Please make sure that:
- it works when using single node CouchDb
- it works when using clustered CouchDb
- for both, your sessions are persistent on container restart (don't remove the container, just restart it; see the sketch after this list)
- for both, your checkpointers are persistent on container restart (check that offline users don't download all docs again if you restart CouchDb)
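A minimal sketch of the session check, assuming a single-node container named `couchdb` and CouchDB reachable on `localhost:5984` (both are assumptions about your local setup):

```sh
# Restart (don't remove) the CouchDB container, then confirm a session cookie obtained
# before the restart is still accepted. If the secret changed across the restart,
# the old AuthSession cookie would be rejected.
docker restart couchdb
curl -si http://localhost:5984/_session -H "Cookie: AuthSession=<token-from-before-restart>"
```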
Compose files:
Thanks @dianabarsan for the steps and the files to test.
Here are the testing results using the files provided in the previous comment and the branch 7800-no-couch-secret
Using single node CouchDB
- The instance was up and running with no issues.
- The instance was persistent when the couchdb container was restarted.
Video attached
- I had problems when I tried to log in using an offline user. Not sure if I am missing something
Online user
Offline user
Video attached
Using clustered CouchDB
- Had an error when I tried to run `docker-compose up`. Error attached:
cht-api | RequestError: Error: getaddrinfo ENOTFOUND haproxy
cht-api | at new RequestError (/api/node_modules/request-promise-core/lib/errors.js:14:15)
cht-api | at Request.plumbing.callback (/api/node_modules/request-promise-core/lib/plumbing.js:87:29)
cht-api | at Request.RP$callback [as _callback] (/api/node_modules/request-promise-core/lib/plumbing.js:46:31)
cht-api | at self.callback (/api/node_modules/request/request.js:185:22)
cht-api | at Request.emit (node:events:527:28)
cht-api | at Request.onRequestError (/api/node_modules/request/request.js:877:8)
cht-api | at ClientRequest.emit (node:events:527:28)
cht-api | at Socket.socketErrorListener (node:_http_client:454:9)
cht-api | at Socket.emit (node:events:527:28)
cht-api | at emitErrorNT (node:internal/streams/destroy:157:8) {
cht-api | cause: Error: getaddrinfo ENOTFOUND haproxy
cht-api | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:71:26) {
cht-api | errno: -3008,
cht-api | code: 'ENOTFOUND',
cht-api | syscall: 'getaddrinfo',
cht-api | hostname: 'haproxy'
cht-api | },
cht-api | error: Error: getaddrinfo ENOTFOUND haproxy
cht-api | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:71:26) {
cht-api | errno: -3008,
cht-api | code: 'ENOTFOUND',
cht-api | syscall: 'getaddrinfo',
cht-api | hostname: 'haproxy'
cht-api | }
cht-api | }
cht-sentinel | RequestError: Error: getaddrinfo ENOTFOUND haproxy
cht-sentinel | at new RequestError (/sentinel/node_modules/request-promise-core/lib/errors.js:14:15)
cht-sentinel | at Request.plumbing.callback (/sentinel/node_modules/request-promise-core/lib/plumbing.js:87:29)
cht-sentinel | at Request.RP$callback [as _callback] (/sentinel/node_modules/request-promise-core/lib/plumbing.js:46:31)
cht-sentinel | at self.callback (/sentinel/node_modules/request/request.js:185:22)
cht-sentinel | at Request.emit (node:events:527:28)
cht-sentinel | at Request.onRequestError (/sentinel/node_modules/request/request.js:877:8)
cht-sentinel | at ClientRequest.emit (node:events:527:28)
cht-sentinel | at Socket.socketErrorListener (node:_http_client:454:9)
cht-sentinel | at Socket.emit (node:events:527:28)
cht-sentinel | at emitErrorNT (node:internal/streams/destroy:157:8) {
cht-sentinel | cause: Error: getaddrinfo ENOTFOUND haproxy
cht-sentinel | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:71:26) {
cht-sentinel | errno: -3008,
cht-sentinel | code: 'ENOTFOUND',
cht-sentinel | syscall: 'getaddrinfo',
cht-sentinel | hostname: 'haproxy'
cht-sentinel | },
cht-sentinel | error: Error: getaddrinfo ENOTFOUND haproxy
cht-sentinel | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:71:26) {
cht-sentinel | errno: -3008,
cht-sentinel | code: 'ENOTFOUND',
cht-sentinel | syscall: 'getaddrinfo',
cht-sentinel | hostname: 'haproxy'
cht-sentinel | }
cht-sentinel | }
On the error, it looks like the haproxy container failed to come up. Can you check the logs? It can be something as simple as a port clash or something else.
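Something like this should show why it didn't start (the container name is taken from the log output in this thread; the compose service name is an assumption):

```sh
# Inspect the haproxy container's startup logs
docker logs cht-haproxy

# Or dump all services' logs via compose (pass the same -f files used for `up`)
docker-compose logs
```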
> I had problems when I tried to log in using an offline user. Not sure if I am missing something
From the video, it looks like your browser doesn't accept the self-signed certificate and doesn't download the service worker, which is required for offline users. How do you usually handle self-signed certificates?
About the certificate problems: I have never had this issue before. I read about it, tried using Firefox, exported the certificates and added them to Keychain Access to be trusted, but that did not work for Chrome. I don't understand why, because it is working fine in Firefox, so I will need to investigate a little bit more. Meanwhile, I was testing that offline users didn't download all docs again when I restarted the CouchDb container.
Video attached
> offline users didn't download
Since your user only has 37 docs, you would not notice them downloading in a sync unless you inspected the network requests and checked how many docs the server sends back.
About the error when I tried to use the clustered CouchDB:
This is the error that the `cht-haproxy` container is showing:
backend couchdb-servers
balance leastconn
retry-on all-retryable-errors
log global
retries 5
# servers are added at runtime, in entrypoint.sh, based on couchdb
server couchdb couchdb:5984 check agent-check agent-inter 5s agent-addr healthcheck agent-port 5555
[alert] 276/204913 (1) : parseBasic loaded
[alert] 276/204913 (1) : parseCookie loaded
[alert] 276/204913 (1) : replacePassword loaded
[NOTICE] 276/204913 (1) : haproxy version is 2.3.19-0647791
[NOTICE] 276/204913 (1) : path to executable is /usr/local/sbin/haproxy
[ALERT] 276/204913 (1) : parsing [/usr/local/etc/haproxy/backend.cfg:7] : 'server couchdb' : could not resolve address 'couchdb'.
[ALERT] 276/204913 (1) : Failed to initialize server(s) addr.
I don't have a lot of knowledge of Docker, so I just tried changing the name of the `COUCHDB_SERVERS` in the `docker-compose_cht-core.yml` from `couchdb` to `couchdb.1`/`couchdb.2`/`couchdb.3`, just to see what happened.
Using `couchdb.1`, the container that failed this time was `cht-api`, with this error:
2022-10-04 20:55:13 INFO: Translations loaded successfully
2022-10-04 20:55:14 INFO: Running installation checks…
2022-10-04 20:55:14 INFO: Medic API listening on port 5988
2022-10-04 20:55:14 ERROR: Fatal error initialising medic-api
2022-10-04 20:55:14 ERROR: FetchError: invalid json response body at http://haproxy:5984/medic/_all_docs?include_docs=true&startkey=%22_design%2F%22&endkey=%22_design%2F%EF%BF%B0%22 reason: Unexpected token < in JSON at position 0
at /api/node_modules/node-fetch/lib/index.js:272:32
at processTicksAndRejections (node:internal/process/task_queues:96:5) {
message: 'invalid json response body at http://haproxy:5984/medic/_all_docs?include_docs=true&startkey=%22_design%2F%22&endkey=%22_design%2F%EF%BF%B0%22 reason: Unexpected token < in JSON at position 0',
type: 'invalid-json',
[stack]: 'FetchError: invalid json response body at http://haproxy:5984/medic/_all_docs?include_docs=true&startkey=%22_design%2F%22&endkey=%22_design%2F%EF%BF%B0%22 reason: Unexpected token < in JSON at position 0\n' +
' at /api/node_modules/node-fetch/lib/index.js:272:32\n' +
' at processTicksAndRejections (node:internal/process/task_queues:96:5)',
name: 'FetchError'
}
Using `couchdb.2` or `couchdb.3`, all the containers came up successfully, but I am seeing this error:
cht-sentinel | StatusCodeError: 503 - "<html><body><h1>503 Service Unavailable</h1>\nNo server is available to handle this request.\n</body></html>\n"
cht-sentinel | at new StatusCodeError (/sentinel/node_modules/request-promise-core/lib/errors.js:32:15)
cht-sentinel | at Request.plumbing.callback (/sentinel/node_modules/request-promise-core/lib/plumbing.js:104:33)
cht-sentinel | at Request.RP$callback [as _callback] (/sentinel/node_modules/request-promise-core/lib/plumbing.js:46:31)
cht-sentinel | at Request.self.callback (/sentinel/node_modules/request/request.js:185:22)
cht-sentinel | at Request.emit (node:events:527:28)
cht-sentinel | at Request.<anonymous> (/sentinel/node_modules/request/request.js:1154:10)
cht-sentinel | at Request.emit (node:events:527:28)
cht-sentinel | at IncomingMessage.<anonymous> (/sentinel/node_modules/request/request.js:1076:12)
cht-sentinel | at Object.onceWrapper (node:events:641:28)
cht-sentinel | at IncomingMessage.emit (node:events:539:35) {
cht-sentinel | statusCode: 503,
cht-sentinel | error: '<html><body><h1>503 Service Unavailable</h1>\n' +
cht-sentinel | 'No server is available to handle this request.\n' +
cht-sentinel | '</body></html>\n'
cht-sentinel | }
cht-haproxy | <150>Oct 4 21:37:00 haproxy[27]: 172.21.0.8,<NOSRV>,503,0,1,0,GET,/,-,admin,'-',222,-1,-,'-'
cht-api | StatusCodeError: 503 - "<html><body><h1>503 Service Unavailable</h1>\nNo server is available to handle this request.\n</body></html>\n"
cht-api | at new StatusCodeError (/api/node_modules/request-promise-core/lib/errors.js:32:15)
cht-api | at Request.plumbing.callback (/api/node_modules/request-promise-core/lib/plumbing.js:104:33)
cht-api | at Request.RP$callback [as _callback] (/api/node_modules/request-promise-core/lib/plumbing.js:46:31)
cht-api | at Request.self.callback (/api/node_modules/request/request.js:185:22)
cht-api | at Request.emit (node:events:527:28)
cht-api | at Request.<anonymous> (/api/node_modules/request/request.js:1154:10)
cht-api | at Request.emit (node:events:527:28)
cht-api | at IncomingMessage.<anonymous> (/api/node_modules/request/request.js:1076:12)
cht-api | at Object.onceWrapper (node:events:641:28)
cht-api | at IncomingMessage.emit (node:events:539:35) {
cht-api | statusCode: 503,
cht-api | error: '<html><body><h1>503 Service Unavailable</h1>\n' +
cht-api | 'No server is available to handle this request.\n' +
cht-api | '</body></html>\n'
cht-api | }
Not sure if this helps you or not, I just wanted to try different things 🙂
> ...you would not notice them downloading in a sync unless you inspected the network requests, and check how many docs the server sends back.
Thanks for pointing that out, @dianabarsan. I think this video is better, isn't it?
Video
Unfortunately no :(
Pouch <-> Couch replication is optimized to not download a document if it already exists locally (this check is made via the `_revs_diff` call).
In your case, you should inspect the response of the `_changes` requests after you restart the container (the one that doesn't fail). There should be no changes there at all (or 1-2 docs that were updated in the meantime).
Another option is to check that the `since` parameter is never rolled back, so you would look at every `/medic/_changes` request and check the `since` parameter, which should never go back to 0.
When checking, please be aware that there will be a _changes request for the users meta database. Checking that can also be used to verify, but then please be sure you manually sync once before restarting the container - the meta database doesn't automatically sync on startup.
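A rough sketch of what to look for (the host, cookie and sequence values are placeholders; the URL shape mirrors standard Pouch/Couch replication requests):

```sh
# After restarting CouchDB, the replication requests in the browser's network tab look
# roughly like this; `since` should continue from the last checkpoint, never reset to 0.
curl -s "https://<cht-host>/medic/_changes?style=all_docs&since=<last-seq>&limit=100" \
  -H "Cookie: AuthSession=<session-token>" --insecure
```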
> I don't have a lot of knowledge with docker so I just try changing the name of the COUCHDB_SERVERS in the docker-compose_cht-core.yml from couchdb to couchdb.1/couchdb.2/couchdb.3 just to see what happened.
Looking into how core-eng/sre architected this and what the readme specifies: you were right! You do need to set them, but separate them with `,`, not `/` ;) (Note: the readme is wrong here! We want to use a single `COUCHDB_SERVERS`, not discrete `COUCHDB1_SERVER` etc. - I'll open another PR to fix this tomorrow.)
I was able to use these steps to test with clustered couch on this branch:
- Download cht-core and clustered couch from this branch
- call compose up with:
COUCHDB_SERVERS="couchdb.1,couchdb.2,couchdb.3" COUCHDB_PASSWORD=password COUCHDB_USER=medic docker-compose -f docker-compose_cht-couchdb-clustered.yml -f docker-compose_cht-core.yml up
Thank you @dianabarsan and @mrjones-plip for your help.
I think that I have tested everything correctly this time, here are the results:
Using single node CouchDB
- The instance was up and running with no issues.
- The instance was persistent when the couchdb container was restarted.
- Using the offline user, the session persisted after the couchdb container was restarted, and the `since` parameter never went back to 0.
Video attached
Using clustered CouchDB
- Using the instructions from @mrjones-plip in the previous comment I was able to get the instance up and running with no issues.
- The instance was persistent when the couchdb container was restarted.
- And, same as with the single node, using the offline user the session persisted after the `couchdb.1`, `couchdb.2` and `couchdb.3` containers were restarted, and the `since` parameter never went back to 0.
Video attached
@dianabarsan please let me know if there is anything else that I am missing and should test, and thanks again - I learned a lot from this ticket.
Excellent testing, thank you so much @tatilepizs !
Merged to master