Document CouchDB disaster recovery process
Description
I have a CouchDB 3.1 cluster with two VM nodes, and I back up the data and etc folders daily. Starting from those backups, I would like to recreate the same servers and data in a new environment (to simulate a complete outage of the old infrastructure). I set up two CouchDB instances, then restored the two folders, then changed the IP address in vm.args. Finally, I added and removed IP addresses in the cluster to match the actual configuration, following https://docs.couchdb.org/en/stable/cluster/nodes.html#adding-a-node
In the interface I can see all the databases, but they are in the status [This database failed to load.]. In the log file I have [Failed to ensure auth ddoc _users/_design/_auth exists for reason: read_failure]. If I try to browse _users/_design/_auth, I receive the error "internal_server_error" with reason "No DB shards could be opened." and ref 2822102114.
Is there a step-by-step guide for doing this?
Steps to Reproduce
Expected Behaviour
A disaster recovery procedure for a cluster, starting from backups.
Your Environment
- CouchDB version used: 3.1
- Browser name and version:
- Operating system and version: Ubuntu 18
Additional Context
Hi there,
This is not a CouchDB bug. GitHub is for actual CouchDB bugs only. I will, however, keep this open as a request for more documentation.
If you are looking for general support with using CouchDB, please try one of these other options:
The problem you're facing is that the -name parameter you're changing is used by CouchDB internally, for every database, to record where the shards are stored.
If your entries in -name were DNS entries, not IP addresses, you'd do exactly what you said, change DNS, and everything would be fine.
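To illustrate why the shards cannot be opened, here is a minimal sketch that compares the node names recorded in a database's shard map with the nodes that actually exist in the new cluster. It assumes the shard map uses the `by_node` and `by_range` layout described in the CouchDB sharding documentation. The function name `stale_nodes` and all node names are hypothetical.

```python
def stale_nodes(shard_map, cluster_nodes):
    """Return node names referenced by a shard map but absent from the cluster."""
    referenced = set(shard_map.get("by_node", {}))
    for nodes in shard_map.get("by_range", {}).values():
        referenced.update(nodes)
    return sorted(referenced - set(cluster_nodes))


# Hypothetical shard map restored from a backup taken with the old IP addresses.
shard_map = {
    "by_node": {
        "couchdb@10.0.0.1": ["00000000-7fffffff", "80000000-ffffffff"],
        "couchdb@10.0.0.2": ["00000000-7fffffff", "80000000-ffffffff"],
    },
    "by_range": {
        "00000000-7fffffff": ["couchdb@10.0.0.1", "couchdb@10.0.0.2"],
        "80000000-ffffffff": ["couchdb@10.0.0.1", "couchdb@10.0.0.2"],
    },
}

# Hypothetical node names in the new environment after editing -name.
new_cluster = ["couchdb@192.168.1.1", "couchdb@192.168.1.2"]

# Every shard still points at a node name that no longer exists.
print(stale_nodes(shard_map, new_cluster))
```

If the list is non-empty, the restored shard maps still refer to the old node names, which is consistent with the "No DB shards could be opened." error in the report.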
If you insist on using IP addresses, you're going to have to edit the _dbs document for every database you have, and replace every occurrence of each machine's old IP address with its new one. Not only is this error-prone, it's a pain in the arse :)
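For anyone who does need to go down that road, the rewrite of a single shard map document could be sketched as below. This is an illustration only: it assumes the `changelog`, `by_node` and `by_range` fields described in the CouchDB sharding documentation, works on an in-memory dict, and leaves fetching and saving the document out of scope. The helper `rename_nodes`, the database name and the node names are all hypothetical.

```python
import copy
import json


def rename_nodes(shard_map, renames):
    """Return a copy of a shard map with node names replaced per `renames`."""

    def swap(value):
        # Only strings can be node names; leave anything else untouched.
        return renames.get(value, value) if isinstance(value, str) else value

    updated = copy.deepcopy(shard_map)
    if "changelog" in updated:
        updated["changelog"] = [
            [swap(item) for item in entry] for entry in updated["changelog"]
        ]
    if "by_node" in updated:
        updated["by_node"] = {
            swap(node): ranges for node, ranges in updated["by_node"].items()
        }
    if "by_range" in updated:
        updated["by_range"] = {
            rng: [swap(node) for node in nodes]
            for rng, nodes in updated["by_range"].items()
        }
    return updated


# Hypothetical shard map for one database, as restored from backup.
shard_map = {
    "_id": "mydb",
    "changelog": [
        ["add", "00000000-ffffffff", "couchdb@10.0.0.1"],
        ["add", "00000000-ffffffff", "couchdb@10.0.0.2"],
    ],
    "by_node": {
        "couchdb@10.0.0.1": ["00000000-ffffffff"],
        "couchdb@10.0.0.2": ["00000000-ffffffff"],
    },
    "by_range": {
        "00000000-ffffffff": ["couchdb@10.0.0.1", "couchdb@10.0.0.2"],
    },
}

# Hypothetical mapping from old node names to new ones.
renames = {
    "couchdb@10.0.0.1": "couchdb@node1.example.com",
    "couchdb@10.0.0.2": "couchdb@node2.example.com",
}

updated = rename_nodes(shard_map, renames)
print(json.dumps(updated, indent=2))
```

The same mapping has to be applied to the shard map of every database, which is why this approach is easy to get wrong.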
So, switch to DNS, and always put -name [email protected] in vm.args, not IP addresses.
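As a quick sanity check before taking backups, something like the following sketch could flag a vm.args file whose -name value uses an IP address rather than a DNS name. It assumes the usual one-flag-per-line vm.args layout. The function `name_uses_ip` and the sample host names are hypothetical.

```python
import ipaddress


def name_uses_ip(vm_args_text):
    """Return True if the -name line in vm.args has an IP address as its host part."""
    for line in vm_args_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "-name":
            # Take the text after the last "@" as the host part.
            host = parts[1].rpartition("@")[2]
            try:
                ipaddress.ip_address(host)
                return True
            except ValueError:
                return False
    raise ValueError("no -name line found")


# Hypothetical vm.args contents.
ip_based = "# node name\n-name couchdb@10.0.0.1\n-setcookie monster\n"
dns_based = "-name couchdb@node1.example.com\n-setcookie monster\n"

print(name_uses_ip(ip_based))
print(name_uses_ip(dns_based))
```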
We addressed the -name portion of this in apache/couchdb-documentation#596.
Hi, I would like to contribute to this issue.
I can prepare a new documentation page covering:
- Correct steps for disaster recovery from backups
- How node names and DNS affect shard loading
- Why IP-based -name values cause failures
- A clear, step-by-step cluster restoration guide
- Common errors and their fixes
I reviewed the repository and plan to add the guide under src/docs/maintenance/. Please let me know if this approach looks good. I will open a PR accordingly.
Thanks!
Hi @piyahub,
If something is missing from the docs or could be improved, please go ahead and open a PR. We will discuss the changes there. You can find some contribution tips in our guide, and I would like to draw particular attention to the use of artificial intelligence.