citus_docs
On-prem high availability
Why are we implementing it? (sales eng)
What are the typical use cases?
Communication goals (e.g. detailed howto vs orientation)
We have a section (http://docs.citusdata.com/en/v7.5/cloud/availability.html) which already covers:
- Architecture diagrams
- What is it
- How does it work
But that section is written in the context of Cloud. What we want to describe now is setting up HA locally (on-prem). Show the steps and commands.
Also include info about how to handle worker failure during streaming replication:
- Update pg_dist_node with the new worker address, or refer to workers by DNS name so the name can be repointed wherever it needs to point
- (No information is lost; users simply see errors)
- (But the coordinator will need to know the new worker address)
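
To make the pg_dist_node bullet concrete, here is a minimal sketch of the statement the new section could show. It assumes the worker is identified by the `nodename` and `nodeport` columns of `pg_dist_node`. The helper name `build_worker_update` and the hostnames are hypothetical, and the `%s` placeholders assume a psycopg2-style driver. It only builds the SQL and its parameters; running it against the coordinator is left to the caller.

```python
def build_worker_update(old_host, old_port, new_host, new_port):
    """Return (sql, params) that repoint one pg_dist_node row at a new address.

    Hypothetical helper for illustration; the statement is meant to be run
    on the coordinator, e.g. cursor.execute(sql, params).
    """
    sql = (
        "UPDATE pg_dist_node "
        "SET nodename = %s, nodeport = %s "
        "WHERE nodename = %s AND nodeport = %s"
    )
    # Parameter order matches the placeholders: new address first, then old.
    params = (new_host, new_port, old_host, old_port)
    return sql, params


# Hypothetical addresses: the failed worker and the standby that was promoted.
sql, params = build_worker_update(
    "worker-1.internal", 5432, "worker-1-standby.internal", 5432
)
print(sql)
print(params)
```

If workers are registered by DNS name instead, no metadata change is needed: the name is repointed at the promoted standby and the coordinator keeps using the same `nodename`.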
Good locations for content in docs structure
- New "Availability" section in Administer?
- Add link to new section from http://docs.citusdata.com/en/v7.5/admin_guide/cluster_management.html#dealing-with-node-failures
Corner cases, gotchas
Are there relevant blog posts or outside documentation about the concept/feature?
Dmitri has some docs