On-prem high availability

Open begriffs opened this issue 7 years ago • 0 comments

Why are we implementing it? (sales eng)

What are the typical use cases?

Communication goals (e.g. detailed howto vs orientation)

We have a section (http://docs.citusdata.com/en/v7.5/cloud/availability.html) which already covers:

  • Architecture diagrams
  • What is it
  • How does it work

But that is in the context of Cloud. What we want to describe now is setting up HA locally. Show the steps and commands.

Also include info about how to handle worker failure during streaming replication:

  • Update pg_dist_node with the new worker address, or refer to the workers by DNS name so that the name can be repointed at the replacement node
  • (No information is lost; queries simply return errors in the meantime)
  • (But the coordinator still needs to learn the new worker address)
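
As a starting point for the howto, the pg_dist_node update might be sketched like this. It is only an illustration, not a vetted procedure: the helper name `build_worker_repoint` and the host names are hypothetical, and it assumes the statement is run on the coordinator through a driver that uses `%s` placeholders (psycopg2, for example). It relies only on the `nodename` and `nodeport` columns of pg_dist_node. If the Citus version in use provides a dedicated node-update function, that is likely preferable to editing the catalog directly.

```python
from typing import Tuple


def build_worker_repoint(old_host: str, old_port: int,
                         new_host: str, new_port: int) -> Tuple[str, tuple]:
    """Build a parameterized UPDATE that repoints one worker in pg_dist_node.

    Hypothetical helper: it only constructs the statement and its arguments.
    Executing it against the coordinator is left to the caller.
    """
    sql = (
        "UPDATE pg_dist_node "
        "SET nodename = %s, nodeport = %s "
        "WHERE nodename = %s AND nodeport = %s"
    )
    # Argument order matches the placeholders: new address first, then old.
    return sql, (new_host, new_port, old_host, old_port)


# Example host names are made up for illustration.
sql, params = build_worker_repoint(
    "worker-1.internal", 5432, "worker-1-standby.internal", 5432
)
print(sql)
print(params)
```

With a driver such as psycopg2 the result would be passed as `cursor.execute(sql, params)` on a coordinator connection. When workers are registered by DNS name instead, no catalog change is needed; only the DNS record is repointed at the promoted standby.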

Good locations for content in docs structure

  • New "Availability" section in Administer?
  • Add link to new section from http://docs.citusdata.com/en/v7.5/admin_guide/cluster_management.html#dealing-with-node-failures

Corner cases, gotchas

Are there relevant blog posts or outside documentation about the concept/feature?

Dmitri has some docs

begriffs, Aug 08 '18 16:08