
Fix for Issue 832 - Replicate should only work for hosts > 3.

Open jamsilvia opened this issue 7 years ago • 11 comments

When generating replicas for forests, bootstrap checked whether the number of hosts was > 1 and skipped replication if not. However, wipe did not perform this check and failed if a bootstrap with replicas was attempted on a single host. Both behaviors are wrong in their own way, and both should be handled consistently: replicas should not be attempted unless the cluster has at least 3 hosts. When doing internal replicas on a cluster of < 3 hosts (which requires an explicit command line argument), a failure should be returned. When doing content replication on a cluster of < 3 hosts, it should only cause a warning: since the same ml-config may be used for a production cluster and for a single node in dev, this should not cause a failure. In this case the warnings are reported as part of the bootstrap result. #832
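To make the intended rule concrete, here is a minimal Ruby sketch of the decision described above. It is an illustration only, not code from this PR: `replication_outcome`, `MIN_REPLICA_HOSTS`, and the `internal_replicas` keyword are hypothetical names, and the sketch assumes replicas have actually been requested.

```ruby
# Hypothetical helper, not part of Roxy: encodes the rule from the description.
# Assumes replicas were requested (via ml-config or the command line).
MIN_REPLICA_HOSTS = 3

# Returns :ok when replicas can be created, :error when internal replicas
# were explicitly requested on too small a cluster, and :warning when
# content replication is configured but the cluster is too small.
def replication_outcome(host_count, internal_replicas: false)
  return :ok if host_count >= MIN_REPLICA_HOSTS

  internal_replicas ? :error : :warning
end

puts replication_outcome(1, internal_replicas: true)
puts replication_outcome(1)
puts replication_outcome(3)
```

The asymmetry is deliberate: internal replicas are only requested explicitly, so a hard failure is reasonable, while content replicas come from a shared ml-config and should degrade to a warning.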

jamsilvia avatar Sep 06 '17 02:09 jamsilvia

The PR was against the master branch. I edited it to rebase against dev instead..

grtjn avatar Sep 18 '17 13:09 grtjn

Let me run a test with this. It was intentionally ignoring replication and such in a single-host situation, assuming someone was attempting to deploy the project on a local box for dev-only purposes.

It would be nice if that is still possible, without needing to edit ml-config.xml..

grtjn avatar Sep 18 '17 13:09 grtjn

Ah, reading the description above, the single dev node case should be handled, but running a test won't harm anyhow..

grtjn avatar Sep 18 '17 13:09 grtjn

A quick initial test seems to show it works as described:

$ ./ml local bootstrap --replicate-internals
ERROR: Forest replication and failover requires at least 3 nodes to function properly.  This cluster has only 1 node(s).

and

$ ./ml ml9 bootstrap
Bootstrapping your project into MarkLogic 9 on ml9-ml1...
WARNING: Forest replication and failover requires at least 3 nodes to function properly.  This cluster only has 1 node(s).  Defaulting to using no replicas.

The warning is printed twice though. I might take a second look at that..

grtjn avatar Sep 18 '17 17:09 grtjn

It took me a while before the penny dropped. The hosts check is repeated for each database and each forest. I think we should optimize that a bit. Let me give some thought to how best to improve that..

grtjn avatar Sep 19 '17 10:09 grtjn

I can look at it if you are swamped. Let me know.

jamsilvia avatar Sep 19 '17 17:09 jamsilvia

Whoever is quickest.. :)

The current code is not entirely right either actually. Replication only works properly with a cluster of at least three nodes, but that is not what count($hosts) < 3 is checking. $hosts contains group-hosts, not all cluster hosts.

I was considering the thought of a kind of pre-flight check. So, instead of checking this during bootstrap itself, the bootstrap function in Ruby would invoke a different function first. Perhaps something like setup:check-config. Or, for the sake of simplicity, setup:do-setup could invoke a setup:check-config before invoking all the creates. And that check-config would then check for replication and hosts counts and such (and potentially other checks in future), and pass back the extra messages (preventing need for the $messages map:map)..

Or something like that..
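A rough sketch of the pre-flight idea, written in Ruby purely for illustration. The `setup:check-config` function proposed above does not exist yet, and the data shapes used here (`CheckResult`, a list of database hashes with a `:replicas` count, a `replicate_internals` keyword) are assumptions. The point is that the host count is checked once per run, against the whole cluster rather than per database or per forest, and the messages are handed back to the caller:

```ruby
# Hypothetical sketch; names and data shapes are assumptions, not Roxy's API.
REQUIRED_HOSTS = 3

CheckResult = Struct.new(:errors, :warnings)

# Runs once before any databases or forests are created.
# cluster_host_count is assumed to be the number of hosts in the whole
# cluster, not just the hosts in one group.
def check_config(cluster_host_count, databases, replicate_internals: false)
  result = CheckResult.new([], [])
  return result if cluster_host_count >= REQUIRED_HOSTS

  detail = "requires at least #{REQUIRED_HOSTS} hosts, cluster has #{cluster_host_count}"

  # Internal replicas are requested explicitly, so this is a hard error.
  result.errors << "Internal replication #{detail}." if replicate_internals

  # Content replicas come from the shared config, so only warn, and only once
  # no matter how many databases ask for replicas.
  if databases.any? { |db| db[:replicas].to_i > 0 }
    result.warnings << "Forest replication #{detail}. Defaulting to no replicas."
  end

  result
end

dbs = [{ name: "content", replicas: 1 }, { name: "modules", replicas: 1 }]
result = check_config(1, dbs)
puts result.warnings.length # one warning, even though two databases want replicas
```

Returning the messages from a single check would also remove the duplicate warning seen in the earlier test run.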

grtjn avatar Sep 19 '17 18:09 grtjn

It's a good point about the pre-flight. The problem I do see with checking in the Ruby is that we'll have to read and parse the ml-config, since the forest replica specification is in the ml-config. The check for internal replicas is in the Ruby since internal replicas are specified on the command line.

For hosts, I'll have to look at this again. The hosts should be what is being used for the forest/replica distribution; regardless of where they come from.

joe

jamsilvia avatar Sep 20 '17 01:09 jamsilvia

Yeah, I didn't mean parsing and checking the config with Ruby. setup:check-config would live in setup.xqy..

grtjn avatar Sep 20 '17 05:09 grtjn

Oh, I see what you mean. I didn't take the "setup:check-config" as a call back to xqy. Yes, that makes sense!

I plan to take some time tomorrow to work on it - let me know if you get to it first, so we are not both working on the same thing.

Thanks!

jamsilvia avatar Sep 20 '17 12:09 jamsilvia

FYI: I started poking at this..

grtjn avatar Sep 25 '17 13:09 grtjn