ml-app-deployer

Add tasks for retiring and detaching forests

Open khashayar opened this issue 4 years ago • 9 comments

Currently there is no way to delete a forest using ml-gradle.

khashayar avatar Oct 19 '20 08:10 khashayar

What interface do you have in mind here - i.e. what properties would you want to specify on the command line? Just -PforestName, or something else? How about deleting multiple forests at once, or deleting all forests for a database on a specific host or in a specific group?

rjrudin avatar Oct 19 '20 11:10 rjrudin

Also, what's the use case? "As an ml-gradle user, I want to delete a forest, so that...."

rjrudin avatar Oct 19 '20 12:10 rjrudin

I mainly need this task in association with removing a host from a cluster (mlRemoveHost). As you know, a host can only leave a cluster if there are no forests assigned to it, which is why I wanted to remove all the forests of a given host.

So back to your question, it would be nice to be able to:

  • Remove a single forest
  • Remove all the forests of a given database
  • Remove all the forests of a given host
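
The three scopes above amount to filtering a forest inventory by name, database, or host. A minimal sketch in Python, assuming each forest is described by a record with `name`, `database`, and `host` keys; the function name and record shape are hypothetical and not part of ml-gradle:

```python
from typing import Dict, Iterable, List, Optional


def select_forests(
    forests: Iterable[Dict[str, str]],
    name: Optional[str] = None,
    database: Optional[str] = None,
    host: Optional[str] = None,
) -> List[str]:
    """Return the names of forests matching every filter that is set.

    A filter left as None is ignored, so passing only `host` selects
    all forests on that host, and passing only `database` selects all
    forests of that database.
    """
    return [
        f["name"]
        for f in forests
        if (name is None or f["name"] == name)
        and (database is None or f["database"] == database)
        and (host is None or f["host"] == host)
    ]
```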

Thank you.

khashayar avatar Oct 19 '20 12:10 khashayar

Btw, speaking of mlRemoveHost: could that task be extended so that it handles removing the assigned forests before the host leaves the cluster, instead of just failing?

khashayar avatar Oct 19 '20 12:10 khashayar

If the use case is removing a host, then you'd normally follow the procedure below. Let's assume 3 hosts with 2 primary forests per host plus replicas, data in every primary forest, and host 3 as the one you want to remove:

  1. Retire the forests on host 3
  2. After the rebalancer copies all the data to primary forests on hosts 1 and 2, detach the forests on host 3

After doing the above, you can safely remove host 3.

You've used the verbs "remove" and "delete" - are you really looking to automate the retire/detach process? Because at least for the process of removing a host, there's not a use case for deleting the forests - unless there's no data in them that needs to be rebalanced (you'd still need to detach them though).

Note that if you are looking to automate retire/detach, they'd be two separate Gradle tasks, as the rebalancing process could take hours to finish depending on the amount of data.

rjrudin avatar Oct 19 '20 12:10 rjrudin

What you described is exactly my use case: retiring and detaching forests so that a host can leave a cluster.

I tested that manually with a forest of around 10 GB, and the rebalancing didn't take more than a few minutes. Since MarkLogic recommends a maximum of 200 GB per forest, I was guessing that automating this process, rebalancing included, should still be manageable.

If you think that in reality it doesn't make sense to automate the whole process, then disregard my request and I will fall back to the multiple Gradle task approach to achieve this.

khashayar avatar Oct 19 '20 13:10 khashayar

I think there's value in tasks like this:

./gradlew mlRetireForests -PdatabaseName=myDatabase -PhostNames=host3
./gradlew mlDetachForests -PdatabaseName=myDatabase -PhostNames=host3

The catch is - how does ml-gradle, or any client, know when the rebalancer is finished? That isn't an ml-gradle problem, it's a problem for any client of the Manage API. If there's a program that can query the Manage API to know when the rebalancer is finished, that can be made an optional part of mlRetireForests or a new task itself - e.g. mlWaitForRebalancer -PdatabaseName=myDatabase.
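
A minimal sketch of what such a waiting program could look like, in Python. It assumes the caller supplies a function that reports whether the rebalancer is still working; how that function would read the Manage API is exactly the open question and is left out here. The names `wait_for_rebalancer` and `is_rebalancing` and the timing parameters are hypothetical, not part of ml-gradle or ml-app-deployer:

```python
import time
from typing import Callable


def wait_for_rebalancer(
    is_rebalancing: Callable[[], bool],
    poll_seconds: float = 10.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until is_rebalancing() returns False or the timeout passes.

    Returns True if rebalancing finished, False if the timeout was hit.
    `sleep` and `clock` are injectable so the loop can be tested
    without real waiting.
    """
    deadline = clock() + timeout_seconds
    while True:
        # Hypothetical check; a real one would query the Manage API.
        if not is_rebalancing():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)
```

A timeout matters here because rebalancing can take hours, and a client that waits forever is hard to use in an automated deployment.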

You could then do everything like this:

./gradlew mlRetireForests mlWaitForRebalancer mlDetachForests -PdatabaseName=myDatabase -PhostNames=host3

And we could then have an "aggregate" task that does everything:

./gradlew mlRemoveForests -PdatabaseName=myDatabase -PhostNames=host3
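
The ordering such an aggregate task would have to enforce can be sketched as follows, in Python. This is only an illustration under stated assumptions: `retire`, `wait_for_rebalancer`, and `detach` are hypothetical callables supplied by the caller, and nothing here reflects an existing ml-gradle or ml-app-deployer API:

```python
from typing import Callable, Iterable, List


def remove_forests(
    forest_names: Iterable[str],
    retire: Callable[[str], None],
    wait_for_rebalancer: Callable[[], bool],
    detach: Callable[[str], None],
) -> List[str]:
    """Retire all forests, wait for the rebalancer, then detach them.

    Raises TimeoutError without detaching anything if the rebalancer
    does not report completion, so no forest is detached while it may
    still hold data that has not been moved.
    """
    names = list(forest_names)
    for name in names:
        retire(name)
    # Hypothetical wait step; returns True once rebalancing is done.
    if not wait_for_rebalancer():
        raise TimeoutError("rebalancer did not finish; forests left attached")
    for name in names:
        detach(name)
    return names
```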

Want to take a crack first at writing the program to know when the rebalancer is done?

rjrudin avatar Oct 19 '20 13:10 rjrudin

I like your idea of waiting for the rebalancer to finish and of chaining the tasks. It could also be used for the mlRemoveHost task!

khashayar avatar Oct 21 '20 07:10 khashayar

Moved this to ml-app-deployer as most of the work will need to occur here first.

rjrudin avatar Jul 21 '21 13:07 rjrudin