[IMPROVEMENT] Improve the replica deletion workflow

Open shuo-wu opened this issue 2 years ago • 1 comments

Is your improvement request related to a feature? Please describe

As mentioned in this comment, the current replica removal workflow is:

  1. Longhorn manager directly stops the running replica process
  2. The engine process detects that the replica is unavailable, sets the replica's mode to ERR, then reports it to the Longhorn manager
  3. Longhorn manager removes the replica record from the engine spec (then status) after receiving the report, and asks the engine process to stop monitoring the replica.

Describe the solution you'd like

The workflow for removing a running replica can be the reverse of the volume attachment flow. It is also better to let Longhorn manager control everything rather than rely on the engine process's report:

  1. Longhorn manager asks the engine process to stop tracking/monitoring the running replica process.
  2. Longhorn manager waits for the tracking to stop, then deletes the replica process.

Not sure if this is applicable or makes the whole workflow easier. What do you think? @joshimoo @PhanLe1010 @innobead

shuo-wu avatar Aug 05 '22 10:08 shuo-wu

With your proposal, we can also avoid some misleading error messages that appear when the engine panics because it cannot connect to its replicas.

PhanLe1010 avatar Aug 05 '22 20:08 PhanLe1010

Hey team! Please add your planning poker estimate with Zenhub @derekbit @ejweber @PhanLe1010

shuo-wu avatar Mar 15 '24 01:03 shuo-wu