thin-edge.io icon indicating copy to clipboard operation
thin-edge.io copied to clipboard

Force manually service restart on upgrade

Open cstoidner opened this issue 3 years ago • 3 comments

Proposed changes

If a user wants to update an existing thin-edge instance on it's device, he must actually stop services manually before any upgrade. It's better to use tedge disconnect instead of stopping services directly with systemctl, as the knowledge which services to handle is implemented in these commands. Finally it is consequent to let the user afterwards also start services also manually again (using tedge connect) instead of trying to start in the DEB maintainer scripts.

Types of changes

What types of changes does your code introduce to thin-edge.io? Put an x in the boxes that apply

  • [ ] Bugfix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [x] Improvement (general improvements like code refactoring that doesn't explicitly fix a bug or add any new functionality)
  • [x] Documentation Update (if none of the other choices apply)
  • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)

Paste Link to the issue


None

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • [x] I have read the CONTRIBUTING doc
  • [x] I have signed the CLA (in all commits with git commit -s)
  • [ ] I ran cargo fmt as mentioned in CODING_GUIDELINES
  • [ ] I used cargo clippy as mentioned in CODING_GUIDELINES
  • [ ] I have added tests that prove my fix is effective or that my feature works
  • [x] I have added necessary documentation (if appropriate)

Further comments

If this is a relatively large or complex change, kick off the discussion by explaining why you chose the solution you did and what alternatives you considered, etc...

cstoidner avatar Feb 17 '22 16:02 cstoidner

I'm a bit lost regarding the plan beyond that PR.

Okay, I have a better understanding after your latest comment on the self-update plugin.

So the plan is:

  • The cloud admin user trigger an update of thin-edge.
  • The agent receives this request via the mapper and forwards it the self-update plugin.
  • The self-update plugin
    • downloads the packages.
    • calls tedge disconnect, stopping the agent and the mappers
    • install the packages
    • let somewhere a message for the agent to tell the status of the self-update
    • calls tedge connect, restarting the new agent and mappers using the new tedge.
    • makes sure the connection is back with tedge connect <cloud> test
    • and dies.
  • On restart, the agent read the message from the plugin and forwards this status to the cloud via the mapper.

Is this correct?

What I understand now sounds as a good direction of work. Notably, I think we can then address corner cases as "what if tedge connect c8y --test fails". The question is still open but we can at least detect the issue while this would have been trickier in the post-install scripts.

didier-wenzek avatar Feb 17 '22 18:02 didier-wenzek

So the plan is:

[...]

Is this correct?

@didier-wenzek, yes, that's correct. Even though my aim for that PR was to have a more convenient update procedure for a customer/developer to update thin-edge manually. And as you stated above that's the procedure also to use for the self-update.

Notably, I think we can then address corner cases as "what if tedge connect c8y --test fails". The question is still open but we can at least detect the issue while this would have been trickier in the post-install scripts.

I agree. At least it gives the chance for the customer to add some device-specifc rollback or rescue stuff there. And in a next drop we may potentially think about implement some rollback/rescue already by thin-edge.

cstoidner avatar Feb 18 '22 10:02 cstoidner

Since the expectation from the user is to have all the services shutdown before an upgrade, wouldn't it be better if we validate that too from our debian maintainer scripts? May be, from the prerm or preinst commands which will fail if they detect that these services are running during an update?

We already have such script in prerm on tedge-agent and tedge-mapper.

if command -v systemctl >/dev/null; then
    if systemctl is-active --quiet tedge-agent; then
        print_hint "tedge-agent"
        exit 1
    fi
fi

The checked services are

  • tedge-agent
  • tedge-mapper-c8y
  • tedge-mapper-az
  • tedge-mapper-collectd

If user didn't stop those service, we don't remove old packages, means failing to upgrade and tell user to stop services.

rina23q avatar Feb 22 '22 10:02 rina23q

No longer applicable, since restart behaviour on upgrade significantly changed since 0.8.0, see https://github.com/thin-edge/thin-edge.io/issues/1466

cstoidner avatar Nov 22 '22 14:11 cstoidner