thin-edge.io
thin-edge.io copied to clipboard
Force manually service restart on upgrade
Proposed changes
If a user wants to update an existing thin-edge instance on it's device, he must actually stop services manually before any upgrade. It's better to use tedge disconnect
instead of stopping services directly with systemctl
, as the knowledge which services to handle is implemented in these commands. Finally it is consequent to let the user afterwards also start services also manually again (using tedge connect
) instead of trying to start in the DEB maintainer scripts.
Types of changes
What types of changes does your code introduce to thin-edge.io
?
Put an x
in the boxes that apply
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Improvement (general improvements like code refactoring that doesn't explicitly fix a bug or add any new functionality)
- [x] Documentation Update (if none of the other choices apply)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
Paste Link to the issue
None
Checklist
Put an x
in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.
- [x] I have read the CONTRIBUTING doc
- [x] I have signed the CLA (in all commits with git commit -s)
- [ ] I ran
cargo fmt
as mentioned in CODING_GUIDELINES - [ ] I used
cargo clippy
as mentioned in CODING_GUIDELINES - [ ] I have added tests that prove my fix is effective or that my feature works
- [x] I have added necessary documentation (if appropriate)
Further comments
If this is a relatively large or complex change, kick off the discussion by explaining why you chose the solution you did and what alternatives you considered, etc...
I'm a bit lost regarding the plan beyond that PR.
Okay, I have a better understanding after your latest comment on the self-update plugin.
So the plan is:
- The cloud admin user trigger an update of thin-edge.
- The agent receives this request via the mapper and forwards it the self-update plugin.
- The self-update plugin
- downloads the packages.
- calls
tedge disconnect
, stopping the agent and the mappers - install the packages
- let somewhere a message for the agent to tell the status of the self-update
- calls
tedge connect
, restarting the new agent and mappers using the new tedge. - makes sure the connection is back with
tedge connect <cloud> test
- and dies.
- On restart, the agent read the message from the plugin and forwards this status to the cloud via the mapper.
Is this correct?
What I understand now sounds as a good direction of work. Notably, I think we can then address corner cases as "what if tedge connect c8y --test
fails". The question is still open but we can at least detect the issue while this would have been trickier in the post-install scripts.
So the plan is:
[...]
Is this correct?
@didier-wenzek, yes, that's correct. Even though my aim for that PR was to have a more convenient update procedure for a customer/developer to update thin-edge manually. And as you stated above that's the procedure also to use for the self-update.
Notably, I think we can then address corner cases as "what if
tedge connect c8y --test
fails". The question is still open but we can at least detect the issue while this would have been trickier in the post-install scripts.
I agree. At least it gives the chance for the customer to add some device-specifc rollback or rescue stuff there. And in a next drop we may potentially think about implement some rollback/rescue already by thin-edge.
Since the expectation from the user is to have all the services shutdown before an upgrade, wouldn't it be better if we validate that too from our debian maintainer scripts? May be, from the
prerm
orpreinst
commands which will fail if they detect that these services are running during an update?
We already have such script in prerm
on tedge-agent
and tedge-mapper
.
if command -v systemctl >/dev/null; then
if systemctl is-active --quiet tedge-agent; then
print_hint "tedge-agent"
exit 1
fi
fi
The checked services are
- tedge-agent
- tedge-mapper-c8y
- tedge-mapper-az
- tedge-mapper-collectd
If user didn't stop those service, we don't remove old packages, means failing to upgrade and tell user to stop services.
No longer applicable, since restart behaviour on upgrade significantly changed since 0.8.0, see https://github.com/thin-edge/thin-edge.io/issues/1466