multipass icon indicating copy to clipboard operation
multipass copied to clipboard

[feat] Add wait-ready command to check Multipass daemon readiness

Open rluna319 opened this issue 6 months ago • 1 comments

Feature: Add multipass wait-ready command

Description: This PR introduces a new CLI command, multipass wait-ready. This command allows users and scripts to pause execution until the Multipass daemon is fully initialized and ready to accept requests. This is particularly beneficial for automation workflows that need to ensure the daemon is operational before proceeding with other Multipass operations.

The wait-ready command works by:

  1. Initially polling the daemon's gRPC socket until it becomes available.
  2. Once the socket is available, it sends a WaitReady request to the daemon.
  3. The daemon, upon receiving this request, verifies its readiness by attempting to update image manifests (which implicitly checks connectivity to image servers).
  4. The command supports a --timeout option (in seconds). If the daemon isn't ready within the specified timeout, the command exits with an error. If no timeout is given, it waits indefinitely.
  5. An animated spinner and context-relevant message provide user feedback during the waiting period.

Motivation:

  • Improves the reliability of automated scripts and CI/CD pipelines that interact with Multipass by providing a deterministic way to wait for daemon initialization.
  • Addresses scenarios where subsequent Multipass commands might fail if the daemon is still starting up.

Related Issues: Fixes #452 Fixes #3723

Key Implementation Changes:

  • Client-side (wait_ready.cpp): New command logic, including retry for socket connection, timeout handling (reusing mp::cmd::parse_timeout), and spinner integration.
  • Protocol (multipass.proto): Added WaitReadyRequest / WaitReadyReply messages and WaitReady RPC method.
  • Daemon-side (daemon_rpc.cpp, daemon.cpp):
    • New gRPC service handler in daemon_rpc.cpp.
    • Core readiness check in daemon.cpp leverages the existing wait_update_manifests_all_and_optionally_applied_force() function.

How to Test:

  1. Build this branch and $ cd build.
  2. Ensure the Multipass daemon is stopped.
  3. To verify --timeout, simply run the wait-ready command with the --timeout option:
$ ./bin/multipass wait-ready --timeout 5

Be sure not to start up the daemon first or the command might exit before the timeout.

  • Expected: Info message "Waiting for Multipass daemon to be ready...", spinner is shown, command times out after ~5 seconds with an appropriate error message "Timed out waiting for Multipass daemon to be ready."
  1. (Optional) To replicate the issue that happens when the client tries to use the daemon before its ready (without wait-ready), start up the daemon and immediately use multipass launch:
$ sudo -v
$ (sudo ./bin/multipassd &) && ./bin/multipass launch
  • Expected: Error message "launch failed: cannot connect to the multipass socket", "launch failed: Remote "" is unknown or unreachable.", or something like issue #3723
  1. Stop the Multipass daemon
$ sudo pkill multipassd
  1. To verify wait-ready command solves the issue, fit the wait-ready command between daemon startup and multipass launch:
$ sudo -v
$ (sudo ./bin/multipassd &) && ./bin/multipass wait-ready && ./bin/multipass launch

Optionally, add --timeout if desired.

  • Expected: Info message "Waiting for Multipass daemon to be ready...", spinner is shown, command exits successfully (ReturnCode::Ok) once the daemon is ready (i.e., can connect to image servers) and the client begins retrieving the default release image for the multipass launch process. You may Ctrl+C here to terminate the launch process.

Points for Reviewers:

  • Confirmation that using wait_update_manifests_all_and_optionally_applied_force(true) is an appropriate and robust proxy for overall daemon readiness.
  • Feedback on the client-side error messages and user experience (spinner, timeout message) is appreciated. The current spinner implementation addresses @ricab's suggestion in #452.
  • Unit Tests: Per @ricab 's comments in issue #452, unit tests have not yet been implemented for this initial PR to facilitate earlier review. I am prepared to add them based on feedback.

rluna319 avatar Jun 06 '25 02:06 rluna319

Hey @rluna319, thank you for this! We'll review when we get a chance.

ricab avatar Jun 06 '25 11:06 ricab

Hey @levkropp, would you mind doing a secondary review of this one? Than I can do a very light "tertiary" review.

ricab avatar Jun 27 '25 15:06 ricab

@rluna319, could you also rebase on the latest main, to see if we can get some CI to run on this? Appreciate it.

ricab avatar Jul 18 '25 19:07 ricab

@ricab I had not included your requested changes on my latest commit. I had not seen them before I made the push, my fault. I did rebase on the latest main however.

I will work on your requested changes ASAP.

Would you like me to amend your requested changes to my latest commit or have it in a separate commit?

rluna319 avatar Jul 18 '25 20:07 rluna319

Hey @rluna319 no worries. We generally prefer additional commits, to make it easy to identify new changes. Thanks!

ricab avatar Jul 18 '25 21:07 ricab

@ricab I've added all of your requested changes to the recent commit. I also rebased on the latest main as well.

rluna319 avatar Jul 22 '25 18:07 rluna319

Hey @rluna319, it looks like you haven't signed our CLA yet. Would you mind doing so, please? https://ubuntu.com/legal/contributors

Also, there are a few linting problems. Editors often have options to strip trailing whitespace and keep a single newline at the end. You can check whitespace sanity on a diff with git diff --check .... For example: git diff --check main...wait-ready.

git clang-format might take care of whitespaces too, if you set it up. That might be a good idea anyway because it's going to run once the whitespace check succeeds.

ricab avatar Jul 31 '25 10:07 ricab

@ricab I signed the CLA, went through linter errors (and other CI build/test errors) and fixed them to the best of my knowledge, ran the code through git clang-format and git diff --check main. I've also rebased off the latest main.

rluna319 avatar Jul 31 '25 19:07 rluna319

I removed some commented out includes that shouldn't have been in that commit.

rluna319 avatar Jul 31 '25 20:07 rluna319

I see that I still missed some linter errors. I thought I ran everything through clang-format correctly. I'll address those.

rluna319 avatar Aug 01 '25 00:08 rluna319

@ricab Ive addressed the issues. It says some of the workflows are awaiting approval to be run.

rluna319 avatar Aug 07 '25 00:08 rluna319

@ricab Ive addressed the issues. It says some of the workflows are awaiting approval to be run.

Approved the workflow.

xmkg avatar Aug 07 '25 06:08 xmkg

Thank you @xmkg.

I think that the remaining workflows that fail are due to some internal CI issues. Something about a missing token. Is this something I need to fix or can fix on my end?

rluna319 avatar Aug 07 '25 19:08 rluna319

Thank you @xmkg.

I think that the remaining workflows that fail are due to some internal CI issues. Something about a missing token. Is this something I need to fix or can fix on my end?

Yeah, unfortunately, our CI is not happy with PRs coming from external forks. We're working on making this process smoother, but the necessary machinery is not yet in place. Would you mind if I create a PR for your PR, just for the sake of running the pipeline? We can track the progress and make changes here.

xmkg avatar Aug 08 '25 07:08 xmkg

Oh ok. Sure thing.

rluna319 avatar Aug 08 '25 16:08 rluna319

Opened https://github.com/canonical/multipass/pull/4304.

xmkg avatar Aug 14 '25 07:08 xmkg

@rluna319, we've landed a bunch of CI-related fixes to main. Could you rebase this PR?

xmkg avatar Aug 14 '25 07:08 xmkg

@xmkg I have rebased the PR. Workflows awaiting approval.

rluna319 avatar Aug 15 '25 16:08 rluna319

@xmkg I have rebased the PR. Workflows awaiting approval.

Great, thanks. Synced #4304.

xmkg avatar Aug 15 '25 19:08 xmkg

@xmkg Great thank you. I see all the CI tests have passed in #4304. Should I mind the coverage report? I've looked at it already and I can cover the line in wait_ready.cpp that it says isn't covered.

However, i'm having trouble covering the other one in daemon.cpp where we catch any other std::exception. Throwing a runtime error in update_manifests just causes the test to hang. I'm guessing its the way that exception is or isn't handled in the asynchronous task. I'm not sure what other exception would make it to that catch block. Any other exceptions that ive traced within the codepath are already handled locally. Perhaps i'm just missing something i'm not aware of.

rluna319 avatar Aug 15 '25 23:08 rluna319

@xmkg Great thank you. I see all the CI tests have passed in #4304. Should I mind the coverage report? I've looked at it already and I can cover the line in wait_ready.cpp that it says isn't covered.

That would be nice.

However, i'm having trouble covering the other one in daemon.cpp where we catch any other std::exception. Throwing a runtime error in update_manifests just causes the test to hang. I'm guessing its the way that exception is or isn't handled in the asynchronous task.

I'd say don't worry about it. There's a way to trigger that exception catcher... but it's a bit cheeky, though. We could use a logger that throws an exception on .log(). That would make the coverage tool happy, but I'm okay with it as-is, too.

xmkg avatar Aug 18 '25 11:08 xmkg

@xmkg I added the coverage for wait_ready.cpp.

I tried to implement your suggestion, throwing and exception on log(), but wasn't able to do it. I kept getting the error that the Logger has no member named "gmock_log". I thought maybe I would have to add a mock method for log() in mock_logger.cpp but I felt that would be overkill if coverage for that line wasn't crucial. I tried doing a local mock method for the logger as well but I couldn't get that to work either. I may just lack the experience to implement this properly.

rluna319 avatar Aug 18 '25 23:08 rluna319

Awesome! Thank you for allowing me to contribute @xmkg @ricab. A great experience, I learned a lot. Looking forward to the next!

rluna319 avatar Aug 20 '25 18:08 rluna319

Our pleasure @rluna319! This was a great contribution, very well pondered. Feel free to send more in!

ricab avatar Aug 21 '25 10:08 ricab