bmcweb

Webui pages intermittently not loading, "0x0 Failed to capture connection" logged

Open zevweiss opened this issue 10 months ago • 7 comments

Is this the right place to submit this?

  • [X] This is not a security vulnerability or a crashing bug
  • [X] This is not a question about how to use OpenBMC
  • [X] This is not a bug in an OpenBMC fork or a bug in code still under code review.
  • [X] This is not a request for a new feature.

Bug Description

It's a bit unpredictable, but I'm seeing sporadic instances of webui-vue page reloads stopping without actually loading anything and leaving me at an empty page. It's usually (I suspect always, but I'm not 100% certain) accompanied by one or more of the following journal messages from bmcweb:

[CRITICAL http_connection.hpp:585] 0x0 Failed to capture connection

The errors seem to crop up more readily if I refresh a page while it's still actively loading (e.g. just hitting ctrl-R repeatedly in a browser without waiting for it to fully load in between), but they also happen sometimes when refreshing an idle, fully loaded page.

Version

OpenBMC commit f0053a50e6a423e12b68673c89b53938346a3af6 (bmcweb commit ac25adb8d491342fc5fd4e189c58b79be6f5835a).

Additional Information

I observed the problem on spc621d8hm3 and romed8hm3.

I'm not sure if it's related, but I've also seen some other CRITICAL errors logged occasionally (much less frequently), such as:

[CRITICAL error_messages.cpp:290] Internal Error /usr/src/debug/bmcweb/1.0+git/redfish-core/lib/account_service.hpp(1716:36) `redfish::handleAccountGet(App&, const crow::Request&, const std::shared_ptr<bmcweb::AsyncResp>&, const std::string&)::<lambda(const boost::system::error_code&, const dbus::utility::ManagedObjectType&)>`:

(just ending in a colon like that, looks like there'd be a string with more information but it's empty.)

zevweiss avatar Apr 16 '24 20:04 zevweiss

Can you provide the contents of the network tab in chrome/firefox when this error happens? It looks very similar to something that's already being worked on, which involves a race condition in webui-vue. https://discord.com/channels/775381525260664832/1219121974173896826

A "fix" is to remove the websocket handler entirely: https://gerrit.openbmc.org/c/openbmc/webui-vue/+/70641

edtanous avatar Apr 16 '24 21:04 edtanous

Not sure if there's a better form for this than a screenshot, but here's one capture: [screenshot: webui-hang]

And yes, after applying 70641 it does seem like the problem goes away -- though FWIW I do have -Drest=enabled in bmcweb.

zevweiss avatar Apr 19 '24 09:04 zevweiss

> -Drest=enabled

https://gerrit.openbmc.org/c/openbmc/webui-vue/+/70641 removed the /subscribe websocket

gtmills avatar Apr 29 '24 16:04 gtmills

@zevweiss can you verify that bmcweb master + webui-vue master solves your issues?

edtanous avatar Apr 30 '24 21:04 edtanous

Alas no, but for different reasons -- with bmcweb 5ffd11f248f1 and webui-vue 01492c3dcb I can't log in at all. Attempting to do so (entering a valid username & password) just repeatedly reloads the login page.

[screenshot: reqs_000]

zevweiss avatar May 02 '24 20:05 zevweiss

I haven't seen that before.... anything unique about your setup?

edtanous avatar May 07 '24 03:05 edtanous

I've got a bbappend with the following contents, but I think that's it:

python() {
    d.setVar("BMCWEB_HTTP_BODY_LIMIT", str((int(d.getVar("FLASH_SIZE")) // 1024) + 2))
}

EXTRA_OEMESON:append = " \
    -Dhttp-body-limit=${BMCWEB_HTTP_BODY_LIMIT} \
    -Drest=enabled \
    "

FWIW, I just tested again with webui-vue 2b33526c41c and still see the same problem.

zevweiss avatar May 07 '24 04:05 zevweiss
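
As a rough illustration of what the anonymous python function in the bbappend above computes, here is a standalone sketch of the same arithmetic. It assumes `FLASH_SIZE` is given in KiB and that `http-body-limit` is interpreted in MB; both are assumptions about the build configuration rather than something stated in this thread, and `body_limit_mb` is a hypothetical helper name.

```python
def body_limit_mb(flash_size: str) -> int:
    """Mirror the bbappend arithmetic: flash size (assumed KiB) -> limit (assumed MB)."""
    # Integer-divide to convert KiB to whole MiB, then add 2,
    # presumably as headroom so a full flash image still fits in one request body.
    return (int(flash_size) // 1024) + 2


# Example: a 32 MiB flash part (32768 KiB) would yield a limit of 34.
print(body_limit_mb("32768"))
```

Under those assumptions, the limit simply tracks the flash size of whichever machine is being built, so the bbappend does not need a per-board value.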

After bisecting the login-failure problem, it seems to have been introduced by bmcweb commit 25b54dba775b31021a3a4677eb79e9771bcb97f7.

zevweiss avatar May 08 '24 16:05 zevweiss

...and I can reproduce it on current openbmc master (commit c9e483ca4eb67ac212b764f1f7dec8588af72f19) building evb-ast2500 and booting it in qemu:

$ qemu-system-arm -M ast2500-evb \
    -drive file=obmc-phosphor-image-evb-ast2500.static.mtd,format=raw,if=mtd \
    -nographic -serial mon:stdio -nic user,hostfwd=tcp::6443-:443,model=ftgmac100

After rolling bmcweb back to commit aca174983be5a0d2af08044dd93487908ae6cfe5 (the commit before 25b54dba775b31021a3a4677eb79e9771bcb97f7) I can log in to the web UI.

zevweiss avatar May 08 '24 21:05 zevweiss

Oh, and after seeing it go by in #gh-issues on discord, looks like https://github.com/openbmc/webui-vue/issues/116 is reporting the same problem.

zevweiss avatar May 08 '24 21:05 zevweiss

Please try: https://gerrit.openbmc.org/c/openbmc/bmcweb/+/71309

edtanous avatar May 08 '24 23:05 edtanous

Alright, patch 71309 does appear to resolve the webui login failure problem, thanks -- and with that bmcweb and the latest webui-vue (commit dfba4e542e8167) I haven't been able to reproduce the original problem with pages sometimes not loading. Though I now seem to have neither client-side polling nor events sent from the server notifying the web UI of host power state changes AFAICT (power state shown in the page header gets stale and doesn't update until I do a full page reload). Is that currently expected?

zevweiss avatar May 09 '24 00:05 zevweiss

> Is that currently expected?

Yes. The websocket had to be removed because it was crashing (also, because it relies on the deprecated /subscribe dbus API, nobody really wanted to look into why it was crashing). Hopefully someone has the willpower to go implement SSE, or polling, but right now we're just left with the power state not updating live.

edtanous avatar May 09 '24 00:05 edtanous

Ah, alright -- for some reason I had thought that polling had already been implemented in webui-vue, but I guess not yet.

zevweiss avatar May 09 '24 03:05 zevweiss
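
For readers wondering what the polling approach mentioned above might look like, here is a minimal sketch of the idea, written in Python rather than as webui-vue code. It is not an implementation from either project. It assumes the host power state is readable as the `PowerState` property of `/redfish/v1/Systems/system` and that a Redfish session token is already available; `watch_power_state` and `fetch_redfish_power_state` are hypothetical names.

```python
import json
import ssl
import time
import urllib.request
from typing import Callable, Iterator, Optional


def fetch_redfish_power_state(host: str, token: str,
                              context: Optional[ssl.SSLContext] = None) -> str:
    """Read PowerState from the (assumed) system resource over Redfish."""
    req = urllib.request.Request(
        f"https://{host}/redfish/v1/Systems/system",
        headers={"X-Auth-Token": token},
    )
    with urllib.request.urlopen(req, context=context, timeout=10) as resp:
        return json.load(resp)["PowerState"]


def watch_power_state(fetch: Callable[[], str],
                      interval_s: float = 5.0,
                      max_polls: Optional[int] = None,
                      sleep: Callable[[float], None] = time.sleep) -> Iterator[str]:
    """Poll fetch() and yield the state only when it differs from the last one seen."""
    last = None
    polls = 0
    while max_polls is None or polls < max_polls:
        state = fetch()
        if state != last:
            yield state  # a UI would update its header here
            last = state
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval_s)
```

A caller would pass something like `lambda: fetch_redfish_power_state("localhost:6443", token)` as `fetch`. The trade-off against SSE is that polling costs one request per interval and can lag a state change by up to `interval_s`, but it needs no server-side support beyond the existing Redfish GET.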