UI: network CRUD on localhost produces an error (but succeeds) in Chrome

Open edlerd opened this issue 2 years ago • 12 comments

  • Distribution: 5.19
  • Distribution version: ubuntu snap

Issue description

This is reproducible in Chrome when talking to LXD on localhost directly via https://localhost:8443/ui/. It does not happen in Firefox or when talking to an LXD instance on another host.

Any change to a network device via the UI (create, update, delete) succeeds, but Chrome cancels the network request to create (or update, delete) and the UI reports an error: Failed to fetch. The dev console shows an error for that request: (failed) net::ERR_NETWORK_CHANGED.

There is a workaround in the UI with PR https://github.com/canonical/lxd-ui/pull/512 catching the error and checking if the requested network change was successful with a manual follow-up request. Ideally, the workaround would not be needed.
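
The idea behind the workaround can be sketched as follows. This is a minimal illustration, not the code from the PR: the names `NetworkError`, `apply_network_change`, `send_change` and `change_landed` are hypothetical, and the two callables stand in for the original request and the follow-up check.

```python
class NetworkError(Exception):
    """Hypothetical stand-in for the browser's 'Failed to fetch' rejection."""


def apply_network_change(send_change, change_landed):
    """Send a network-modifying request and tolerate a cancelled response.

    send_change:   callable that performs the create/update/delete request
                   and raises NetworkError if the transport fails.
    change_landed: callable that performs a follow-up request and returns
                   True if the server state reflects the requested change.
    """
    try:
        send_change()
        return True
    except NetworkError:
        # The browser cancelled the request, but the server may still have
        # applied the change. Only report the error if the follow-up check
        # shows that the change did not happen.
        if change_landed():
            return True
        raise
```

The follow-up check is only made when the original request fails, so the normal path costs no extra request.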

Steps to reproduce

  1. Set up LXD with the UI on localhost
  2. Open the LXD-UI for the LXD backend on localhost in Chrome: https://localhost:8443/ui/
  3. Open the dev console (F12)
  4. Go to Networks > Create network, enter a name and click "Create"
  5. You get an error Network creation failed / Failed to fetch.
  6. Investigate the request in the dev console.

edlerd avatar Oct 27 '23 10:10 edlerd

Is this in the wrong repo?

tomponline avatar Oct 27 '23 10:10 tomponline

The linked workaround is invalid by the looks of it.

tomponline avatar Oct 27 '23 10:10 tomponline

@tomponline we were chatting and I recommended creating it here. I'll have a look.

roosterfish avatar Oct 27 '23 10:10 roosterfish

I'm not really following the issue description. How is it an LXD issue?

tomponline avatar Oct 27 '23 10:10 tomponline

Linked the wrong PR initially, this is the right one: https://github.com/canonical/lxd-ui/pull/512

It seems to me that some behaviour of LXD during the network change is triggering Chrome to cancel running requests. Maybe there is a way for LXD to avoid triggering this behaviour in Chrome. Hence I filed the issue here in this repo.

edlerd avatar Oct 27 '23 10:10 edlerd

I can confirm that this issue does not appear if you don't use the local LXD server (tested with a nested LXD in a container). However, when targeting the local LXD, the issue also appears if you bind it to various local IPs other than localhost (it doesn't matter whether IPv4 or IPv6).

It's still unclear what is causing this.

roosterfish avatar Oct 27 '23 16:10 roosterfish

@simondeziel would you be interested in looking into this?

tomponline avatar Oct 29 '23 12:10 tomponline

Maybe also related to https://github.com/canonical/lxd/issues/12482

tomponline avatar Oct 31 '23 20:10 tomponline

@edlerd what OS version are you using with Chrome btw?

tomponline avatar Oct 31 '23 20:10 tomponline

> @edlerd what OS version are you using with Chrome btw?

OS: Ubuntu 22.04.3 LTS
Chrome: Version 118.0.5993.117 (Official Build) (64-bit)

edlerd avatar Oct 31 '23 20:10 edlerd

@MggMuggins I wonder, if you have a chance, could you see if you can reproduce this?

@edlerd or is this no longer an issue?

tomponline avatar Oct 21 '24 15:10 tomponline

I can still reproduce it today:

[image]

The priority is not so high, though, as we have a workaround in the UI: for any network-modifying request, if the original request fails, we check with a follow-up API call whether the change was successful. It would be nice to have this resolved and be able to remove that workaround.

edlerd avatar Oct 21 '24 19:10 edlerd

I've reproduced this in Chromium with the bridge driver, and done some reading and a few experiments. When creating/updating a network, the bridge driver with default settings does the following:

  • ip link add type bridge name lxdbr2 ...
  • flush the routing table for the device
  • create firewall rules

I wrote a little program which does these things over an HTTPS endpoint, and Chromium does not fail to complete the request.

I also took a brief look at the Chromium source code and there are a number of places where the error can be returned; it's not clear to me what is being detected from the kernel.

MggMuggins avatar Oct 24 '24 21:10 MggMuggins

@MggMuggins @edlerd is there anything in Chrome's log files that may hint as to what it is taking umbrage at?

See https://support.google.com/chrome/thread/138398422/frequent-err-network-changed-errors?hl=en

tomponline avatar Oct 25 '24 07:10 tomponline

Logs show this:

[13901:13918:1025/091227.724438:VERBOSE1:logging_network_change_observer.cc(90)] Observed a change to the network IP addresses

I'm no C++ programmer so I'm still not making a lot of sense out of the Chrome source, but there were enough crumbs. I was able to generate an ERR_NETWORK_CHANGED by satisfying the following conditions:

  • HTTPS request
  • adding an IP to an iface
  • long running request: if the request completes too quickly then it will finish before Chrome detects the network change

Performing the request over HTTP does not generate the error. I've updated my gist with example code.
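
The three conditions above can be sketched as a small HTTPS server. This is an illustrative sketch and not the code from the gist: the function names, the address `192.0.2.10/24`, the interface name, the port and the delay are all assumptions, and `ip addr add` needs root privileges and an existing interface.

```python
import ssl
import subprocess
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


def ip_addr_add_cmd(address, iface):
    """Build the iproute2 command that adds `address` to `iface`."""
    return ["ip", "addr", "add", address, "dev", iface]


def make_handler(address, iface, delay_seconds):
    """Create a handler that changes the host's IPs while a request is open."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Condition 2: add an IP to an interface during the request.
            # check=False: the command fails if the address already exists.
            subprocess.run(ip_addr_add_cmd(address, iface), check=False)
            # Condition 3: keep the request open long enough for the
            # browser to notice the address change before the response.
            time.sleep(delay_seconds)
            body = b"done\n"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return Handler


def serve(certfile, keyfile, address="192.0.2.10/24", iface="lxdbr2",
          port=8444, delay_seconds=5.0):
    """Condition 1: serve the endpoint over HTTPS (assumed cert/key paths)."""
    handler = make_handler(address, iface, delay_seconds)
    httpd = HTTPServer(("127.0.0.1", port), handler)
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(certfile, keyfile)
    httpd.socket = ctx.wrap_socket(httpd.socket, server_side=True)
    httpd.serve_forever()
```

Calling `serve("cert.pem", "key.pem")` as root and sending a POST from a page loaded in Chrome would be one way to try this; the address has to be removed again (`ip addr del`) before a second request changes anything.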

Adjusting IP addresses on interfaces is essential for a bunch of network operations. Completing requests faster than Chrome detects the changed IP isn't viable, and I don't see any other way around this. @edlerd is it alright if we keep the UI workaround and close this?

MggMuggins avatar Oct 25 '24 22:10 MggMuggins

If there is nothing we can do to improve, we should keep the current workaround, yes.

edlerd avatar Oct 27 '24 08:10 edlerd