OpenM++ Connect Issues

Open YannCoderre opened this issue 3 years ago • 2 comments

Every 5-10 minutes, the screen jumps back to the “Connect” page and the user has to click “Connect” to return to OpenM++. It’s not a big issue, since it doesn’t require entering login information and all that is needed is a click, but it may be something that could be fixed.

YannCoderre avatar Oct 11 '22 17:10 YannCoderre

@YannCoderre and @chuckbelisle, this morning at 10:30 (ish) I was connected to AAW using a VM in the OncoSim namespace. I was using the OpenM UI to look at parameter values in OncoSim (not doing any runs). The connection closed suddenly and took me back to the VNC connect screen.

rochellegarner avatar Oct 12 '22 15:10 rochellegarner

@YannCoderre and @chuckbelisle, the disconnect happened again (11:20 am) under circumstances similar to those described in my previous comment.

rochellegarner avatar Oct 12 '22 15:10 rochellegarner

Scheduling this to be looked at for our next sprint. Details compiled below.

Namespace: OncoSim
Notebook/server:

Steps to reproduce the behavior:

  1. Connect to the remote desktop server.
  2. Use the OpenM UI.
  3. Look at parameter values in OncoSim (not doing any runs).

The noVNC session suddenly disconnected and the user was required to reconnect.

chuckbelisle avatar Oct 26 '22 15:10 chuckbelisle

noVNC debug log provided by @rochellegarner:

Fri Oct 28 19:16:57 2022
Connections: accepted: /tmp/vnc-socket/vnc-468KTN.sock
SConnection: Client needs protocol version 3.8
SConnection: Client requests security type None(1)
VNCSConnST:  Server default pixel format depth 24 (32bpp) little-endian rgb888
VNCSConnST:  Client pixel format depth 24 (32bpp) little-endian bgr888
Warning: 'sandbox' is not in the list of known options, but still passed to Electron/Chromium.
[main 2022-10-28T19:18:30.891Z] update#setState idle
[8934:1028/191830.962399:ERROR:buffer_manager.cc(488)] [.DisplayCompositor]GL ERROR :GL_INVALID_OPERATION : glBufferData: <- error from previous GL command
(node:8989) Electron: Loading non context-aware native modules in the renderer process is deprecated and will stop working at some point in the future, please see https://github.com/electron/electron/issues/18397 for more information
(node:8989) Electron: Loading non context-aware native modules in the renderer process is deprecated and will stop working at some point in the future, please see https://github.com/electron/electron/issues/18397 for more information
(node:8989) Electron: Loading non context-aware native modules in the renderer process is deprecated and will stop working at some point in the future, please see https://github.com/electron/electron/issues/18397 for more information
(node:8989) Electron: Loading non context-aware native modules in the renderer process is deprecated and will stop working at some point in the future, please see https://github.com/electron/electron/issues/18397 for more information
(node:8989) Electron: Loading non context-aware native modules in the renderer process is deprecated and will stop working at some point in the future, please see https://github.com/electron/electron/issues/18397 for more information
(node:8989) Electron: Loading non context-aware native modules in the renderer process is deprecated and will stop working at some point in the future, please see https://github.com/electron/electron/issues/18397 for more information
Unable to revert mtime: /usr/local/share/fonts
[8934:1028/191833.932069:ERROR:buffer_manager.cc(488)] [.DisplayCompositor]GL ERROR :GL_INVALID_OPERATION : glBufferData: <- error from previous GL command
(node:9050) Electron: Loading non context-aware native modules in the renderer process is deprecated and will stop working at some point in the future, please see https://github.com/electron/electron/issues/18397 for more information
(node:9050) Electron: Loading non context-aware native modules in the renderer process is deprecated and will stop working at some point in the future, please see https://github.com/electron/electron/issues/18397 for more information
[main 2022-10-28T19:19:00.901Z] update#setState checking for updates
[main 2022-10-28T19:19:00.906Z] update#setState available for download

Additional error:

ERROR:gles2_cmd_decoder.cc(4927)] [.RendererMainThread-0x2bf372313a00]GL ERROR :GL_INVALID_FRAMEBUFFER_OPERATION : glClear: framebuffer incomplete
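
Logs like the one above are easier to scan once colour codes are stripped and runs of identical lines are collapsed. The following is only a minimal sketch of that idea, not something used in this investigation; the function name `condense_log` and the sample lines are hypothetical.

```python
import re
from itertools import groupby

# Matches ANSI colour/style escape sequences such as ESC[90m and ESC[0m.
ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")

def condense_log(lines):
    """Strip ANSI colour codes and collapse consecutive identical lines."""
    cleaned = (ANSI_RE.sub("", line.rstrip("\n")) for line in lines)
    condensed = []
    for text, group in groupby(cleaned):
        count = sum(1 for _ in group)
        # Annotate runs instead of repeating the same line many times.
        condensed.append(text if count == 1 else f"{text}  (repeated {count}x)")
    return condensed

# Hypothetical sample input, shortened from the kind of lines seen above.
sample = [
    "\x1b[90m[main 2022-10-28T19:18:30.891Z]\x1b[0m update#setState idle",
    "(node:8989) Electron: deprecation warning",
    "(node:8989) Electron: deprecation warning",
]
for line in condense_log(sample):
    print(line)
```

Collapsing only consecutive duplicates keeps the original ordering of events, which matters when correlating log lines with the time of a disconnect.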

vexingly avatar Oct 28 '22 19:10 vexingly

Waiting to test on updated cluster with newest images...

vexingly avatar Nov 08 '22 16:11 vexingly

The new image / Kubeflow 1.6 does not appear to have resolved this issue. Now that the cluster has stabilized after the upgrade, we will attempt to collect more data on the issue!

vexingly avatar Nov 16 '22 15:11 vexingly

@vexingly, is this issue planned to be worked on this sprint, and what are the next steps?

chuckbelisle avatar Nov 30 '22 15:11 chuckbelisle

@chuckbelisle, I know that @vexingly and I have been doing some testing. If there are additional things I should test, @vexingly, please let me know.

rochellegarner avatar Dec 05 '22 15:12 rochellegarner

Hi @rochellegarner, we have submitted an issue to the cloud team regarding an unstable component of our cluster's networking and are currently waiting for feedback on this. We're not sure if this is responsible for the disconnection issues, but it cannot hurt to increase the stability of the network.

vexingly avatar Dec 05 '22 16:12 vexingly

@rochellegarner we made some progress on the stability yesterday afternoon, although the cloud team is still looking at high resource usage within the networking in the cluster...

Could you let me know if there is any change in the disconnections today, better or worse? If it is possible to keep track of how frequent the disconnections are (e.g. every 4-5 minutes), that may help identify incremental improvements in the stability. Thanks!
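
If it helps, one simple way to track the frequency is to note the time of each disconnect and compute the gaps between them. This is just a rough sketch under that assumption; the function name `disconnect_intervals` and the timestamps are hypothetical examples, not real observations.

```python
from datetime import datetime
from statistics import mean

def disconnect_intervals(timestamps):
    """Return the minutes between consecutive disconnects.

    `timestamps` is a list of ISO 8601 strings noted by hand each time
    the session drops back to the connect screen.
    """
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    return [(later - earlier).total_seconds() / 60
            for earlier, later in zip(times, times[1:])]

# Hypothetical example timestamps, for illustration only.
observed = [
    "2022-12-08T10:30:00",
    "2022-12-08T10:36:00",
    "2022-12-08T10:44:00",
]
gaps = disconnect_intervals(observed)
print("Minutes between disconnects:", gaps)
print("Average gap (minutes):", mean(gaps))
```

Comparing the average gap from one day to the next would show whether the stability is improving, even if the disconnects have not stopped entirely.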

vexingly avatar Dec 08 '22 16:12 vexingly

Thanks @vexingly. My preliminary testing seems to indicate that the disconnect issue is better. I was working in the environment for a while, and didn't get disconnected. Yay!!! I will ask our colleagues at CPAC to report back to us if things have improved.

rochellegarner avatar Dec 12 '22 17:12 rochellegarner