3.1.9 seems to break the Shell app due to CSRF token issue
After updating from 3.1.7 to 3.1.9, the shell app immediately crashes with the message:
Your connection to the remote server has been terminated.
Looking at the logs I found
App 4042244 output: Listening on 3000
App 4042244 output: Connection established
App 4042244 output: /var/www/ood/apps/sys/shell/app.js:161
App 4042244 output: token = req.url.match(/csrf=([^&]*)/)[1];
App 4042244 output: ^
App 4042244 output:
App 4042244 output: TypeError: Cannot read properties of null (reading '1')
App 4042244 output: at WebSocketServer.connection (/var/www/ood/apps/sys/shell/app.js:161:40)
App 4042244 output: at WebSocketServer.emit (node:events:517:28)
App 4042244 output: at done (/var/www/ood/apps/sys/shell/app.js:233:9)
App 4042244 output: at WebSocketServer.completeUpgrade (/var/www/ood/apps/sys/shell/node_modules/ws/lib/websocket-server.js:435:5)
App 4042244 output: at WebSocketServer.handleUpgrade (/var/www/ood/apps/sys/shell/node_modules/ws/lib/websocket-server.js:343:10)
App 4042244 output: at Server.upgrade (/var/www/ood/apps/sys/shell/app.js:232:7)
App 4042244 output: at Server.emit (node:events:517:28)
App 4042244 output: at onParserExecuteCommon (node:_http_server:939:14)
App 4042244 output: at onParserExecute (node:_http_server:825:3)
I made it print out req.url and it's the bare URL to the shell app without a CSRF token.
3.1.7 does not have this issue since downgrading fixes it.
I can't replicate. Note that it's the javascript making this network call with a CSRF token, not the original URL you put in your browser.
Here's an image of my console tab for the same. You'll note that the first request is what you enter in the URL (and the response is an html page), then later the browser makes another request to basically the same URL only with wss protocol instead of http.
Didn't mean to close the ticket, only comment on it. Can you show a similar network tab?
I double checked the diff between the two versions this morning and can't find/see how the ping pong change changed/broke functionality to that point.
I'd also ask if you can try 3.1.9 with a private window or after clearing your cache to see if that's any different.
https://github.com/OSC/ondemand/compare/v3.1.7...v3.1.9
Given the changelog, I'd pin this on updates we had to make to our lua code to support the latest version of httpd (or apache2 as the case may be). What apache2/httpd version are you using?
Here's a screenshot - I do see both the original request and the follow-up wss request.
I'm seeing the same behavior with incognito after clearing cache and doing a hard reload.
We are using httpd 2.4.37 from RHEL 8. I'm viewing it with Chrome on Windows 10 22H2.
Quite a few reds and oranges in the console...
Edit: The websocket error is the only one that doesn't show up in 3.1.7.
Detail:
Unfortunately I can't replicate on Rocky 8.8. Even 8.8 has httpd-2.4.37-65. I'll try to track down a system that has httpd-2.4.37-56, but I'm guessing the fixes we put in place for 2.4.62 somehow broke httpd-2.4.37-56 (assuming that's the version you're on).
I'll also have to look into the difference between those 56 and 65 releases.
Yes, I have 2.4.37-56. I'll also try to find something different to test on.
Is it possible to update httpd? I'm unable to get an OS with that version. Everything on appstream has been updated.
I assume you're on some EL/8.8 variant?
@johrstrom You can enable the Vault for Rocky to get older packages, ie: https://dl.rockylinux.org/vault/rocky/8.8/AppStream/x86_64/os/Packages/h/. Has https://dl.rockylinux.org/vault/rocky/8.8/AppStream/x86_64/os/Packages/h/httpd-2.4.37-56.module%2Bel8.8.0%2B1456%2Bd0a01c5e.7.x86_64.rpm.
You can inject this URL into something like /etc/yum.repos.d/Rocky-AppStream.repo and comment out mirrorlist and set the Vault URL at https://dl.rockylinux.org/vault/rocky/8.8/AppStream/x86_64/os/ to the value for baseurl.
@johrstrom You can enable the Vault for Rocky to get older packages
Thanks, I'll take a look at it.
OK I can replicate if I downgrade back to 2.4.37-56. It is infact coming from the lua updates we made to support 2.4.62...
But that brings another question - If I have to install the version from the vault should we even support it? Is there a way to safely require httpd >= 2.4.37-56? The patch numbers could vary from distro to distro in theory, but do they in practice?
Latest versions for EL8:
Rocky 8 - 2.4.37-65 Alma 8 - 2.4.37-65 RHEL 8 - 2.4.37-65
The checksum and other parts of the appstream release do differ but I think for RHEL8 (and 9 most likely) we could require a minimum release based on what works with OnDemand.
Where this could run into issues is folks on RHEL 8 can pin to like RHEL 8.8 and if we were to require something from RHEL 8.10, that would break for them but also if that requirement is based on what works for OnDemand, they have no choice but to upgrade.
If 2.4.37-56 is the issue I think we'd have to say:
Requires: > 2.4.37-56 < 2.5
https://github.com/OSC/ondemand-packaging/blob/f777543e6818c47063ab1c1f4287493b32b634cb/packages/ondemand-runtime/rpm/ondemand-runtime.spec#L108
Where this could run into issues is folks on RHEL 8 can pin to like RHEL 8.8 and if we were to require something from RHEL 8.10
Surprisingly enough, dnf update updates from 8.8 to 8.10. But getting a 8.8 container this morning without running dnf update installed httpd-2.4.37-65 from appstream even though it was an 8.10 module.
How can I test pinning the OS to 8.8?
I found these 2 advisories, so I'm a bit convinced we should just require that -65 patch version. I can't imagine someone saying yes I need pin httpd to this version with known security advisories against it - and if so, then they can just stay on 3.1.7?
https://access.redhat.com/errata/RHSA-2024:5193 https://access.redhat.com/errata/RHSA-2024:4197
Sounds like a plan. I'll update to RHEL 8.10.
Sounds like a plan. I'll update to RHEL 8.10.
You don't need to fully upgrade to 8.10. Just an update to httpd should suffice.
We have a mixture of RHEL 8.8 EUS, RHEL 8.10 and RHEL/Rocky 9.4; this would be a good reason to move things off 8.8 EUS anyway :)
How can I test pinning the OS to 8.8?
This is done with subscription manager on RHEL systems as RHEL provides minor release repos. The only way to stay on a minor release on Rocky that is not the latest minor release would be to start with 8.8 container then replace all repos with 8.8 Vault URLs.
On RHEL you'd do
sudo subscription-manager release --set=8.8
Which would update the repo URLs in /etc/yum.repos.d/redhat.repo. I don't think this really works with Rocky/Alma since subscription-manager is designed to talk to RedHat systems about the system's valid subscription with RedHat.
I see. Yea then a locked RHEL 8.8 system won't support 3.1.9. Hrrrrm. OK that may just be the best we can do. I can't figure out how to get the apache version in lua without shelling out a rpm -q httpd query and parsing it or similar which seems very complicated and moreover would slow down basically every request. So much so that it seems untenable.
get the apache version in lua without shelling out a
rpm -q httpdquery and parsing it
Ya that sounds not great. The release version only exists inside the RPM metadata, I don't think Apache itself knows its 2.4.x-y it only knows it's 2.4.x.
I'm not sure if it matters, but I wanted to note that I hit this bug on a Rocky 9.4 host with httpd-2.4.57-8.el9.x86_64. We have a development host also running Rocky 9.4 that updated OnDemand at the same time, but it has httpd-2.4.57-11.el9_4.1.x86_64 and did not exhibit the problem. We fixed it by downgrading the OnDemand package since that upgrade was inadvertent, but just want to make sure you knew that this problem existed on Rocky/RHEL 9 as well as 8.
Rocky 9.4 host with httpd-2.4.57-8.el9.x86_64
It certainly does matter because that means specifying Requires: > 2.4.37-56 < 2.5 isn't a silver bullet I thought it may be.
It certainly does matter because that means specifying
Requires: > 2.4.37-56 < 2.5isn't a silver bullet I thought it may be.
We can have RHEL8 and RHEL9 specific requirements. We already do that for things like NodeJS because of different versions available:
https://github.com/OSC/ondemand-packaging/blob/8abf784f09b65b1392651fd0694155d62b8d0e29/packages/ondemand-runtime/rpm/ondemand-runtime.spec#L88-L96
My guess is whatever patch RedHat backported, it was applied to 2.4.37 for RHEL8 and 2.4.57 for RHEL9 as RHEL tends to pin their software versions though less so with AppStream
So we could do
%if 0%{?rhel} == 8
Requires: %{apache} > 2.4.37-56, %{apache} < 2.5
...
%endif
%if 0%{?rhel} == 9
Requires: %{apache} > 2.4.57-8, %{apache} < 2.5
...
%endif
I am having this issue, with the following installation:
httpd-2.4.37-65.module+el8.10.0+1830+22f0c9e0.x86_64
$ cat /etc/redhat-release Rocky Linux release 8.10 (Green Obsidian)
App 2203442 output: /var/www/ood/apps/sys/shell/app.js:161 App 2203442 output: token = req.url.match(/csrf=([^&]*)/)[1]; App 2203442 output: ^ App 2203442 output: App 2203442 output: TypeError: Cannot read properties of null (reading '1') App 2203442 output: at WebSocketServer.connection (/var/www/ood/apps/sys/shell/app.js:161:40) App 2203442 output: at WebSocketServer.emit (node:events:517:28) App 2203442 output: at done (/var/www/ood/apps/sys/shell/app.js:233:9) App 2203442 output: at WebSocketServer.completeUpgrade (/var/www/ood/apps/sys/shell/node_modules/ws/lib/websocket-server.js:435:5) App 2203442 output: at WebSocketServer.handleUpgrade (/var/www/ood/apps/sys/shell/node_modules/ws/lib/websocket-server.js:343:10) App 2203442 output: at Server.upgrade (/var/www/ood/apps/sys/shell/app.js:232:7) App 2203442 output: at Server.emit (node:events:517:28) App 2203442 output: at onParserExecuteCommon (node:_http_server:939:14) App 2203442 output: at onParserExecute (node:_http_server:825:3) App 2203442 output: App 2203442 output: Node.js v18.20.2
From the browser it looks fine:
token = window.urlWithToken().match(/csrf=([^&]*)/)[1]; 'AVPiWh3i-OEzDDpVlDgafakedatafakedata' window.urlWithToken() 'wss://bunya-ondemand.rcc.uq.edu.au/pun/sys/shell/ssh/bunya.rcc.uq.edu.au?csrf=AVPiWh3i-OEzDDpVlDgafakedatafakedata'
So, potentially, something is preventing the token from being passed back to nodejs or otherwise mangling that URL in such a way that it is not being detected correctly.
So, potentially, something is preventing the token from being passed back to nodejs or otherwise mangling that URL in such a way that it is not being detected correctly.
The update to apache should have resolved this... Did you bounce apache after updating to ensure that the new version is running? That's the only thing I can think of. I know for sure this issue is caused by the 3.1.9 update using a version of apache that's lower than httpd-2.4.37-65 (what you've reported to have).
Yes, and I just did a systemctl restart again to be on the safe side.
11:40 sbw@bunya-ondemand:~$ rpm -qa | grep httpd httpd-filesystem-2.4.37-65.module+el8.10.0+1830+22f0c9e0.noarch httpd-devel-2.4.37-65.module+el8.10.0+1830+22f0c9e0.x86_64 httpd-2.4.37-65.module+el8.10.0+1830+22f0c9e0.x86_64 httpd-tools-2.4.37-65.module+el8.10.0+1830+22f0c9e0.x86_64 rocky-logos-httpd-86.3-1.el8.noarch
11:41 sbw@bunya-ondemand:~$ sudo /usr/sbin/httpd -v Server version: Apache/2.4.37 (Rocky Linux) Server built: Jul 1 2024 14:01:13
I'm so sorry, but I can't replicate our systems or in a container with Rocky 8. httpd-2.4.37-65 (at least in a container when I test) should work well for you.
The updates we made (that apparently broke older httpds) were to support higher versions of httpd. So we can't really go back on them/revert them. I'm not really sure what to do because it really should be working for you, I'm just not sure why. You haven't edited those lua files have you? Just trying to cover what else could be the issue.