qlcplus icon indicating copy to clipboard operation
qlcplus copied to clipboard

A crash when using the Web interface

Open awwright opened this issue 2 years ago • 15 comments

I was playing around with the Web interface and after pressing lots of buttons, qlcplus crashed (SIGABRT from a failed assertion). It appears to be random, but I was able to eventually reproduce the crash inside gdb:

[Thread 0x7fffeb7fe640 (LWP 45970) exited]
ASSERT: "uint(i) < uint(size())" in file /usr/include/x86_64-linux-gnu/qt5/QtCore/qbytearray.h, line 500

Thread 1 "qlcplus" received signal SIGABRT, Aborted.
__pthread_kill_implementation (no_tid=0, signo=6, threadid=140737260009536) at pthread_kill.c:44
44	pthread_kill.c: No such file or directory.
(gdb) backtrace
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737260009536) at pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140737260009536) at pthread_kill.c:80
#2  __GI___pthread_kill (threadid=140737260009536, signo=signo@entry=6) at pthread_kill.c:91
#3  0x00007ffff65fc476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007ffff65e27b7 in __GI_abort () at abort.c:79
#5  0x00007ffff6aa3ba3 in  () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#6  0x00007ffff6aa2ff0 in qt_assert_x(char const*, char const*, char const*, int) () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#7  0x00007ffff7e28044 in QByteArray::at(int) const (this=0x7fffffffdf90, i=2) at /usr/include/x86_64-linux-gnu/qt5/QtCore/qbytearray.h:500
#8  0x00007ffff7677a83 in QHttpConnection::webSocketRead(QByteArray) (this=0x7fff18003360, data=...) at qhttpserver/qhttpconnection.cpp:440
#9  0x00007ffff7676419 in QHttpConnection::parseRequest() (this=0x7fff18003360) at qhttpserver/qhttpconnection.cpp:129
#10 0x00007ffff76a2f7f in QHttpConnection::qt_static_metacall(QObject*, QMetaObject::Call, int, void**)
    (_o=0x7fff18003360, _c=QMetaObject::InvokeMetaMethod, _id=4, _a=0x7fffffffe0c0) at moc_qhttpconnection.cpp:118
#11 0x00007ffff6d03a88 in  () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#12 0x00007ffff576502f in  () at /lib/x86_64-linux-gnu/libQt5Network.so.5
#13 0x00007ffff5778359 in  () at /lib/x86_64-linux-gnu/libQt5Network.so.5
#14 0x00007ffff70dc6b3 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /lib/x86_64-linux-gnu/libQt5Widgets.so.5
#15 0x00007ffff6ccc16a in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#16 0x00007ffff6d26155 in  () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#17 0x00007ffff4f1a8bb in g_main_context_dispatch () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#18 0x00007ffff4f6df08 in  () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#19 0x00007ffff4f18003 in g_main_context_iteration () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#20 0x00007ffff6d25548 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#21 0x00007ffff6ccaa9b in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#22 0x00007ffff6cd3024 in QCoreApplication::exec() () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#23 0x0000555555559fc8 in main(int, char**) (argc=6, argv=0x7fffffffe718) at main.cpp:400

System is Ubuntu 21.10, Linux kernel 5.13.0-22-generic x86_64, Qt 5.15.2

The line of code in question appears to be: https://github.com/mcallegari/qlcplus/blob/66069180c4d60d415114849077ffbf28613b212b/webaccess/src/qhttpserver/qhttpconnection.cpp#L440

awwright avatar Dec 27 '21 10:12 awwright

Hi, can you please share the project file you're using to reproduce the crash? It's clearly a byte array segfault, but I need to reproduce the exact condition to understand the original cause. Thanks

mcallegari avatar Jan 12 '22 07:01 mcallegari

I've tried building out a new showfile and I've had difficulty reproducing it, I got it to crash once from an iPad, just a single scene, single fixture, single button. I suspect it can be easily reproduced by splitting up the payloads of the TCP stream into one packet per byte, instead of having an entire event payload available at the same time (in the same packet). I'm trying to figure out if there's an easy way to fragment a TCP stream like this.

awwright avatar Mar 02 '22 01:03 awwright

It seems to be a problem when my wireless signal drops too low—sitting right next to my router I cannot cause any error, but as I mash buttons and walk farther away, I get the same error. Maybe it's packet corruption that's undetected? Or some sort of fragmentation?

I'll upload the show file when I get home, but it was just 10 or so identical scenes with a single fixture, and an array of buttons so I could mash them on my iPad. The browser is Safari on an iPad Pro (12.9-inch) (5th generation), iPad OS 15.4.1.

awwright avatar May 18 '22 21:05 awwright

This crash reliably occurs when the browser sends a "pong" packet over Wi-Fi. It doesn't occur over wired, or with text frames. I can get it to occur from Firefox and Safari on macOS.

awwright avatar May 20 '22 00:05 awwright

The WebSocket pong frame is coming in over two TCP packets, e.g.

< 8a80
< 230f54cd

Perhaps the pong implementation is making two separate unbuffered writes. I'm not sure why this isn't a problem with all pong packets (this usually works, even over Wi-Fi). In any event, the QLC+ server doesn't have a buffer to deal with fragmented packets. A fix like #1335 works here.

awwright avatar May 20 '22 03:05 awwright

Interesting finding. The websocket implementation is written by me, following the RFC...so it's possible there are still things to fix. I'll have a look at the code too

mcallegari avatar May 20 '22 08:05 mcallegari

Your websocket implementation works very well, it's just expecting the entire frame to be available at once... another workaround is that it returns how many bytes were read, and if a full packet is not detected, it returns 0. But I'm not sure the best way to do that in the QT sockets library... if there were a peekAll function, we could use that, then call skip() on the number of bytes read.

I went with the state machine approach because some other WebSocket implementations seem to use the state machine method, and the WebSocket protocol seems optimized for that.

awwright avatar May 20 '22 12:05 awwright

@mcallegari Why not use https://doc.qt.io/qt-5/qwebsocket.html?

And what do you think of pulling in a test suite like https://github.com/crossbario/autobahn-testsuite?

awwright avatar May 20 '22 14:05 awwright

I did evaluate QtWebsocket but over the years my experience with Qt has been a few highs and a lot of lows. So I tend to keep the external dependencies as lower as possible even though this means more development effort (but peaceful nights...)

mcallegari avatar May 26 '22 08:05 mcallegari

I made a comparison of the two PRs with the current master, qlc+-websocket-parsing.pdf

The tests that are missing are either because they would crash QLC+, or because it crashed during the test.

  • Also I removed length limits for the test, which is why 1.1.6-1.1.8 pass in this run.

awwright avatar May 30 '22 02:05 awwright

Just want to add that I also experience crashes when using the web interface over wifi.

BlackYps avatar Aug 30 '23 15:08 BlackYps