trafficserver
trafficserver copied to clipboard
In event processing, assert that mutex is unlocked before freeing mutex.
The lifetime of mutex used for TS continuations is controlled by the number of referring continuations, it must not be destroyed with TSMutextDestroy().
Also has some very minor cleanup of Ptr.h.
I think this looks good, has it been tested in prod?
There's problems with it, I'm working on a fix.
I think this looks good, has it been tested in prod?
Yes, now it has.
This fixed a crash Yahoo was seeing in txn_box. txn_box was destroying a continuation with a mutex in the continuation function (so the mutex was locked). Continuations have a (Ptr) reference to their mutex, and the continuation's reference was the only one to its mutex. So the mutex got deleted while it was locked.
@SolidWallOfCode says he will change this code to get the mutex from the continuation, and unlock it, before destroying the continuation. So it boils down to whether we want to idiot proof the mutex destroy, so it will unlock if necessary before destruction.
Example stack dump from crash. The line numbers will be off, it's Yahoo internal TS.
[ 00 ] libc-2.17.so raise ( raise.c:55 )
[ 01 ] libc-2.17.so abort ( abort.c:90 )
[ 02 ] libtscore.so.9.1.10 ink_abort ( ink_error.cc:99 )
[ 03 ] libtscore.so.9.1.10 ink_mutex_destroy ( ink_mutex.cc:72 )
[ 04 ] traffic_server free ( I_Lock.h:595 )
[ 05 ] traffic_server operator= ( Ptr.h:241 )
[ 06 ] traffic_server free_event ( P_UnixEThread.h:251 )
[ 07 ] traffic_server EThread::process_queue(Queue<Event, Event::Link_link>*, int*, int*) ( UnixEThread.cc:196 )
[ 08 ] traffic_server EThread::execute_regular() ( UnixEThread.cc:259 )
[ 09 ] traffic_server execute ( UnixEThread.cc:364 )
[ 10 ] traffic_server EThread::execute() ( UnixEThread.cc:342 )
[ 11 ] traffic_server spawn_thread_internal ( Thread.cc:91 )
[ 12 ] libpthread-2.17.so start_thread ( pthread_create.c:307 )
I think these lines of code may potentially cause this crash: https://github.com/SolidWallOfCode/txn_box/blob/dev-0-4/plugin/src/ts_util.cc#L955 https://github.com/SolidWallOfCode/txn_box/blob/dev-0-4/plugin/src/ts_util.cc#L981
The continuation has the only Ptr reference to the mutex. So, when the continuation is destroyed, the reference is cleared, causing the mutex to be destroyed, while it's still locked.
I think these lines of code may potentially cause this crash: https://git.ouryahoo.com/mirror-github/SolidWallOfCode--txn_box/blob/0.4.5/plugin/src/ts_util.cc#L1030 https://git.ouryahoo.com/mirror-github/SolidWallOfCode--txn_box/blob/0.4.5/plugin/src/ts_util.cc#L1003
The continuation has the only Ptr reference to the mutex. So, when the continuation is destroyed, the reference is cleared, causing the mutex to be destroyed, while it's still locked.
I think these links are internal to yahoo only.
I think these links are internal to yahoo only.
Oops, fixed.
[approve ci autest]
This pull request has been automatically marked as stale because it has not had recent activity. Marking it stale to flag it for further consideration by the community.