trafficserver
trafficserver copied to clipboard
Intermittent crash with ssntxnorder_verify.so (AuTest openclose_h2)
The current master branch intermittently crashes with ssntxnorder_verify.so which is a plugin for the openclose_h2 AuTest.
backtrace
(lldb) bt
* thread #3, name = '[ET_NET 1]', stop reason = signal SIGABRT
* frame #0: 0x00007fff204d3946 libsystem_kernel.dylib`__pthread_kill + 10
frame #1: 0x00007fff20502615 libsystem_pthread.dylib`pthread_kill + 263
frame #2: 0x00007fff20457411 libsystem_c.dylib`abort + 120
frame #3: 0x00000001011f12de libtscore.10.dylib`ink_abort(message_format="%s:%d: failed assertion `%s`") at ink_error.cc:99:3
frame #4: 0x00000001011ed58c libtscore.10.dylib`::_ink_assert(expression="false", file="ProxySession.cc", line=129) at ink_assert.cc:37:3
frame #5: 0x00000001003940e4 traffic_server`ProxySession::state_api_callout(this=0x0000000106810800, event=104, data=0x000000017a01fff0) at ProxySession.cc:129:5
frame #6: 0x000000010000664e traffic_server`Continuation::handleEvent(this=0x0000000106810800, event=104, data=0x000000017a01fff0) at I_Continuation.h:219:12
frame #7: 0x000000010043263e traffic_server`read_signal_and_update(event=104, vc=0x000000017a01fe00) at UnixNetVConnection.cc:83:24
frame #8: 0x000000010043254d traffic_server`read_signal_done(event=104, nh=0x000000011814c140, vc=0x000000017a01fe00) at UnixNetVConnection.cc:150:7
frame #9: 0x0000000100432513 traffic_server`UnixNetVConnection::readSignalDone(this=0x000000017a01fe00, event=104, nh=0x000000011814c140) at UnixNetVConnection.cc:1037:11
frame #10: 0x00000001003cd1cf traffic_server`SSLNetVConnection::net_read_io(this=0x000000017a01fe00, nh=0x000000011814c140, lthread=0x0000000118148000) at SSLNetVConnection.cc:734:5
frame #11: 0x000000010041f67d traffic_server`NetHandler::process_ready_list(this=0x000000011814c140) at UnixNet.cc:415:11
frame #12: 0x000000010041fdc5 traffic_server`NetHandler::waitForActivity(this=0x000000011814c140, timeout=60000000) at UnixNet.cc:546:3
frame #13: 0x0000000100464ddb traffic_server`EThread::execute_regular(this=0x0000000118148000) at UnixEThread.cc:303:14
frame #14: 0x0000000100465225 traffic_server`EThread::execute(this=0x0000000118148000) at UnixEThread.cc:364:11
frame #15: 0x0000000100463520 traffic_server`spawn_thread_internal(a=0x0000000101c15e70) at Thread.cc:92:12
frame #16: 0x00007fff20502954 libsystem_pthread.dylib`_pthread_start + 224
frame #17: 0x00007fff204fe4a7 libsystem_pthread.dylib`thread_start + 15
client
h2load -n 16 -m 1 -c 16 -t 8 https://127.0.0.1:4443/
ATS
plugin.config
ssntxnorder_verify.so
I'm not sure this is related or not, but this do_io_read() call breaks the relation of Http2ClientSession and Http2Stream.
* thread #3, name = '[ET_NET 1]', stop reason = step over
frame #0: 0x00000001000d49b2 traffic_server`HttpSM::state_add_to_list(this=0x000000010780dc00, event=0, (null)=0x0000000000000000) at HttpSM.cc:472:28
469 // Preserve the current read cont and mutex
470 NetVConnection *netvc = ((ProxyTransaction *)ua_entry->vc)->get_netvc();
471 ink_assert(netvc != nullptr);
-> 472 ua_entry->read_vio = ua_entry->vc->do_io_read(netvc->read_vio_cont(), 0, nullptr);
473 }
474 return EVENT_CONT;
475 }
Target 0: (traffic_server) stopped.
(lldb) p ua_entry->vc
(Http2Stream *) $0 = 0x0000000107808e00
(lldb) p netvc->read_vio_cont()
(Http2ClientSession *) $1 = 0x0000000107809e00
#7586 ?
Thanks, I haven't noticed that. It's fixing the odd do_io_read() call at least. I'll try the change and review it.
Unfortunately, #7586 still crashes 😢
I'm a bit confused. It is crashing the openclose_h2 autest? Or if you include the ssntxnorder_verify.so test plugin and run h2load against it it will crash eventually? I assume the later and the compiler optimizations have eaten the frame headers specifically associated with ssntxnorder_verify.so
Later. I got crash when I run h2load against ATS with the ssntxnorder_verify.so plugin (debug build).
It looks like the Http2ClientSession got EOS on ProxySession::state_api_callout, due to the _vc->do_io_close() is not called before Http2ClientSession::destroy() is called in some path.
Making sure that disabling read/write of VC at Http2ClientSession::destroy() could be a workaround fix. I confirmed below change can fix the crash.
But it looks like this is one of the issues of the closing path is complicated. I'd like to cover this case by #7845.
@@ -72,6 +72,11 @@ Http2ClientSession::Http2ClientSession() : super() {}
void
Http2ClientSession::destroy()
{
+ if (_vc) {
+ _vc->do_io_close();
+ _vc = nullptr;
+ }
+
The workaround is something similar to what I tried to do on #7595, which couldn't pass some of autests. I agree that a complete fix would be something like #7845.
This issue has been automatically marked as stale because it has not had recent activity. Marking it stale to flag it for further consideration by the community.