trafficserver
trafficserver copied to clipboard
quiche CI build failing
It's been failing for quite some time now, this is the list of failing tests.
- [ ] active_timeout (@brbzull0 - works locally(using quictls, ubuntu 20.04)
- [ ] ja3_fingerprint
- [ ] proxy_protocol
- [ ] sigusr2
- [ ] stek_share
- [ ] tls_0rtt_server
- [ ] tls_client_verify3
- [ ] tls_sni_ip_allow
- [ ] tls_sni_with_port
- [ ] tls_ticket
- [ ] tls_tunnel
- [ ] tls_tunnel_forward
- [ ] tunnel_transform
- [ ] txn_type (@brbzull0 - works locally(using quictls, ubuntu 20.04) - fail with bssl.
- [ ] via (https://github.com/apache/trafficserver/issues/9783)
I will be trying them all locally and on the ci docker till we sort them all out. If you want to pick any autests and try it out, please add the notes on the side of the test name.
active_timeout
there seems to be a crash un rockylinux, could be related to some library version, etc:
[Jan 22 15:35:32.154] [ET_NET 10] DIAG: <QUICNetVConnection.cc:373 (handle_received_packet)> (v_quic_net) [] failed to process packet: -9
[Jan 22 15:35:32.154] [ET_NET 10] DIAG: <QUICNetVConnection.cc:664 (_handle_interval)> (quic_net) [] is_app=0 error_code=10 reason=
[Jan 22 15:35:32.154] [ET_NET 11] DIAG: <QUICNetVConnection.cc:373 (handle_received_packet)> (v_quic_net) [] failed to process packet: -9
[Jan 22 15:35:32.155] [ET_NET 11] DIAG: <QUICNetVConnection.cc:664 (_handle_interval)> (quic_net) [] is_app=0 error_code=10 reason=
[Jan 22 15:35:33.982] [ET_NET 10] DIAG: <QUICNetVConnection.cc:118 (free_thread)> (quic_net) [] Free connection
[Jan 22 15:35:33.989] [ET_NET 11] DIAG: <QUICNetVConnection.cc:118 (free_thread)> (quic_net) [] Free connection
[Jan 22 15:35:34.056] [ET_UDP 0] DIAG: <QUICNetVConnection.cc:102 (destroy)> (quic_net) [] Destroy connection
[Jan 22 15:35:34.056] [ET_UDP 0] DIAG: <QUICNetVConnection.cc:102 (destroy)> (quic_net) [] Destroy connection
[Jan 22 15:35:40.264] [ET_UDP 0] DIAG: <QUICPacketHandler.cc:267 (_recv_packet)> (quic_sec) [e91af5e3-224fe664] client initial dcid=0x224fe66471f09d1fbc1fc9611a6fbc891ff7
[Jan 22 15:35:40.265] [ET_UDP 0] DIAG: <QUICPacketHandler.cc:267 (_recv_packet)> (quic_sec) [e91af5e3-224fe664] client initial dcid=0x224fe66471f09d1fbc1fc9611a6fbc891ff7
[Jan 22 15:35:40.265] [ET_NET 0] DIAG: <QUICNextProtocolAccept.cc:53 (mainEvent)> (v_quic) [] event 202 netvc 0x7f1761243220
[Jan 22 15:35:40.265] [ET_NET 1] DIAG: <QUICNextProtocolAccept.cc:53 (mainEvent)> (v_quic) [] event 202 netvc 0x7f1761242bd0
[Jan 22 15:35:40.266] [ET_NET 0] DIAG: <QUICNetVConnection.cc:373 (handle_received_packet)> (v_quic_net) [] failed to process packet: -9
[Jan 22 15:35:40.266] [ET_NET 0] DIAG: <QUICNetVConnection.cc:664 (_handle_interval)> (quic_net) [] is_app=0 error_code=10 reason=
[Jan 22 15:35:40.266] [ET_NET 1] DIAG: <QUICNetVConnection.cc:373 (handle_received_packet)> (v_quic_net) [] failed to process packet: -9
[Jan 22 15:35:40.266] [ET_NET 1] DIAG: <QUICNetVConnection.cc:664 (_handle_interval)> (quic_net) [] is_app=0 error_code=10 reason=
[Jan 22 15:35:41.972] [ET_NET 0] DIAG: <QUICNetVConnection.cc:118 (free_thread)> (quic_net) [] Free connection
[Jan 22 15:35:42.003] [ET_NET 1] DIAG: <QUICNetVConnection.cc:118 (free_thread)> (quic_net) [] Free connection
[Jan 22 15:35:42.068] [ET_UDP 0] DIAG: <QUICNetVConnection.cc:102 (destroy)> (quic_net) [] Destroy connection
[Jan 22 15:35:42.068] [ET_UDP 0] DIAG: <QUICNetVConnection.cc:102 (destroy)> (quic_net) [] Destroy connection
traffic_server: received signal 11 (Segmentation fault)
traffic_server - STACK TRACE:
/tmp/sandbox/active_timeout/ts/bin/traffic_server(_Z19crash_logger_invokeiP9siginfo_tPv+0xce)[0x8dc520]
/lib64/libpthread.so.0(+0x12cf0)[0x7f178a07ccf0]
/lib64/libc.so.6(cfree+0x21)[0x7f1789d40e31]
/tmp/sandbox/active_timeout/ts/bin/traffic_server(_Z8ats_freePv+0x28)[0x91700f]
/tmp/sandbox/active_timeout/ts/bin/traffic_server(_ZN16HttpConfigParamsD1Ev+0x2c)[0x8e4a1a]
/lib64/libc.so.6(+0x5126c)[0x7f1789cf626c]
/lib64/libc.so.6(on_exit+0x0)[0x7f1789cf63a0]
/lib64/libc.so.6(__libc_start_main+0xec)[0x7f1789cdfd8c]
/tmp/sandbox/active_timeout/ts/bin/traffic_server(_start+0x2e)[0x8dba7e]
txn_type
( and some others ) seems to be related to bssl.
rocky8:
#0 __libc_write (nbytes=218, buf=0x7f1ff4029788, fd=54) at ../sysdeps/unix/sysv/linux/write.c:26
#1 __libc_write (fd=54, buf=0x7f1ff4029788, nbytes=218) at ../sysdeps/unix/sysv/linux/write.c:24
#2 0x0000000000b07734 in SocketManager::write (fd=54, buf=0x7f1ff4029788, size=218) at /home/jenkins/trafficserver/src/iocore/dns/../eventsystem/P_UnixSocketManager.h:142
#3 0x0000000000c6e65e in fastopen_bwrite (bio=0x7f1ff4029ef8, in=0x7f1ff4029788 "\026\003\001", insz=218) at /home/jenkins/trafficserver/src/iocore/net/BIO_fastopen.cc:131
#4 0x00007f2029ded799 in BIO_write (bio=0x7f1ff4029ef8, in=<optimized out>, inl=<optimized out>) at /root/build_h3_tools/boringssl/crypto/bio/bio.c:173
#5 0x00007f202a1bf789 in bssl::tls_flush_flight (ssl=0x7f1ff402afe8) at /root/build_h3_tools/boringssl/ssl/s3_both.cc:333
#6 0x00007f202a1b6a3a in bssl::ssl_run_handshake (hs=0x7f1ff402fd48, out_early_return=out_early_return@entry=0x7f202560266f) at /root/build_h3_tools/boringssl/ssl/handshake.cc:602
#7 0x00007f202a1d028a in SSL_do_handshake (ssl=0x7f1ff402afe8) at /root/build_h3_tools/boringssl/ssl/ssl_lib.cc:865
#8 SSL_do_handshake (ssl=0x7f1ff402afe8) at /root/build_h3_tools/boringssl/ssl/ssl_lib.cc:849
#9 0x0000000000c13f5e in SSLNetVConnection::_ssl_connect (this=0x7f202b0d83c0) at /home/jenkins/trafficserver/src/iocore/net/SSLNetVConnection.cc:2462
#10 0x0000000000c1155e in SSLNetVConnection::sslClientHandShakeEvent (this=0x7f202b0d83c0, err=@0x7f2025602fe8: 0) at /home/jenkins/trafficserver/src/iocore/net/SSLNetVConnection.cc:1608
#11 0x0000000000c105e1 in SSLNetVConnection::sslStartHandShake (this=0x7f202b0d83c0, event=1, err=@0x7f2025602fe8: 0) at /home/jenkins/trafficserver/src/iocore/net/SSLNetVConnection.cc:1272
#12 0x0000000000c0d548 in SSLNetVConnection::net_read_io (this=0x7f202b0d83c0, nh=0x7f202675aba0, lthread=0x7f202675a010) at /home/jenkins/trafficserver/src/iocore/net/SSLNetVConnection.cc:611
#13 0x0000000000c8c3da in NetHandler::process_ready_list (this=0x7f202675aba0) at /home/jenkins/trafficserver/src/iocore/net/NetHandler.cc:276
#14 0x0000000000c8c733 in NetHandler::waitForActivity (this=0x7f202675aba0, timeout=10000000) at /home/jenkins/trafficserver/src/iocore/net/NetHandler.cc:364
#15 0x0000000000cc6fd9 in EThread::execute_regular (this=0x7f202675a010) at /home/jenkins/trafficserver/src/iocore/eventsystem/UnixEThread.cc:299
#16 0x0000000000cc716f in EThread::execute (this=0x7f202675a010) at /home/jenkins/trafficserver/src/iocore/eventsystem/UnixEThread.cc:348
#17 0x0000000000cc5af1 in spawn_thread_internal (a=0x205beb0) at /home/jenkins/trafficserver/src/iocore/eventsystem/Thread.cc:68
#18 0x00007f20283e41ca in start_thread (arg=<optimized out>) at pthread_create.c:479
#19 0x00007f2028050e73 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
stek_share:
[Mar 20 14:07:18.869] traffic_server NOTE: plugin.config loading ...
[Mar 20 14:07:18.869] traffic_server NOTE: loading plugin '/tmp/sandbox/stek_share/ts1/plugin/stek_share.so'
[Mar 20 14:07:18.873] traffic_server ERROR: unable to load '/tmp/sandbox/stek_share/ts1/plugin/stek_share.so': /opt/lib/libnuraft.so: undefined symbol: SSL_ctrl
[Mar 20 14:07:18.873] traffic_server FATAL: unable to load '/tmp/sandbox/stek_share/ts1/plugin/stek_share.so': /opt/lib/libnuraft.so: undefined symbol: SSL_ctrl
Could be an issue in the way we build nuraft
I did try that, using the same CI commands to build it and found the same error. Weird this does not happens in other CI's
stek_share:
[Mar 20 14:07:18.869] traffic_server NOTE: plugin.config loading ... [Mar 20 14:07:18.869] traffic_server NOTE: loading plugin '/tmp/sandbox/stek_share/ts1/plugin/stek_share.so' [Mar 20 14:07:18.873] traffic_server ERROR: unable to load '/tmp/sandbox/stek_share/ts1/plugin/stek_share.so': /opt/lib/libnuraft.so: undefined symbol: SSL_ctrl [Mar 20 14:07:18.873] traffic_server FATAL: unable to load '/tmp/sandbox/stek_share/ts1/plugin/stek_share.so': /opt/lib/libnuraft.so: undefined symbol: SSL_ctrl
Could be an issue in the way we build
nuraft
I did try that, using the same CI commands to build it and found the same error. Weird this does not happens in other CI's
Yeah, I'm tinkering around to see I can get a nuraft build with boringssl, getting some build errors for now.
I have a fix for the stek_share
plugin now, ~I'll reference a PR once that's ready.~ #11185
I have a fix for the
stek_share
plugin now, ~I'll reference a PR once that's ready.~ #11185
Latest master quiche autests are passing now: https://ci.trafficserver.apache.org/view/master/job/master/job/quiche/315/consoleFull
Running Test stek_share:.. Passed
The cause of 0rtt test failure is probably https://github.com/apache/trafficserver/issues/11203
Closing this issue. All fixed now. Thanks to everyone!