Crashes (SIGABRT) Under Test Automation
Environment
Operating System
Intel Mac, x86_64
ProductName: macOS
ProductVersion: 13.4.1
ProductVersionExtra: (c)
BuildVersion: 22F770820d
TinyTDS Version and Information
[TinyTds][v2.1.5][tsql]: /usr/local/bin/tsql
Compile-time settings (established with the "configure" script)
Version: freetds v1.3.20
freetds.conf directory: /usr/local/etc
MS db-lib source compatibility: no
Sybase binary compatibility: yes
Thread safety: yes
iconv library: yes
TDS version: 7.3
iODBC: no
unixodbc: yes
SSPI "trusted" logins: no
Kerberos: yes
OpenSSL: yes
GnuTLS: no
MARS: yes
I'm actually pointing at a post-2.1.5 commit on tiny_tds: 23ed1e4, where #527 was merged.
FreeTDS Version
freetds: 1.3.20
SQL Server
Running against 2017 for Linux in a Docker container via Docker Desktop for Mac. SQL Server Authentication.
Description
In the course of trying to upgrade an application from Ruby 2.5/Rails 5.2.x to Ruby 3.1/Rails 6.1.x, I have noticed a significant increase in automated test failures (we're using rspec-rails). These failures seem take a couple of forms:
1. An EXC_BAD_ACCESS (SIGABRT).
Terminal output:
<big long memory region dump trimmed>
Abort trap: 6
OS Crash Report:
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libsystem_kernel.dylib 0x7ff8092e61f2 __pthread_kill + 10
1 libsystem_pthread.dylib 0x7ff80931dee6 pthread_kill + 263
2 libsystem_c.dylib 0x7ff809244b45 abort + 123
3 libruby.3.1.dylib 0x106e845e9 die + 9
4 libruby.3.1.dylib 0x106e84810 rb_bug_for_fatal_signal + 544
5 libruby.3.1.dylib 0x106fb475b sigsegv + 91
6 libsystem_platform.dylib 0x7ff80934b5ed _sigtramp + 29
7 ??? 0x0 ???
8 tiny_tds.bundle 0x10c34543e nogvl_dbsqlok + 88 (result.c:130) [inlined]
9 tiny_tds.bundle 0x10c34543e rb_tinytds_result_ok_helper + 109 (result.c:179) [inlined]
10 tiny_tds.bundle 0x10c34543e rb_tinytds_result_exec_helper + 126 (result.c:185)
11 tiny_tds.bundle 0x10c344528 rb_tinytds_result_do + 56 (result.c:528)
12 libruby.3.1.dylib 0x10704864d vm_call_cfunc_with_frame + 349
<snip>
Thread 1:
0 libsystem_kernel.dylib 0x7ff8092e629e poll + 10
1 libruby.3.1.dylib 0x1070004ac timer_pthread_fn + 140
2 libsystem_pthread.dylib 0x7ff80931e1d3 _pthread_start + 125
3 libsystem_pthread.dylib 0x7ff809319bd3 thread_start + 15
Thread 2:: connection_pool.rb:323
0 libsystem_kernel.dylib 0x7ff8092e629e poll + 10
1 libruby.3.1.dylib 0x106ff4acd rb_sigwait_sleep + 557
2 libruby.3.1.dylib 0x106ff5fb4 native_sleep + 500
3 libruby.3.1.dylib 0x106ff6b9b sleep_hrtime + 331
4 libruby.3.1.dylib 0x106f5f475 rb_f_sleep + 85
5 libruby.3.1.dylib 0x10704864d vm_call_cfunc_with_frame + 349
<snip>
Thread 3:: worker-1
0 libsystem_kernel.dylib 0x7ff8092e20ee __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x7ff80931e758 _pthread_cond_wait + 1242
2 libruby.3.1.dylib 0x106fffde5 gvl_acquire_common + 309
3 libruby.3.1.dylib 0x106ff7648 blocking_region_end + 216
4 libruby.3.1.dylib 0x106ff716f rb_nogvl + 175
5 tiny_tds.bundle 0x10c345433 nogvl_dbsqlok + 77 (result.c:129) [inlined]
6 tiny_tds.bundle 0x10c345433 rb_tinytds_result_ok_helper + 98 (result.c:179) [inlined]
7 tiny_tds.bundle 0x10c345433 rb_tinytds_result_exec_helper + 115 (result.c:185)
8 tiny_tds.bundle 0x10c344528 rb_tinytds_result_do + 56 (result.c:528)
9 libruby.3.1.dylib 0x10704864d vm_call_cfunc_with_frame + 349
<snip>
I'm pretty sure this was a simple unit test, given the running threads in the Ruby process. One thing that immediately catches my eye is that there appear to be two threads in the TinyTDS C code at the same time. I'm also curious about the nogvl_* functions, but only b/c of their name. I'm not sure I fully understand their purpose.
2. An EXC_CRASH (SIGABRT).
Terminal output:
/Users/clintmiller/.asdf/installs/ruby/3.1.4/lib/ruby/gems/3.1.0/gems/activerecord-sqlserver-adapter-6.1.3.0/lib/active_record/connection_adapters/sqlserver/database_statements.rb:463: warning: TinyTds: dbsqlsend() returned FAIL.
Assertion failed: (conn->in_net_tds == NULL), function tds_free_connection, file mem.c, line 1209.
OS Crash Report:
Thread 0:: Dispatch queue: com.apple.main-thread
0 libsystem_kernel.dylib 0x7ff8092e629e poll + 10
1 libsybdb.5.dylib 0x1101c3ba6 tds_select + 221
2 libsybdb.5.dylib 0x1101c8419 tds_connection_network + 110
3 libsybdb.5.dylib 0x1101c8212 tds_read_packet + 235
4 libsybdb.5.dylib 0x1101afb94 tds_get_byte + 25
5 libsybdb.5.dylib 0x1101a8acf tds_process_tokens + 183
6 libsybdb.5.dylib 0x1101953e5 dbsqlok + 175
7 libruby.3.1.dylib 0x10ad8b158 rb_nogvl + 152
8 tiny_tds.bundle 0x11011c433 nogvl_dbsqlok + 77 (result.c:129) [inlined]
9 tiny_tds.bundle 0x11011c433 rb_tinytds_result_ok_helper + 98 (result.c:179) [inlined]
10 tiny_tds.bundle 0x11011c433 rb_tinytds_result_exec_helper + 115 (result.c:185)
11 tiny_tds.bundle 0x11011b528 rb_tinytds_result_do + 56 (result.c:528)
12 libruby.3.1.dylib 0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 1:
0 libsystem_kernel.dylib 0x7ff8092e629e poll + 10
1 libruby.3.1.dylib 0x10ad944ac timer_pthread_fn + 140
2 libsystem_pthread.dylib 0x7ff80931e1d3 _pthread_start + 125
3 libsystem_pthread.dylib 0x7ff809319bd3 thread_start + 15
Thread 2:: connection_pool.rb:323
0 libsystem_kernel.dylib 0x7ff8092e629e poll + 10
1 libruby.3.1.dylib 0x10ad88acd rb_sigwait_sleep + 557
2 libruby.3.1.dylib 0x10ad89fb4 native_sleep + 500
3 libruby.3.1.dylib 0x10ad8ab9b sleep_hrtime + 331
4 libruby.3.1.dylib 0x10acf3475 rb_f_sleep + 85
5 libruby.3.1.dylib 0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 3:: Timeout stdlib thread
0 libsystem_kernel.dylib 0x7ff8092e20ee __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x7ff80931e758 _pthread_cond_wait + 1242
2 libruby.3.1.dylib 0x10ad92c06 native_cond_sleep + 502
3 libruby.3.1.dylib 0x10ad89e30 native_sleep + 112
4 libruby.3.1.dylib 0x10ad8a8b9 sleep_forever + 457
5 libruby.3.1.dylib 0x10ad95c17 queue_sleep + 39
6 libruby.3.1.dylib 0x10ac24ece rb_ensure + 350
7 libruby.3.1.dylib 0x10ad95b62 queue_do_pop + 226
8 libruby.3.1.dylib 0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 4:: server.rb:76
0 libsystem_kernel.dylib 0x7ff8092e20ee __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x7ff80931e758 _pthread_cond_wait + 1242
2 libruby.3.1.dylib 0x10ad92c06 native_cond_sleep + 502
3 libruby.3.1.dylib 0x10ad89e30 native_sleep + 112
4 libruby.3.1.dylib 0x10ad94276 thread_join_sleep + 230
5 libruby.3.1.dylib 0x10ac24ece rb_ensure + 350
6 libruby.3.1.dylib 0x10ad940bd thread_join + 157
7 libruby.3.1.dylib 0x10ad90d6e thread_join_m + 238
8 libruby.3.1.dylib 0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 5:: puma reactor
0 libsystem_kernel.dylib 0x7ff8092e41ee kevent + 10
1 nio4r_ext.bundle 0x1191c6c69 kqueue_poll + 281 (ev_kqueue.c:102)
2 nio4r_ext.bundle 0x1191c3f55 ev_backend_poll + 21 (ev.c:4026)
3 libruby.3.1.dylib 0x10ad8b158 rb_nogvl + 152
4 nio4r_ext.bundle 0x1191c33b0 ev_run + 768 (ev.c:4216)
5 nio4r_ext.bundle 0x1191c902a NIO_Selector_run + 86 (selector.c:476) [inlined]
6 nio4r_ext.bundle 0x1191c902a NIO_Selector_select_synchronized + 186 (selector.c:431)
7 libruby.3.1.dylib 0x10ac24ece rb_ensure + 350
8 nio4r_ext.bundle 0x1191c8775 NIO_Selector_select + 101 (selector.c:409)
9 libruby.3.1.dylib 0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 6:: puma threadpool reaper
0 libsystem_kernel.dylib 0x7ff8092e20ee __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x7ff80931e758 _pthread_cond_wait + 1242
2 libruby.3.1.dylib 0x10ad92bd7 native_cond_sleep + 455
3 libruby.3.1.dylib 0x10ad89e30 native_sleep + 112
4 libruby.3.1.dylib 0x10ad8ab9b sleep_hrtime + 331
5 libruby.3.1.dylib 0x10acf3475 rb_f_sleep + 85
6 libruby.3.1.dylib 0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 7:: puma threadpool trimmer
0 libsystem_kernel.dylib 0x7ff8092e20ee __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x7ff80931e758 _pthread_cond_wait + 1242
2 libruby.3.1.dylib 0x10ad92bd7 native_cond_sleep + 455
3 libruby.3.1.dylib 0x10ad89e30 native_sleep + 112
4 libruby.3.1.dylib 0x10ad8ab9b sleep_hrtime + 331
5 libruby.3.1.dylib 0x10acf3475 rb_f_sleep + 85
6 libruby.3.1.dylib 0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 8:: puma server
0 libsystem_kernel.dylib 0x7ff8092e8282 __select + 10
1 libruby.3.1.dylib 0x10ad8d8b9 do_select + 425
2 libruby.3.1.dylib 0x10ac24ece rb_ensure + 350
3 libruby.3.1.dylib 0x10ad8d5ca rb_thread_fd_select + 1178
4 libruby.3.1.dylib 0x10ac6b5dd select_call + 1293
5 libruby.3.1.dylib 0x10ac24ece rb_ensure + 350
6 libruby.3.1.dylib 0x10ac5eb6c rb_f_select + 220
7 libruby.3.1.dylib 0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 9:: worker.rb:96
0 libsystem_kernel.dylib 0x7ff8092e20ee __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x7ff80931e758 _pthread_cond_wait + 1242
2 libruby.3.1.dylib 0x10ad92c06 native_cond_sleep + 502
3 libruby.3.1.dylib 0x10ad89e30 native_sleep + 112
4 libruby.3.1.dylib 0x10ad8a8b9 sleep_forever + 457
5 libruby.3.1.dylib 0x10ad95c17 queue_sleep + 39
6 libruby.3.1.dylib 0x10ac24ece rb_ensure + 350
7 libruby.3.1.dylib 0x10ad95b62 queue_do_pop + 226
8 libruby.3.1.dylib 0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 10:: worker-1
0 libsystem_kernel.dylib 0x7ff8092e20ee __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x7ff80931e758 _pthread_cond_wait + 1242
2 libruby.3.1.dylib 0x10ad92c06 native_cond_sleep + 502
3 libruby.3.1.dylib 0x10ad89e30 native_sleep + 112
4 libruby.3.1.dylib 0x10ad8a8b9 sleep_forever + 457
5 libruby.3.1.dylib 0x10ad95c17 queue_sleep + 39
6 libruby.3.1.dylib 0x10ac24ece rb_ensure + 350
7 libruby.3.1.dylib 0x10ad95b62 queue_do_pop + 226
8 libruby.3.1.dylib 0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 11:: worker-2
0 libsystem_kernel.dylib 0x7ff8092e20ee __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x7ff80931e758 _pthread_cond_wait + 1242
2 libruby.3.1.dylib 0x10ad92c06 native_cond_sleep + 502
3 libruby.3.1.dylib 0x10ad89e30 native_sleep + 112
4 libruby.3.1.dylib 0x10ad8a8b9 sleep_forever + 457
5 libruby.3.1.dylib 0x10ad95c17 queue_sleep + 39
6 libruby.3.1.dylib 0x10ac24ece rb_ensure + 350
7 libruby.3.1.dylib 0x10ad95b62 queue_do_pop + 226
8 libruby.3.1.dylib 0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 12:: worker-1
0 libsystem_kernel.dylib 0x7ff8092e20ee __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x7ff80931e758 _pthread_cond_wait + 1242
2 libruby.3.1.dylib 0x10ad92c06 native_cond_sleep + 502
3 libruby.3.1.dylib 0x10ad89e30 native_sleep + 112
4 libruby.3.1.dylib 0x10ad8a8b9 sleep_forever + 457
5 libruby.3.1.dylib 0x10ad95c17 queue_sleep + 39
6 libruby.3.1.dylib 0x10ac24ece rb_ensure + 350
7 libruby.3.1.dylib 0x10ad95b62 queue_do_pop + 226
8 libruby.3.1.dylib 0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 13:: worker-1
0 libsystem_kernel.dylib 0x7ff8092e20ee __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x7ff80931e758 _pthread_cond_wait + 1242
2 libruby.3.1.dylib 0x10ad92c06 native_cond_sleep + 502
3 libruby.3.1.dylib 0x10ad89e30 native_sleep + 112
4 libruby.3.1.dylib 0x10ad8a8b9 sleep_forever + 457
5 libruby.3.1.dylib 0x10ad95c17 queue_sleep + 39
6 libruby.3.1.dylib 0x10ac24ece rb_ensure + 350
7 libruby.3.1.dylib 0x10ad95b62 queue_do_pop + 226
8 libruby.3.1.dylib 0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 14 Crashed:: worker-2
0 libsystem_kernel.dylib 0x7ff8092e61f2 __pthread_kill + 10
1 libsystem_pthread.dylib 0x7ff80931dee6 pthread_kill + 263
2 libsystem_c.dylib 0x7ff809244b45 abort + 123
3 libsystem_c.dylib 0x7ff809243e5e __assert_rtn + 314
4 libsybdb.5.dylib 0x1101ce32f tds_free_connection.cold.1 + 35
5 libsybdb.5.dylib 0x1101a6509 tds_free_connection + 368
6 libsybdb.5.dylib 0x1101a671f tds_free_socket + 215
7 libsybdb.5.dylib 0x110194a27 dbclose + 239
8 tiny_tds.bundle 0x1101197af rb_tinytds_close + 63 (client.c:266)
9 libruby.3.1.dylib 0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Obviously, this case with more threads is a system/integration test that started a webserver.
Other Notes
It's interesting to note, that the application boots and runs in my development environment w/o any issues. I'm not sure how it will do with a production workload given the crashes I'm seeing in our test suite.
Next Steps
Making a minimal reproduction test-case would be useful, but I'm not quite sure where to start. On Ruby 2.5/Rails 5.2, our test suite rarely (if ever), crashed in this manner. Is there any other additional information I can provide or debug builds of TinyTDS I could use to reveal more helpful information?
Hi, couple of questions:
- Against which SQL server do you run the test suite? What kind of of authentication do you run?
- The crash reports seem from a Mac system. With test automation, do you mean just the test suite of your project or do you run automated tests against Mac on a continuous integration system, like GitHub Actions?
- If you update your
tiny_tdsto the mentioned commit23ed1e4before you did the Ruby and Rails upgrade, can you reproduce these issues? - Do you run ARM or Intel on your Mac?
Updated above w/ answers- also copied here for convenience, thank you @andyundso.
Against which SQL server do you run the test suite? What kind of of authentication do you run?
Running against 2017 for Linux in a Docker container via Docker Desktop for Mac. SQL Server Authentication. I don't think I'm having an authentication problem as my app (and tests) are able to connect to the database and execute queries.
The crash reports seem from a Mac system. With test automation, do you mean just the test suite of your project or do you run automated tests against Mac on a continuous integration system, like GitHub Actions?
By "test automation", I simply meant the execution of the test suite for this project. As part of this upgrade process, I will get the suite running in an automated CI system, either w/ updates to our existing, but dated Jenkins CI installation, or in GitHub Actions. I have not done this yet.
If you update your tiny_tds to the mentioned commit 23ed1e4 before you did the Ruby and Rails upgrade, can you reproduce these issues?
I will try this against our Rails 5.2 branch, assuming the newer TinyTDS is backwards compatible both w/ Rails 5.2 (should be, I think), and Ruby 2.5.x (??).
Do you run ARM or Intel on your Mac?
This is on an Intel Mac.
Disclaimer: I'm unfamiliar with the C code in tiny_tds, just trying to help out :).
I also own an Intel Mac, the same OS version as yours, the same FreeTDS version. I maintain two Rails projects, both with RSpec, but one runs Ruby 3.0, the other 3.1. I checked if I also ran into these issues when using the version of tiny_tds you mentioned in your issue and the SQL server 2017 in Docker. I have run about ten runs on both apps so far with no problems. Are your crashes more infrequent than every tenth run?
In general, from what I read in the code, the nogvl stuff executes code on a separate Ruby thread but releases the Global VM Lock, as we no longer execute Ruby code, but instead talk with FreeTDS, who then talks to MSSQL. Once the function finishes, it re-acquires the GVL.
EXC_BAD_ACCESS means we write memory somewhere where we should not. EXC_CRASH reacts to a SIGABRT, likely raised of freetds because of the assertion error that it hits.
Is there some consistency when you hit these errors? For example, I assume you run your test suite in random order. Once it crashes and you re-run the test suite with the same seed, does it crash again with the same error?
As mentioned, it would be interesting to know if you also hit this error when updating tiny_tds, but run the Ruby 2.5 / Rails 5.2 version of your app. tiny_tds should work with Ruby 2.5, at least our CI is green :)
similar error when in rails console on Ruby 3.1.2, Rails 7.0, TinyTDS 2.1.5, freetds v1.3.20 on M1 mac running against various MS Sql Server servers e.g. Microsoft SQL Server 2016 (SP3-CU1-GDR) (KB5021128) - 13.0.7024.30 (X64)
usually after I've made a bunch of queries and have had the console open a while, but only on exit
Assertion failed: (conn->in_net_tds == NULL), function tds_free_connection, file mem.c, line 1209.
Abort trap: 6
I was able to reliably trigger this with a single spec example and it's beginning to form some type of working theory for me. This is an excerpt from a Rails controller spec:
describe 'GET finish' do
let(:product_download) { create(:product_download) }
let(:params) do
{ membership: product_download.membership_id, log: product_download.id, bytes: 123456 }
end
it 'closes the product download log log' do
get :finish, params: params
updated_log = ProductDownload.find(product_download.id) # <= This is the boom!
...
end
end
Test output:
Igniter::DownloadController
GET finish
/Users/clintmiller/.asdf/installs/ruby/3.1.4/lib/ruby/gems/3.1.0/gems/activerecord-sqlserver-adapter-6.1.3.0/lib/active_record/connection_adapters/sqlserver/database_statements.rb:463: warning: TinyTds: dbsqlsend() returned FAIL.
Assertion failed: (conn->in_net_tds == NULL), function tds_free_connection, file mem.c, line 1209.
Abort trap: 6
One other important detail is that my application is a multi-tenant app, where the current tenant is a per-request concept that can be switched based on criteria like the hostname requested. One of the things that tenant switching did* is switch the database connection with the Apartment gem, which we had* forked to support the activerecord-sqlserver-adapter.
Something must be happening lifetime of these database connections when this spec is run. The database connection being used by the Rails server handling the get :finish request is closed (the request is over after all?) but is then attempted to be used again by the following line? I'm not sure that's exactly it.
*Did and *had b/c we've actually removed Apartment and the tenant switching functionality that used it to change the database connections. The tenancy concept is still in place, and requests still switch tenants as before, but we aren't manipulating the database connections during the tenant switch any longer.
@clintmiller have you tried reproducing bundling against main? there may be an unreleased fix