slim
slim copied to clipboard
Bus Error on M1 mac
slim runs successfully but outputs
[1] 45373 bus error /Users/sano/work/slim/bin/slim
in the last line on Sano's M1 mac.
need investigation on other M1 macs: tbd.
実行終了後にメモリを解放する段階で bus error が起きているようだ.
とりあえず,ここはうまくいっていない: https://github.com/lmntal/slim/blob/1052925f8d3f1e8eace14cd42eaf82fb9b469df1/src/vm/ccallback.cpp#L61
が,この(これとこの次の)2行をコメントアウトしても,
slim(28531,0x104700580) malloc: *** error for object 0x1045ec000: pointer being freed was not allocated
slim(28531,0x104700580) malloc: *** set a breakpoint in malloc_error_break to debug
>>>> lmn_profiler_finalize();>>>> destroy_rules();>>>> delete lmn_functor_table;[1] 28531 abort /Users/sano/work/lmntal/slim/bin/slim test5.il
となったので,まだ調査が必要.
暫定的な解決法
main.cpp の slim_finalize();
をコメントアウトすれば(メモリリークを気にしなければ)問題なく動く.
https://github.com/lmntal/slim/blob/c3c95f9ae2c0de5e9e9eb119caafc53ee559f716/src/main.cpp#L820
これは関係するだろうか? https://github.com/ffi/ffi/issues/864
使用: Apple clang version 14.0.0 (clang-1400.0.29.202)
デバッガ(lldb)によると,
Process 77010 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 3.1
frame #0: 0x00000001000d6498 slim`st_foreach(table=0x000000010100a960, func=(slim`CCallback::free_v(unsigned long,
unsigned long, unsigned long) at ccallback.cpp:64:3), arg=0)(...), unsigned long) at st.cpp:517:26 [opt]
514 enum st_retval retval;
515 int i;
516
-> 517 for (i = 0; i < table->num_bins; i++) {
518 last = 0;
519 for (ptr = table->bins[i];
520 ptr != 0;) { /* エントリーの存在するチェインでのみfuncが呼ばれる */
Target 0: (slim) stopped.
(lldb) s
Process 77010 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
frame #0: 0x00000001000d64cc slim`st_foreach(table=0x000000010100a960, func=(slim`CCallback::free_v(unsigned long,
unsigned long, unsigned long) at ccallback.cpp:64:3), arg=0)(...), unsigned long) at st.cpp:519:23 [opt]
516
517 for (i = 0; i < table->num_bins; i++) {
518 last = 0;
-> 519 for (ptr = table->bins[i];
520 ptr != 0;) { /* エントリーの存在するチェインでのみfuncが呼ばれる */
521 retval = (enum st_retval)(*func)(ptr->key, ptr->record, arg);
522 switch (retval) {
Target 0: (slim) stopped.
(lldb)
Process 77010 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
frame #0: 0x00000001000d64f8 slim`st_foreach(table=0x000000010100a960, func=(slim`CCallback::free_v(unsigned long,
unsigned long, unsigned long) at ccallback.cpp:64:3), arg=0)(...), unsigned long) at st.cpp:521:45 [opt]
518 last = 0;
519 for (ptr = table->bins[i];
520 ptr != 0;) { /* エントリーの存在するチェインでのみfuncが呼ばれる */
-> 521 retval = (enum st_retval)(*func)(ptr->key, ptr->record, arg);
522 switch (retval) {
523 case ST_CHECK: /* check if hash is modified during iteration */
524 tmp = 0;
Target 0: (slim) stopped.
(lldb)
Process 77010 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
frame #0: 0x00000001000771f4 slim`CCallback::free_v(key=<unavailable>, v=4295455220, _t=<unavailable>) at ccallback
.cpp:64:3 [opt]
61 }
62
63 int CCallback::free_v(st_data_t key, st_data_t v, st_data_t _t) {
-> 64 delete (CCallback *)v;
65 return ST_CONTINUE;
66 }
67 /* コールバックを名前nameで登録する。arityはコールバックに引数として
Target 0: (slim) stopped.
(lldb)
Process 77010 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
frame #0: 0x00000001000cb050 slim`operator delete(p=0x00000001000771f4) at alloc.cpp:169:3 [opt]
166 }
167
168 void operator delete(void* p) noexcept {
-> 169 lmn_free(p);
170 }
171
172 void operator delete[](void* p) noexcept {
Target 0: (slim) stopped.
(lldb)
Process 77010 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
frame #0: 0x00000001000cb014 slim`lmn_free(p=0x00000001000771f4) at alloc.cpp:158:26 [opt]
155 return result;
156 }
157
-> 158 void lmn_free(void *p) { free(p); }
159
160 void* operator new(std::size_t num) {
161 return lmn_malloc(num);
Target 0: (slim) stopped.
(lldb)
slim(77010,0x1f9833a80) malloc: *** error for object 0x1000771f4: pointer being freed was not allocated
slim(77010,0x1f9833a80) malloc: *** set a breakpoint in malloc_error_break to debug
Process 77010 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
frame #0: 0x000000019e4961b0 libsystem_kernel.dylib`__pthread_kill + 8
libsystem_kernel.dylib`:
-> 0x19e4961b0 <+8>: b.lo 0x19e4961d0 ; <+40>
0x19e4961b4 <+12>: pacibsp
0x19e4961b8 <+16>: stp x29, x30, [sp, #-0x10]!
0x19e4961bc <+20>: mov x29, sp
Target 0: (slim) stopped.
(lldb)
Process 77010 exited with status = 6 (0x00000006) Terminated due to signal 6
(lldb)
ccallback_tbl
中のなんらかの要素に対してfreeしようとして,これはallocしてないぞと怒られている.
使用: gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 intelマシンでvalgrindを回してみる.
command: valgrind --tool=memcheck --leak-check=yes --show-reachable=yes
result:
==954898== Memcheck, a memory error detector
==954898== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==954898== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==954898== Command: ../lmntal/installed/bin/slim demo.il
==954898==
a(b).
==954898==
==954898== HEAP SUMMARY:
==954898== in use at exit: 224 bytes in 2 blocks
==954898== total heap usage: 1,495 allocs, 1,493 frees, 252,594 bytes allocated
==954898==
==954898== 96 bytes in 1 blocks are still reachable in loss record 1 of 2
==954898== at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==954898== by 0x4A0C990: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==954898== by 0x4A0B212: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==954898== by 0x4A0B2E4: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==954898== by 0x400647D: call_init.part.0 (dl-init.c:70)
==954898== by 0x4006567: call_init (dl-init.c:33)
==954898== by 0x4006567: _dl_init (dl-init.c:117)
==954898== by 0x40202C9: ??? (in /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
==954898== by 0x1: ???
==954898== by 0x1FFF0004D6: ???
==954898== by 0x1FFF0004F3: ???
==954898==
==954898== 128 bytes in 1 blocks are still reachable in loss record 2 of 2
==954898== at 0x484DCD3: realloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==954898== by 0x4A0C9B8: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==954898== by 0x4A21C54: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==954898== by 0x4A0B2C8: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==954898== by 0x400647D: call_init.part.0 (dl-init.c:70)
==954898== by 0x4006567: call_init (dl-init.c:33)
==954898== by 0x4006567: _dl_init (dl-init.c:117)
==954898== by 0x40202C9: ??? (in /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
==954898== by 0x1: ???
==954898== by 0x1FFF0004D6: ???
==954898== by 0x1FFF0004F3: ???
==954898==
==954898== LEAK SUMMARY:
==954898== definitely lost: 0 bytes in 0 blocks
==954898== indirectly lost: 0 bytes in 0 blocks
==954898== possibly lost: 0 bytes in 0 blocks
==954898== still reachable: 224 bytes in 2 blocks
==954898== suppressed: 0 bytes in 0 blocks
==954898==
==954898== For lists of detected and suppressed errors, rerun with: -s
==954898== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
still reachableではあるが,2箇所でメモリリークが起きているのがわかる. (追記)これはそんなに関係ない気がしてきた.
alloc,freeしているアドレスを全部printしてみたところ,最後にfreeに失敗しているアドレスは,
- 一度もallocされていない
- 二重解放では無い
- 他の全てのアドレスは末尾が0だが,これだけ末尾が0ではない.
- これはintel側では末尾が0ではないアドレスを解放しようとしたりはしていなかった.つまり,M1固有の問題?
使用: M1: Apple clang version 14.0.0 (clang-1400.0.29.202) intel: gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
デバッガ M1:
Process 37362 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
frame #0: 0x00000001000d64d0 slim`st_foreach(table=0x0000000101206220, func=(slim`CCallback::free_v(unsigned long, unsigned long, unsigned long) at ccallback.cpp:64:3), arg=0)(...), unsigned long) at st.cpp:521:45 [opt]
518 last = 0;
519 for (ptr = table->bins[i];
520 ptr != 0;) { /* エントリーの存在するチェインでのみfuncが呼ばれる */
-> 521 retval = (enum st_retval)(*func)(ptr->key, ptr->record, arg);
522 switch (retval) {
523 case ST_CHECK: /* check if hash is modified during iteration */
524 tmp = 0;
Target 0: (slim) stopped.
(lldb) v
(st_table_t) table = 0x0000000101206220
(int (*)(...)) func = 0x0000000100077158 (slim`CCallback::free_v(unsigned long, unsigned long, unsigned long) at ccallback.cpp:64:3)
(st_data_t) arg = 0
(int) i = 0
(st_table_entry *) ptr = 0x0000000100707e20
(st_table_entry *) last = nullptr
(st_table_entry *) tmp = <variable not available>
(st_retval) retval = <variable not available>
.
.
.
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
frame #0: 0x0000000100077158 slim`CCallback::free_v(key=<unavailable>, v=4295455064, _t=<unavailable>) at ccallback.cpp:64:3 [opt]
61 }
62
63 int CCallback::free_v(st_data_t key, st_data_t v, st_data_t _t) {
-> 64 delete (CCallback *)v;
65 return ST_CONTINUE;
66 }
67 /* コールバックを名前nameで登録する。arityはコールバックに引数として
Target 0: (slim) stopped.
(lldb) v
(st_data_t) key = <no location, value may have been optimized out>
(st_data_t) v = 4295455064
(st_data_t) _t = <no location, value may have been optimized out>
Process 37340 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
frame #0: 0x00000001000cb028 slim`operator delete(p=0x0000000100077158) at alloc.cpp:176:3 [opt]
173 }
174
175 void operator delete(void* p) noexcept {
-> 176 lmn_free(p);
177 }
178
179 void operator delete[](void* p) noexcept {
Target 0: (slim) stopped.
(lldb) v
(void *) p = 0x0000000100077158
intel:
(gdb) s521 retval = (enum st_retval)(*func)(ptr->key, ptr->record, arg);
(gdb) p func
$6 = (int (*)(void)) 0x5555555afab8 <CCallback::free_v(unsigned long, unsigned long, unsigned long)>
(gdb) p ptr->key
$7 = 88
(gdb) p ptr->record
$8 = 93824993358816
(gdb) p arg
$9 = 0
(gdb) p ptr
$10 = (st_table_entry *) 0x5555556709c0
(gdb)
.
.
.
CCallback::free_v (key=88, v=93824993358816, _t=0) at ccallback.cpp:65
65 int CCallback::free_v(st_data_t key, st_data_t v, st_data_t _t) {
(gdb) p v
$4 = 93824993358816
(gdb) s
66 delete (CCallback *)v;
(gdb) p v
$5 = 93824993358816
operator delete (p=0x5555556673e0) at alloc.cpp:172
172 lmn_free(p);
のように,delete (CCallback *)v;
の直前までは挙動が同じで,(CCallback *)v
の結果で差が出ているように思われる.
deleteしたpに関して,M1では未allocのアドレスが出てきたが,intelではalloc済みのpが出てきた.
(CCallback *)v
の値に関して,M1ではfuncのアドレスが出てきて,intelでは前にallocしたどこかのアドレスが出てきた.
vの値がおかしい?
vの値を追う.
st_foreach時,
intel:
st_foreach中:
Breakpoint 4, st_foreach(st_table*, int (*)(...), unsigned long) (table=0x55555566de50, func=func@entry=0x5555555afab8 <CCallback::free_v(unsigned long, unsigned long, unsigned long)>, arg=arg@entry=0) at st.cpp:512
512 int st_foreach(st_table_t table, int (*func)(ANYARGS), st_data_t arg) {
(gdb) s
517 for (i = 0; i < table->num_bins; i++) {
(gdb) s
519 for (ptr = table->bins[i];
(gdb) s
520 ptr != 0;) { /* エントリーの存在するチェインでのみfuncが呼ばれる */
(gdb) s
521 retval = (enum st_retval)(*func)(ptr->key, ptr->record, arg);
(gdb) p ptr->key
$26 = 88
(gdb) p ptr->record
$27 = 93824993358816
(gdb) p ptr
$28 = (st_table_entry *) 0x5555556709c0
(v=93824993358816) key=88の要素のinsert時:
Breakpoint 3, CCallback::lmn_register_c_fun (name=name@entry=0x55555560feba “cb_make_input_string”, f=f@entry=0x5555555f5cee <cb_make_input_string(LmnReactCxt*, LmnMembrane*, void*, unsigned char, void*, unsigned char)>, arity=arity@entry=2) at ccallback.cpp:71
71 void CCallback::lmn_register_c_fun(const char *name, void *f, int arity) {
(gdb)
Continuing.
malloc: 0x5555556673e0
malloc: 0x555555670960
malloc: 0x555555670990
Breakpoint 1, st_insert (table=0x55555566de50, key=88, value=value@entry=93824993358816) at st.cpp:305
305 int st_insert(st_table_t table, st_data_t key, st_data_t value) {
(gdb)
Continuing.
valueは間違いない.直後のfree_v呼び出しでも,
CCallback::free_v (key=88, v=93824993358816, _t=0) at ccallback.cpp:65
65 int CCallback::free_v(st_data_t key, st_data_t v, st_data_t _t) {
(gdb) s
66 delete (CCallback *)v;
(gdb) p v
$29 = 93824993358816
当然元のvalue=93824993358816が渡っている.
一方,M1では,
st_foreach中:
Process 10945 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
frame #0: 0x00000001000d64d8 slim`st_foreach(table=0x0000000101005140, func=(slim`CCallback::free_v(unsigned long, unsigned long, unsigned long) at ccallback.cpp:65), arg=0)(...), unsigned long) at st.cpp:522:45 [opt]
519
520 for (i = 0; i < table->num_bins; i++) {
521 last = 0;
-> 522 for (ptr = table->bins[i];
523 ptr != 0;) { /* エントリーの存在するチェインでのみfuncが呼ばれる */
524 retval = (enum st_retval)(*func)(ptr->key, ptr->record, arg);
525 switch (retval) {
Target 0: (slim) stopped.
(lldb) v
(st_table_t) table = 0x0000000101005140
(int (*)(...)) func = 0x0000000100077188 (slim`CCallback::free_v(unsigned long, unsigned long, unsigned long) at ccallback.cpp:65)
(st_data_t) arg = 0
(int) i = 0
(st_table_entry *) ptr = 0x0000000101404660
(st_table_entry *) last = nullptr
(st_table_entry *) tmp = <variable not available>
(st_retval) retval = <variable not available>
(lldb) v ptr->record
(st_data_t) ptr->record = 4302336288
(lldb) v ptr->key
(st_data_t) ptr->key = 88
(v=4302336288) key=88要素のinsertは,
* thread #1, queue = ‘com.apple.main-thread’, stop reason = breakpoint 2.1
frame #0: 0x00000001000771e8 slim`CCallback::lmn_register_c_fun(name=“cb_make_input_string”, f=0x00000001000d4820, arity=2) at ccallback.cpp:76:18 [opt]
73 /* コールバックを名前nameで登録する。arityはコールバックに引数として
74 渡されるアトムのリンク数 */
75 void CCallback::lmn_register_c_fun(const char *name, void *f, int arity) {
-> 76 CCallback *c = new CCallback;
77 c->f = f;
78 c->arity = arity;
79 st_insert(ccallback_tbl, (st_data_t)lmn_intern(name), (st_data_t)c);
Target 0: (slim) stopped.
(lldb)
Process 10945 resuming
Process 10945 stopped
* thread #1, queue = ‘com.apple.main-thread’, stop reason = breakpoint 3.1
frame #0: 0x00000001000d5d74 slim`st_insert(table=0x0000000101005140, key=88, value=4302336288) at st.cpp:306:13 [opt]
303 /* ハッシュ表に新たなエントリーを追加する.
304 * エントリが存在した場合は, エントリの値のみを更新し,
305 * キーは元々のものを更新しない. エントリが存在しなかった場合に0,
-> 306 * エントリが存在した場合には1以上の整数を返す. */
307 int st_insert(st_table_t table, st_data_t key, st_data_t value) {
308 int ret = st_insert_inner(table, key, value);
309 if (!ret && (table->num_entries / table->num_bins) > ST_DEFAULT_MAX_DENSITY) {
Target 0: (slim) stopped.
ここまでは問題ない.しかし,次のfree_v呼び出し時では,
Process 10945 stopped* thread #1, queue = 'com.apple.main-thread', stop reason = step in
frame #0: 0x00000001000771ac slim`CCallback::free_v(key=<unavailable>, v=4295455112, _t=<unavailable>) at ccallback.cpp:70:3 [opt]
67 //printf("v: %p\n", (CCallback *)v);
68 //(CCallback *)vをprint
69 printf("v(cast): %p\n", (CCallback *)v);
-> 70 delete (CCallback *)v;
71 return ST_CONTINUE;
72 }
73 /* コールバックを名前nameで登録する。arityはコールバックに引数として
Target 0: (slim) stopped.
(lldb) v
(st_data_t) key = <no location, value may have been optimized out>
(st_data_t) v = 4295455112
(st_data_t) _t = <no location, value may have been optimized out>
vが違う値(v=4295455112)になっている.vに関しては見えなくなっている.
<no location, value may have been optimized out>
: https://github.com/bytecodealliance/wasmtime/issues/3884
【clang】 MacBook Pro (M1, 2020, Sonoma 14.0, clang 15.0) でも bus error 現象を確認しました.(bison, ruby, [email protected] は brew で別途導入) また,test/system_check/testsuite/hyperlink の case5 は,bus error の前に segmentation fault します. (参考)Ubuntu22.04 PCで clang++-15 (+ libstdc++-12) でビルドしたら make check 全部通りました.
【gnu gcc】 M1 Mac でも,gcc-13 ならば make check 全部通りました.ただ,できた src/slim バイナリで .il ファイルは実行できるけど,.lmn を与えてコンパイル+実行しようとすると segmentation fault します.要調査. gcc-12 では,make check が,上記の case5 だけ FAIL して,調べてみるとこの例題だけ segmentation fault しています. bus error とは別に,case5.lmn の内部挙動を調査する必要があるかもしれません.
#281 に関しては,(Ubuntu22.04, clang++-15)ではbus errorは出ないということならば,clang(llvm)は無関係で,M1固有の問題の可能性が高いということですね,ありがとうございます.
#294 に関しては,#281 とは独立しているということで,別途調査します.