wasi-libc icon indicating copy to clipboard operation
wasi-libc copied to clipboard

Data race or race condition during exiting/joining a thread

Open alexcrichton opened this issue 2 years ago • 7 comments

This is a program which spawns N threads. Each thread will then, M times, spawn a thread and join the thread. This is intended to exercise concurrently spawning threads.

#include <pthread.h>
#include <stdio.h>
#include <assert.h>

#define N 10
#define M 10000

static void *another_child(void *arg) {
  return arg;
}

static void* child(void *arg) {
  pthread_t child;
  for (int i = 0; i < M; i++) {
    int rc = pthread_create(&child, NULL, another_child, NULL);
    if (rc != 0)
      perror("failed to create thread");
    rc = pthread_join(child, NULL);
    if (rc != 0)
      perror("failed to join thread");
  }
  return arg;
}

int main() {
  pthread_t children[N];

  for (int i = 0; i < N; i++) {
    int rc = pthread_create(&children[i], NULL, child, NULL);
    if (rc != 0)
      perror("failed to spawn thread");
  }

  for (int i = 0; i < N; i++) {
    int rc = pthread_join(children[i], NULL);
    if (rc != 0)
      perror("failed to join thread");
  }
  printf("ok!\n");

}

This is then run with:

$ clang foo.c -o foo.wasm -O2 --target=wasm32-wasi-threads -pthread -Wl,--import-memory,--export-memory,--stack-first,-zstack-size=$((10 << 20)),--max-memory=$((3 << 30))
$ wasmtime run --wasi-modules experimental-wasi-threads --disable-cache --wasm-features threads -- ./foo.wasm

This will produce no output and exit with status zero, but not expectedly. The ok! is not printed at the end of the main function. The reason for this appears to be that this exit is reached because a thread confuses itself as the last remaining thread.

I did a bit of poking to see if this could be fixed. The __tl_lock() function, for example, is aliased to a dummy_0 in many files. I wasn't able to fix it, however.

While looking around in this file, however, I also noticed that free is used to deallocate a thread stack which I don't believe is correct because the thread stack is in use by the thread calling free, which means that after the memory has been deallocated some stack management may still be happening which may modify the free'd memory. I haven't run into this myself, though, and commenting out the free did not fix my issue above.

cc @abrown

alexcrichton avatar Apr 05 '23 15:04 alexcrichton

Ah and hte wasm that is produced looks like this: foo.wasm.gz

alexcrichton avatar Apr 05 '23 15:04 alexcrichton

Interesting point regarding the call to free on thread exit. While the does look like real issue it doesn't look like it related to this example since it only applies to detached thread I think (and non of the threads in this example are detached). Perhaps we should file a different issue for that. Fixing that correctly seems tricky, but I can think of a couple of ways we might be able to make it work.

sbc100 avatar Apr 07 '23 16:04 sbc100

https://github.com/WebAssembly/wasi-libc/pull/409 (only build-tested) can fix this.

yamt avatar Apr 14 '23 12:04 yamt

#409 (only build-tested) can fix this.

i could reproduce the symptom with toywasm and wamr. the patch fixed it for them.

yamt avatar Apr 14 '23 13:04 yamt

I don't believe that this is fixed at this time and I think is coming up again in the discussion of https://github.com/bytecodealliance/wasmtime/issues/8652:

$ $WASI_SDK_PATH/bin/wasm32-wasi-threads-clang foo.c -o foo.wasm -pthread -Wl,--max-memory=$((1<<32))
$ wasmtime run -Sthreads foo.wasm
Error: error while executing at wasm backtrace:
    0: 0x747b - foo.wasm!wasi_thread_start

Caused by:
    0: memory fault at wasm address 0x3162c in linear memory of size 0x20000
    1: wasm trap: out of bounds memory access
Error: error while executing at wasm backtrace:

(with a different error each time)

alexcrichton avatar May 20 '24 18:05 alexcrichton

When I tried adding thread support to rustc in the browser I got weird crashes. I ended up giving up on it as a fair amount of browser devtool bugs made it impossible to debug it, but now that I see this bug, it may have been the case that I had been hitting this bug.

bjorn3 avatar May 20 '24 19:05 bjorn3

I don't believe that this is fixed at this time and I think is coming up again in the discussion of bytecodealliance/wasmtime#8652:

$ $WASI_SDK_PATH/bin/wasm32-wasi-threads-clang foo.c -o foo.wasm -pthread -Wl,--max-memory=$((1<<32))
$ wasmtime run -Sthreads foo.wasm
Error: error while executing at wasm backtrace:
    0: 0x747b - foo.wasm!wasi_thread_start

Caused by:
    0: memory fault at wasm address 0x3162c in linear memory of size 0x20000
    1: wasm trap: out of bounds memory access
Error: error while executing at wasm backtrace:

(with a different error each time)

you still need -Wl,--import-memory -Wl,--export-memory.

yamt avatar May 21 '24 03:05 yamt