zig icon indicating copy to clipboard operation
zig copied to clipboard

WASI std lib tests fail when the host is Windows (mostly `fs`-related)

Open squeek502 opened this issue 2 years ago • 2 comments

Zig Version

0.10.0-dev.4281+ebeadf9d9

Steps to Reproduce

These tests pass when the host is Linux, but fail when the host is Windows. 6 tests are failing within std/fs/test.zig and 10 are failing across the entire std lib test suite. Not sure exactly what the other 4 are. I'm testing with wasmtime-v1.0.1-x86_64-windows.

Worth noting especially that Dir.deleteTree and Dir.deleteTreeMinStackSize are consistently failing with error.AccessDenied, so all the fs tests that use testing.TmpDir are failing to clean up their directories too.

It's unclear to me exactly where solutions should happen here: are these Zig bugs, wasmtime bugs, or problems with the WASI spec?

Expected Behavior

All tests pass

Actual Behavior

When running just std/fs/test.zig:

>zig test lib/std/fs/test.zig --zig-lib-dir lib --main-pkg-path lib/std --test-cmd wasmtime --test-cmd --dir=. --test-cmd --allow-unknown-exports --test-cmd-bin -target wasm32-wasi
expected error.IsDir, found error.AccessDenied
expected error.PathAlreadyExists, found error.AccessDenied
expected error.IsDir, found error.AccessDenied
23 passed; 17 skipped; 6 failed.
test failureError: failed to run main module `C:\Users\Ryan\Programming\Zig\zig\zig-cache\o\b745ba0bfad688b8844d343fe774832d\test.wasm`

Caused by:
    0: failed to invoke command default
    1: wasm trap: wasm `unreachable` instruction executed
       wasm backtrace:
           0: 0xc725 - os.abort
                           at lib/lib\std/os.zig:500:9
           1: 0x1314 - builtin.default_panic
                           at lib/lib\std/builtin.zig:784:25
           2: 0x3c6b5 - test_runner.main
                           at lib/lib/test_runner.zig:13:30
           3: 0x3c639 - start.callMain
                           at lib/lib\std/start.zig:568:22
                     - _start
                           at lib/lib\std/start.zig:241:42

error: the following test command failed with exit code 3:
wasmtime --dir=. --allow-unknown-exports C:\Users\Ryan\Programming\Zig\zig\zig-cache\o\b745ba0bfad688b8844d343fe774832d\test.wasm

When running all std lib tests:

>zig test lib/std/std.zig --zig-lib-dir lib --main-pkg-path lib/std --test-cmd wasmtime --test-cmd --dir=. --test-cmd --allow-unknown-exports --test-cmd-bin -target wasm32-wasi
expected error.IsDir, found error.AccessDenied
expected error.IsDir, found error.AccessDenied
expected error.IsDir, found error.AccessDenied
expected error.PathAlreadyExists, found error.AccessDenied
expected error.IsDir, found error.AccessDenied
2062 passed; 142 skipped; 10 failed.
test failureError: failed to run main module `C:\Users\Ryan\Programming\Zig\zig\zig-cache\o\888da39961c5da85532636b351c586fb\test.wasm`

Caused by:
    0: failed to invoke command default
    1: wasm trap: wasm `unreachable` instruction executed
       wasm backtrace:
           0: 0x900f - os.abort
                           at lib/lib\std/os.zig:500:9
           1: 0x7bd7 - builtin.default_panic
                           at lib/lib\std/builtin.zig:784:25
           2: 0x7234 - test_runner.main
                           at lib/lib/test_runner.zig:13:30
           3: 0x71b8 - start.callMain
                           at lib/lib\std/start.zig:568:22
                     - _start
                           at lib/lib\std/start.zig:241:42

error: the following test command failed with exit code 3:
wasmtime --dir=. --allow-unknown-exports C:\Users\Ryan\Programming\Zig\zig\zig-cache\o\888da39961c5da85532636b351c586fb\test.wasm

squeek502 avatar Oct 14 '22 11:10 squeek502

Most of the bugs in the test file are caused by testing for the removal of files that are actually a directory.

when wasmtime is called by path_unlink_file it goes through wasm-common unlink_file -> cap-std -> remove_file_or_symlink -> https://github.com/bytecodealliance/cap-std/blob/2636733c047cda304bc53123dff9ca56a953510e/cap-fs-ext/src/dir_ext.rs

calls remove_file from cap-std https://docs.rs/cap-std/latest/src/cap_std/fs/dir.rs.html#346-348 , which then calls cap-primitives's remove_file , which calls remove_file_unchecked

[Note: I might be slightly off with the trace, but I believe the final function call is correct]

this ends up calling rust's fs::remove_file which calls Kernel32!DeleteFile according to rust's std.

Now, because DeleteFile returns error 5 (Access/Permission Denied) and there is no IsDirectory in Kernel32 (although there is an NTSTATUS error for NT functions), the error gets picked up by wasmtime as that.

One way to fix this is probably by calling NtDeleteFile on wasmtime side instead of DeleteFile. Another option is to use other API functions (such as GetFileAttributes or fd_fdstat_get in WASI) to check the attribute of the file and return the appropriate error (See last 2 paragraph).

Since this is caused by an "intended" behavior from the Windows API, I am not sure whether this should be handled on Zig's side or left for the WASM runtime/engine.

Similarly, the open and create functions also acts the same way because they call CreateFile. The only exception for windows is rename, because rename in wasmtime uses rust's fs function, which uses MoveFileEx with the replace flag, which is the same as Zig's behavior for files. MoveFileEx, however, will also fail the same way for directories (https://github.com/bytecodealliance/wasmtime/blob/1321c234e53b0f56ec9ab29d89bdde16af733130/crates/test-programs/wasi-tests/src/bin/path_rename.rs#L63) (* also see: https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-movefileexa)

Another thing I found during debugging this issue is that there are some tests that are marked to not execute on windows. Since wasmtime will eventually call windows functions, they will eventually fail for the same reason.

I added a fix on my fork here and will comment for clarity, but I am not sure if I should submit a PR since the fix includes adding fd_fdstat_get check on every call. This will cause POSIX to do unnecessary checks.

wasmtime is treating functions called in windows environment as windows functions rather than a POSIX functions. Of course, this depends on how the snapshot specs should be interpreted. The specs specify that these functions should be "similar" to those found in POSIX's but does not specify any further(?). Therefore, should they also inherit the same errors as POSIX for consistency or should they inherit errors from their runtime environment?

(side note for people who wants to debug and trace unreported errors: I added a skip on every test and when error went down by one, it meant that I was in the test that caused it, then added print statement in random places to see where it's exactly located. Not sure if there is a better way since try calls would fail with no proper trace)

xEgoist avatar Oct 31 '22 21:10 xEgoist

Thanks for the detailed writeup! It seems like wasmtime and Zig are running into similar problems with regard to a single cross-platform API for filesystem stuff.

This is an unresolved problem with the Zig standard library (see https://github.com/ziglang/zig/issues/6364 for the rename issue you're talking about, and https://github.com/ziglang/zig/issues/6600 for a broader perspective), and it sounds like this might be the case with WASI/wasmtime as well.

The extra stat calls on Zig's side seems unfortunate, since, as I understand it, ironing out these sorts of cross-platform differences would be something that a WASI runtime would have an interest in doing (basically for the same reasons as Zig's std library). If so, then in theory both wasmtime and Zig may eventually converge in terms of their cross-platform API (that is, the stuff wasmtime would need to do to have consistent cross-platform behavior would be similar/the same to what Zig would have to do).

Still unsure how to proceed, since I'm unclear on exactly what wasmtime (or the WASI spec) has to say about these sorts of differences or if there's a plan to address them in the future.

squeek502 avatar Oct 31 '22 22:10 squeek502