go-fuse icon indicating copy to clipboard operation
go-fuse copied to clipboard

READDIRPLUS sometimes doesn't return the right thing

Open ColinChartier opened this issue 3 years ago • 5 comments

If you edit the ReadDir test to look as following:

// ReadDir creates 110 files one by one, checking that we get the expected
// entries after each file creation.
func ReadDir(t *testing.T, mnt string) {
	want := map[string]bool{}
	// 40 bytes of filename, so 110 entries overflows a
	// 4096 page.
	for i := 0; i < 400; i++ {
		nm := fmt.Sprintf("file-rddir-%08x%016x", rand.Uint64()&0xFFFFFFFFFFFFFFFF, i)
		want[nm] = true
		if err := ioutil.WriteFile(filepath.Join(mnt, nm), []byte("hello"), 0644); err != nil {
			t.Fatalf("WriteFile %q: %v", nm, err)
		}
		// Verify that we get the expected entries
		f, err := os.Open(mnt)
		if err != nil {
			t.Fatalf("Open: %v", err)
		}
		names, err := f.Readdirnames(-1)
		if err != nil {
			t.Fatalf("ReadDir: %v", err)
		}
		f.Close()
		got := map[string]bool{}
		for _, e := range names {
			if _, ok := got[e]; ok {
				t.Errorf("Got duplicate name: %v", e)
			}
			got[e] = true
		}
		if len(got) != len(want) {
			t.Errorf("[%d/110] got %d entries, want %d, read %d", i+1, len(got), len(want), len(names))
			var wantBuilder strings.Builder
			for dir := range want {
				if _, ok := got[dir]; !ok {
					wantBuilder.WriteString(dir)
					wantBuilder.WriteString(", ")
				}
			}
			t.Error("Missing: ", wantBuilder.String())
			return
		}
		for k := range got {
			if !want[k] {
				t.Errorf("got unknown name %q", k)
			}
		}
	}
}

(e.g., warn of duplicates and do 400 iterations)

It will show missing entries and an equal number of duplicates. I assume this is a race condition, but is reproducible on my laptop.

This is actually happening with loopbackfs under heavy read load (e.g., an esbuild)

Excerpt of log:

00:47:46.519947 tx 4614:     OK,  1600b  ".rg`rg`rg`aaa+file-rddir-76a780ea967cd71000000000000000485/rg`rg`rg`glglgl+file-rddir-b46e5f29ae10a901000000000000006f3/rg`rg`rg`+file-rddir-53b7a0ff70658b9400000000000000a1.rg`rg`rg`|!|!|!+file-rddir-5ef4e81ede4561ad000000000000003dm-/rg`rg`rg`0mm+file"...
00:47:46.519967 rx 4616: READDIRPLUS n1 {Fh 1 [169 +4096)  L 0 RDONLY,0x8000} 
00:47:46.519970 tx 4616:     OK
    test.go:490: Got duplicate name: file-rddir-803e6306563b26de0000000000000071
    test.go:490: Got duplicate name: file-rddir-4fd269fd8e5f04750000000000000067
    test.go:490: Got duplicate name: file-rddir-89c09a3e6f241c5700000000000000a6
    test.go:490: Got duplicate name: file-rddir-1facfc5dc1540864000000000000007e
    test.go:490: Got duplicate name: file-rddir-2e3108dabb1586440000000000000015
    test.go:490: Got duplicate name: file-rddir-cbdb185b70ab53ba0000000000000080
    test.go:490: Got duplicate name: file-rddir-1352ca320796a7100000000000000089
    test.go:490: Got duplicate name: file-rddir-fd4d8e9fa5ead0bd0000000000000044
    test.go:490: Got duplicate name: file-rddir-39f6f78a15d523b000000000000001f
    test.go:490: Got duplicate name: file-rddir-3fba24704399a363000000000000006d
    test.go:490: Got duplicate name: file-rddir-2d89c820f5c353cf000000000000008a
    test.go:490: Got duplicate name: file-rddir-c93a23eb55ece0ea0000000000000094
    test.go:490: Got duplicate name: file-rddir-3fcb626c3f1d7427000000000000008e
    test.go:490: Got duplicate name: file-rddir-835ca80d72521a90000000000000007a
    test.go:490: Got duplicate name: file-rddir-f062b076752f33ff0000000000000077
    test.go:490: Got duplicate name: file-rddir-1e925507c051e18a000000000000005f
    test.go:490: Got duplicate name: file-rddir-41f27cc6f3875d04000000000000000a
    test.go:490: Got duplicate name: file-rddir-76a780ea967cd7100000000000000048
    test.go:490: Got duplicate name: file-rddir-b46e5f29ae10a901000000000000006f
    test.go:490: Got duplicate name: file-rddir-46baf44754e0c0c10000000000000097
    test.go:490: Got duplicate name: file-rddir-e83e14a538d3b4940000000000000081
    test.go:490: Got duplicate name: file-rddir-6688f8bd3e99b0a8000000000000003e
    test.go:490: Got duplicate name: file-rddir-4542c29291018d7c0000000000000029
    test.go:495: [167/110] got 143 entries, want 167, read 166
    test.go:503: Missing:  file-rddir-6e661e92759805f50000000000000017, file-rddir-5d78399cbed80a3a000000000000003f, file-rddir-af2560383d179090000000000000058, file-rddir-c66b36dbc9b344e90000000000000066, file-rddir-f2d0a1e9406aec9d0000000000000042, file-rddir-e85f325712f4128d0000000000000076, file-rddir-76fd839a1e094f9a0000000000000093, file-rddir-58cfb024878d40630000000000000082, file-rddir-528047936c9c64b70000000000000057, file-rddir-293a0c2bf9fc65680000000000000055, file-rddir-f2bb5389421657ff0000000000000054, file-rddir-56ec3f25256321860000000000000012, file-rddir-b7fbfbcafd915bb000000000000008f, file-rddir-1852fe624c330f1d000000000000009f, file-rddir-ac89c16725fd3d7f000000000000004a, file-rddir-b629d9f17d9e8fbc000000000000002c, file-rddir-9e503382be14186f000000000000005c, file-rddir-7539a2019f06397d0000000000000073, file-rddir-90b166a221898f34000000000000009e, file-rddir-5d55cb471941e52a0000000000000068, file-rddir-3485757ea7ed5d970000000000000063, file-rddir-207403def63a5b6f000000000000002e, file-rddir-90b92d0169a391440000000000000052, file-rddir-d0031d27b8b352f70000000000000065, 

ColinChartier avatar Apr 03 '21 05:04 ColinChartier

Hi, cannot repro with https://github.com/rfjakob/go-fuse/commit/4a1ef4030633bfc2c6bd3a94dbd655d40f7d8cb7

0 jakob@brikett:~/go/src/github.com/hanwen/go-fuse/fs$ go test -v -run "TestPosix/ReadDir$"
=== RUN   TestPosix
=== RUN   TestPosix/ReadDir
--- PASS: TestPosix (0.62s)
    --- PASS: TestPosix/ReadDir (0.62s)
PASS
ok  	github.com/hanwen/go-fuse/v2/fs	0.621s

Maybe kernel and/or Go version dependent?

$ go version
go version go1.16.2 linux/amd64
$ uname -a
Linux brikett 5.11.10-200.fc33.x86_64 #1 SMP Thu Mar 25 16:51:31 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

rfjakob avatar Apr 03 '21 08:04 rfjakob

I was running it with: go test -count=1 ./... -test.v -test.run TestPosix |& tee /tmp/testoutput

Also on go version go1.16.2 linux/amd64 5.8.0-48-generic #54-Ubuntu SMP Fri Mar 19 14:25:20 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

ColinChartier avatar Apr 03 '21 15:04 ColinChartier

Also for more data - it still happens with a simplified DirStream method that just reads the whole directory contents into memory using stdlib, so I think it's somewhere else (bridge.go?)

ColinChartier avatar Apr 03 '21 15:04 ColinChartier

I was experiencing the same error and debugged it in separate test before finding this issue. I've pushed a stand-alone test here which fails reliable on my laptop (linux 5.11.0-18-generic, go 1.16.5) and a ci server (linux 4.19.181-1, go 1.16.5).

The problem I identified lies in bridge.go as @ColinChartier suggested. If there is no NodeReaddirer implementation the Children() map is iterated and added as a slice in ListDirStream. This is a problem as map iteration is not stable and a succeeding call to ReadDirPlus with an increased offset might end up with a different ListDirStream.

I've verified this finding by sorting the slice before returning it in a ListDirStream - this fixed the issue at least in my unit test.

schwittmann avatar Jun 09 '21 10:06 schwittmann

thanks for the report. That sounds simple to fix, but now I regret the default implementation here, as the sort can be expensive.

hanwen avatar Jun 09 '21 13:06 hanwen

sorry for letting this languish for a long time. This is actually a serious issue. I have a fix out in https://review.gerrithub.io/c/hanwen/go-fuse/+/552116

hanwen avatar Apr 03 '23 12:04 hanwen

I dont' understand the first comment though: the loopback filesystem doesn't use the default readdir implementation, so the random map iteration order can't be the cause of the problem.

hanwen avatar Apr 03 '23 12:04 hanwen

newunionfs still fails TestPosix/ReadDir: https://github.com/hanwen/go-fuse/issues/471.

navytux avatar Jun 08 '23 16:06 navytux