go-fuse icon indicating copy to clipboard operation
go-fuse copied to clipboard

unionfs: Tests are flaky

Open navytux opened this issue 5 years ago • 14 comments

Hello up there. I've tried to run go-fuse tests and from time to time there is unionfs failure. It happens with both Go1.10 and Go1.11. My kernel is Linux deco 4.18.0-1-amd64 #1 SMP Debian 4.18.6-1 (2018-09-06) x86_64 GNU/Linux.

Thanks beforehand, Kirill

(neo) (z-dev) (g.env) kirr@deco:~/src/neo/src/github.com/hanwen/go-fuse$ go test -count 1 ./...
ok      github.com/hanwen/go-fuse/benchmark     0.070s
?       github.com/hanwen/go-fuse/benchmark/bulkstat    [no test files]
?       github.com/hanwen/go-fuse/example/autounionfs   [no test files]
?       github.com/hanwen/go-fuse/example/benchmark-read-throughput     [no test files]
?       github.com/hanwen/go-fuse/example/hello [no test files]
?       github.com/hanwen/go-fuse/example/loopback      [no test files]
?       github.com/hanwen/go-fuse/example/memfs [no test files]
?       github.com/hanwen/go-fuse/example/multizip      [no test files]
?       github.com/hanwen/go-fuse/example/statfs        [no test files]
?       github.com/hanwen/go-fuse/example/unionfs       [no test files]
?       github.com/hanwen/go-fuse/example/zipfs [no test files]
ok      github.com/hanwen/go-fuse/fuse  0.019s
ok      github.com/hanwen/go-fuse/fuse/nodefs   0.186s
ok      github.com/hanwen/go-fuse/fuse/pathfs   0.197s
ok      github.com/hanwen/go-fuse/fuse/test     2.403s
?       github.com/hanwen/go-fuse/internal/testutil     [no test files]
?       github.com/hanwen/go-fuse/internal/utimens      [no test files]
ok      github.com/hanwen/go-fuse/splice        0.021s
2018/10/01 11:34:01 Adding workspace manual1 for roots UnionFs([LoopbackFs(/tmp/TestAutoFsSymlink031474821/store/backing1) cachingFileSystem(LoopbackFs(/tmp/TestAutoFsSymlink031474821/store/backing1/READONLY))])
2018/10/01 11:34:01 Adding workspace backing1 for roots UnionFs([LoopbackFs(/tmp/TestAutoFsSymlink031474821/store/backing1) cachingFileSystem(LoopbackFs(/tmp/TestAutoFsSymlink031474821/store/backing1/READONLY))])
2018/10/01 11:34:01 Adding workspace backing1 for roots UnionFs([LoopbackFs(/tmp/TestDetectSymlinkedDirectories629252640/store/backing1) cachingFileSystem(LoopbackFs(/tmp/TestDetectSymlinkedDirectories629252640/store/backing1/READONLY))])
2018/10/01 11:34:01 Adding workspace backing1 for roots UnionFs([LoopbackFs(/tmp/TestExplicitScan219492607/store/backing1) cachingFileSystem(LoopbackFs(/tmp/TestExplicitScan219492607/store/backing1/READONLY))])
2018/10/01 11:34:01 Adding workspace bar for roots UnionFs([LoopbackFs(/tmp/TestCreationChecks552536658/store/foo) cachingFileSystem(LoopbackFs(/tmp/TestCreationChecks552536658/store/foo/READONLY))])
2018/10/01 11:34:01 Already have a union FS for directory /tmp/TestCreationChecks552536658/store/foo in workspace bar
--- FAIL: TestUnionFsOpenUndeletes (0.02s)
    unionfs_test.go:439: Lstat failed: lstat /tmp/TestUnionFsOpenUndeletes596243470/mnt/file: no such file or directory
2018/10/01 11:34:03 Forced cache drop on UnionFs([LoopbackFs(/tmp/TestUnionFsDropDeletionCache090349530/rw) cachingFileSystem(LoopbackFs(/tmp/TestUnionFsDropDeletionCache090349530/ro))])
2018/10/01 11:34:03 Dropping cache for cachingFileSystem(LoopbackFs(/tmp/TestUnionFsDropDeletionCache090349530/ro))
2018/10/01 11:34:03 Forced cache drop on UnionFs([LoopbackFs(/tmp/TestUnionFsDropCache993992561/rw) cachingFileSystem(LoopbackFs(/tmp/TestUnionFsDropCache993992561/ro))])
2018/10/01 11:34:03 Dropping cache for cachingFileSystem(LoopbackFs(/tmp/TestUnionFsDropCache993992561/ro))
2018/10/01 11:34:03 error accessing deletion marker DELETIONS/d41d8cd98f00b204-file2: 20=not a directory
2018/10/01 11:34:04 Found linked inode, but Nlink == 1, ino=265533, fullPath="file"
FAIL
FAIL    github.com/hanwen/go-fuse/unionfs       2.757s
ok      github.com/hanwen/go-fuse/zipfs 0.123s

navytux avatar Oct 01 '18 08:10 navytux

As of today's master (38601914) unionfs tests became less flaky - probably thanks to 4e73fb9a. However longer go test run in unionfs still shows problems:

(neo) (z-dev) (g.env) kirr@deco:~/src/neo/src/github.com/hanwen/go-fuse/unionfs$ while true; do go test -count 1 .; done
ok      github.com/hanwen/go-fuse/unionfs       2.990s
ok      github.com/hanwen/go-fuse/unionfs       2.860s
ok      github.com/hanwen/go-fuse/unionfs       3.024s
ok      github.com/hanwen/go-fuse/unionfs       3.032s
ok      github.com/hanwen/go-fuse/unionfs       3.008s
ok      github.com/hanwen/go-fuse/unionfs       2.909s
ok      github.com/hanwen/go-fuse/unionfs       2.817s
ok      github.com/hanwen/go-fuse/unionfs       2.885s
ok      github.com/hanwen/go-fuse/unionfs       2.941s
ok      github.com/hanwen/go-fuse/unionfs       2.893s
ok      github.com/hanwen/go-fuse/unionfs       2.974s
ok      github.com/hanwen/go-fuse/unionfs       2.986s
ok      github.com/hanwen/go-fuse/unionfs       2.847s
ok      github.com/hanwen/go-fuse/unionfs       3.029s
ok      github.com/hanwen/go-fuse/unionfs       2.918s
ok      github.com/hanwen/go-fuse/unionfs       2.790s
ok      github.com/hanwen/go-fuse/unionfs       2.917s
ok      github.com/hanwen/go-fuse/unionfs       2.896s
ok      github.com/hanwen/go-fuse/unionfs       2.899s
ok      github.com/hanwen/go-fuse/unionfs       2.955s
ok      github.com/hanwen/go-fuse/unionfs       2.780s
ok      github.com/hanwen/go-fuse/unionfs       2.978s
ok      github.com/hanwen/go-fuse/unionfs       2.861s
ok      github.com/hanwen/go-fuse/unionfs       2.931s
ok      github.com/hanwen/go-fuse/unionfs       2.808s
ok      github.com/hanwen/go-fuse/unionfs       2.928s
ok      github.com/hanwen/go-fuse/unionfs       2.984s
ok      github.com/hanwen/go-fuse/unionfs       3.000s
ok      github.com/hanwen/go-fuse/unionfs       2.809s
ok      github.com/hanwen/go-fuse/unionfs       2.776s
ok      github.com/hanwen/go-fuse/unionfs       3.043s
12:04:54.786983 Adding workspace manual1 for roots UnionFs([LoopbackFs(/tmp/TestAutoFsSymlink973449315/store/backing1) cachingFileSystem(LoopbackFs(/tmp/TestAutoFsSymlink973449315/store/backing1/READONLY))])
12:04:54.789833 Adding workspace backing1 for roots UnionFs([LoopbackFs(/tmp/TestAutoFsSymlink973449315/store/backing1) cachingFileSystem(LoopbackFs(/tmp/TestAutoFsSymlink973449315/store/backing1/READONLY))])
12:04:54.824455 Adding workspace backing1 for roots UnionFs([LoopbackFs(/tmp/TestDetectSymlinkedDirectories156964966/store/backing1) cachingFileSystem(LoopbackFs(/tmp/TestDetectSymlinkedDirectories156964966/store/backing1/READONLY))])
12:04:54.854362 Adding workspace backing1 for roots UnionFs([LoopbackFs(/tmp/TestExplicitScan881919629/store/backing1) cachingFileSystem(LoopbackFs(/tmp/TestExplicitScan881919629/store/backing1/READONLY))])
12:04:54.871584 Adding workspace bar for roots UnionFs([LoopbackFs(/tmp/TestCreationChecks786299528/store/foo) cachingFileSystem(LoopbackFs(/tmp/TestCreationChecks786299528/store/foo/READONLY))])
12:04:54.871647 Already have a union FS for directory /tmp/TestCreationChecks786299528/store/foo in workspace bar
--- FAIL: TestUnionFsRename (0.44s)
    unionfs_test.go:525: Config 0 {true false false false}
    unionfs_test.go:525: Config 1 {false true false false}
    unionfs_test.go:525: Config 2 {true true false false}
    unionfs_test.go:525: Config 3 {true false true false}
    unionfs_test.go:525: Config 4 {false true true false}
    unionfs_test.go:525: Config 5 {true true true false}
    unionfs_test.go:525: Config 6 {true false false true}
    unionfs_test.go:525: Config 7 {false true false true}
    unionfs_test.go:525: Config 8 {true true false true}
    unionfs_test.go:547: Should have lost file1
    unionfs_test.go:525: Config 9 {true false true true}
    unionfs_test.go:525: Config 10 {false true true true}
    unionfs_test.go:525: Config 11 {true true true true}
12:04:56.747300 Forced cache drop on UnionFs([LoopbackFs(/tmp/TestUnionFsDropDeletionCache590801872/rw) cachingFileSystem(LoopbackFs(/tmp/TestUnionFsDropDeletionCache590801872/ro))])
12:04:56.747379 Dropping cache for cachingFileSystem(LoopbackFs(/tmp/TestUnionFsDropDeletionCache590801872/ro))
12:04:56.791006 Forced cache drop on UnionFs([LoopbackFs(/tmp/TestUnionFsDropCache392033007/rw) cachingFileSystem(LoopbackFs(/tmp/TestUnionFsDropCache392033007/ro))])
12:04:56.791029 Dropping cache for cachingFileSystem(LoopbackFs(/tmp/TestUnionFsDropCache392033007/ro))
12:04:56.975118 error accessing deletion marker DELETIONS/d41d8cd98f00b204-file2: 20=not a directory
12:04:57.386796 Found linked inode, but Nlink == 1, ino=416508, fullPath="file"
FAIL
FAIL    github.com/hanwen/go-fuse/unionfs       2.817s
ok      github.com/hanwen/go-fuse/unionfs       3.052s
ok      github.com/hanwen/go-fuse/unionfs       2.748s
ok      github.com/hanwen/go-fuse/unionfs       2.996s
ok      github.com/hanwen/go-fuse/unionfs       2.894s
ok      github.com/hanwen/go-fuse/unionfs       2.857s
ok      github.com/hanwen/go-fuse/unionfs       2.807s
ok      github.com/hanwen/go-fuse/unionfs       2.739s
ok      github.com/hanwen/go-fuse/unionfs       2.892s
^C^Z

navytux avatar Oct 10 '18 09:10 navytux

can you check with current master? I fixed a source of flakiness.

hanwen avatar Jan 30 '19 19:01 hanwen

the fix is https://github.com/hanwen/go-fuse/commit/e6fa1a12fd97f29a6833d6c69d7b94438e5e3fed

hanwen avatar Jan 30 '19 19:01 hanwen

I can no longer reproduce this-issue original failure with current master (32b02fb6).

However while running go-fuse$ while true; do go test -count 1 ./...; done I saw

--- FAIL: TestStatFs (0.02s)
    loopback_test.go:837: statfs mismatch syscall.Statfs_t{Type:0, Bsize:4096, Blocks:0xeef91, Bfree:0xe5bf3, Bavail:0xe5bf3, Files:0xeef91, Ffree:0xeed58, Fsid:syscall.Fsid{X__val:[2]int32{0, 0}}, Namelen:255, Frsize:4096, Flags:0, Spare:[4]int64{0, 0, 0, 0}} != syscall.Statfs_t{Type:0, Bsize:4096, Blocks:0xeef91, Bfree:0xe5bf3, Bavail:0xe5bf3, Files:0xeef91, Ffree:0xeed55, Fsid:syscall.Fsid{X__val:[2]int32{0, 0}}, Namelen:255, Frsize:4096, Flags:0, Spare:[4]int64{0, 0, 0, 0}}
FAIL
FAIL    github.com/hanwen/go-fuse/fuse/test     2.911s

a couple of times, and

--- FAIL: TestFopenKeepCache (0.03s)
    cache_test.go:108: ReadFile: open /tmp/TestFopenKeepCache129364302/mnt/file.txt: no such file or directory
11:09:39.709523 Unimplemented opcode INTERRUPT
11:09:39.709594 writer: Write/Writev failed, err: 2=no such file or directory. opcode: INTERRUPT
FAIL
FAIL    github.com/hanwen/go-fuse/fuse/test     2.984s

once, without any unionfs failure. go test -race also seems to be happy though it stuck once somewhere...

I think we can close this issue regarding unionfs flakiness and open more specific ones for found specific subjects.

Thanks for fixing unionfs flakines!

navytux avatar Jan 31 '19 08:01 navytux

TestFopenKeepCache: https://github.com/hanwen/go-fuse/issues/168.

navytux avatar Jan 31 '19 08:01 navytux

Oops, while rerunning go test -count 1 in a loop I also got once:

11:23:18.330232 Adding workspace manual1 for roots UnionFs([LoopbackFs(/tmp/TestAutoFsSymlink793491542/store/backing1) cachingFileSystem(LoopbackFs(/tmp/TestAutoFsSymlink793491542/store/backing1/READONLY))])
11:23:18.337294 Adding workspace backing1 for roots UnionFs([LoopbackFs(/tmp/TestAutoFsSymlink793491542/store/backing1) cachingFileSystem(LoopbackFs(/tmp/TestAutoFsSymlink793491542/store/backing1/READONLY))])
11:23:18.369028 Adding workspace backing1 for roots UnionFs([LoopbackFs(/tmp/TestDetectSymlinkedDirectories848987069/store/backing1) cachingFileSystem(LoopbackFs(/tmp/TestDetectSymlinkedDirectories848987069/store/backing1/READONLY))])
11:23:18.393123 Adding workspace backing1 for roots UnionFs([LoopbackFs(/tmp/TestExplicitScan384464376/store/backing1) cachingFileSystem(LoopbackFs(/tmp/TestExplicitScan384464376/store/backing1/READONLY))])
11:23:18.416869 Adding workspace bar for roots UnionFs([LoopbackFs(/tmp/TestCreationChecks452261623/store/foo) cachingFileSystem(LoopbackFs(/tmp/TestCreationChecks452261623/store/foo/READONLY))])
11:23:18.416933 Already have a union FS for directory /tmp/TestCreationChecks452261623/store/foo in workspace bar
--- FAIL: TestUnionFsDelete (0.02s)
    unionfs_test.go:316: should have disappeared.
11:23:19.964491 Forced cache drop on UnionFs([LoopbackFs(/tmp/TestUnionFsDropDeletionCache664380319/rw) cachingFileSystem(LoopbackFs(/tmp/TestUnionFsDropDeletionCache664380319/ro))])
11:23:19.964521 Dropping cache for cachingFileSystem(LoopbackFs(/tmp/TestUnionFsDropDeletionCache664380319/ro))
11:23:19.984223 Forced cache drop on UnionFs([LoopbackFs(/tmp/TestUnionFsDropCache895079026/rw) cachingFileSystem(LoopbackFs(/tmp/TestUnionFsDropCache895079026/ro))])
11:23:19.984244 Dropping cache for cachingFileSystem(LoopbackFs(/tmp/TestUnionFsDropCache895079026/ro))
11:23:20.168447 error accessing deletion marker DELETIONS/d41d8cd98f00b204-file2: 20=not a directory
11:23:20.573959 Found linked inode, but Nlink == 1, ino=484385, fullPath="file"
FAIL
FAIL    github.com/hanwen/go-fuse/unionfs       2.413s

navytux avatar Jan 31 '19 08:01 navytux

Trying to reproduce TestStatFs failure I got go test (without -race) stuck somewhere:

.../src/github.com/hanwen/go-fuse$ while true; do go test -count 1 ./...; done
ok      github.com/hanwen/go-fuse/benchmark     0.104s
?       github.com/hanwen/go-fuse/benchmark/bulkstat    [no test files]
?       github.com/hanwen/go-fuse/example/autounionfs   [no test files]
?       github.com/hanwen/go-fuse/example/benchmark-read-throughput     [no test files]
...

ok      github.com/hanwen/go-fuse/fuse  0.026s
ok      github.com/hanwen/go-fuse/fuse/nodefs   0.491s
ok      github.com/hanwen/go-fuse/fuse/pathfs   0.484s



# repeating again

ok      github.com/hanwen/go-fuse/fuse/test     8.806s
?       github.com/hanwen/go-fuse/internal/testutil     [no test files]
?       github.com/hanwen/go-fuse/internal/utimens      [no test files]
ok      github.com/hanwen/go-fuse/splice        0.015s
ok      github.com/hanwen/go-fuse/unionfs       8.772s
ok      github.com/hanwen/go-fuse/zipfs 0.399s
ok      github.com/hanwen/go-fuse/benchmark     0.063s
?       github.com/hanwen/go-fuse/benchmark/bulkstat    [no test files]
?       github.com/hanwen/go-fuse/example/autounionfs   [no test files]
?       github.com/hanwen/go-fuse/example/benchmark-read-throughput     [no test files]
?       github.com/hanwen/go-fuse/example/hello [no test files]
?       github.com/hanwen/go-fuse/example/loopback      [no test files]
?       github.com/hanwen/go-fuse/example/memfs [no test files]
?       github.com/hanwen/go-fuse/example/multizip      [no test files]
?       github.com/hanwen/go-fuse/example/statfs        [no test files]
?       github.com/hanwen/go-fuse/example/unionfs       [no test files]
?       github.com/hanwen/go-fuse/example/zipfs [no test files]
ok      github.com/hanwen/go-fuse/fuse  0.037s
ok      github.com/hanwen/go-fuse/fuse/nodefs   0.244s
ok      github.com/hanwen/go-fuse/fuse/pathfs   0.237s

# no output after this

navytux avatar Jan 31 '19 08:01 navytux

(It cannot be killed with Ctrl+C and I have to do something like echo 1 > /sys/fs/fuse/connections/57/abort to interrupt it)

navytux avatar Jan 31 '19 08:01 navytux

Another unionfs failure:

src/github.com/hanwen/go-fuse$ while true; do echo -e '\n# repeat'; go test -count 1 ./...; done
...

# repeat
ok      github.com/hanwen/go-fuse/benchmark     0.065s
?       github.com/hanwen/go-fuse/benchmark/bulkstat    [no test files]
?       github.com/hanwen/go-fuse/example/autounionfs   [no test files]
?       github.com/hanwen/go-fuse/example/benchmark-read-throughput     [no test files]
?       github.com/hanwen/go-fuse/example/hello [no test files]
?       github.com/hanwen/go-fuse/example/loopback      [no test files]
?       github.com/hanwen/go-fuse/example/memfs [no test files]
?       github.com/hanwen/go-fuse/example/multizip      [no test files]
?       github.com/hanwen/go-fuse/example/statfs        [no test files]
?       github.com/hanwen/go-fuse/example/unionfs       [no test files]
?       github.com/hanwen/go-fuse/example/zipfs [no test files]
ok      github.com/hanwen/go-fuse/fuse  0.002s
ok      github.com/hanwen/go-fuse/fuse/nodefs   0.218s
ok      github.com/hanwen/go-fuse/fuse/pathfs   0.252s
ok      github.com/hanwen/go-fuse/fuse/test     2.974s
?       github.com/hanwen/go-fuse/internal/testutil     [no test files]
?       github.com/hanwen/go-fuse/internal/utimens      [no test files]
ok      github.com/hanwen/go-fuse/splice        0.012s
11:34:19.631397 Adding workspace manual1 for roots UnionFs([LoopbackFs(/tmp/TestAutoFsSymlink815492857/store/backing1) cachingFileSystem(LoopbackFs(/tmp/TestAutoFsSymlink815492857/store/backing1/READONLY))])
11:34:19.634485 Adding workspace backing1 for roots UnionFs([LoopbackFs(/tmp/TestAutoFsSymlink815492857/store/backing1) cachingFileSystem(LoopbackFs(/tmp/TestAutoFsSymlink815492857/store/backing1/READONLY))])
11:34:19.661441 Adding workspace backing1 for roots UnionFs([LoopbackFs(/tmp/TestDetectSymlinkedDirectories401704452/store/backing1) cachingFileSystem(LoopbackFs(/tmp/TestDetectSymlinkedDirectories401704452/store/backing1/READONLY))])
11:34:19.690152 Adding workspace backing1 for roots UnionFs([LoopbackFs(/tmp/TestExplicitScan852794259/store/backing1) cachingFileSystem(LoopbackFs(/tmp/TestExplicitScan852794259/store/backing1/READONLY))])
11:34:19.711542 Adding workspace bar for roots UnionFs([LoopbackFs(/tmp/TestCreationChecks096603606/store/foo) cachingFileSystem(LoopbackFs(/tmp/TestCreationChecks096603606/store/foo/READONLY))])
11:34:19.711607 Already have a union FS for directory /tmp/TestCreationChecks096603606/store/foo in workspace bar
--- FAIL: TestUnionFsRmdirMkdir (0.02s)
    unionfs_test.go:502: Mkdir: mkdir /tmp/TestUnionFsRmdirMkdir896934563/mnt/subdir: no such file or directory
11:34:21.620956 Forced cache drop on UnionFs([LoopbackFs(/tmp/TestUnionFsDropDeletionCache909636062/rw) cachingFileSystem(LoopbackFs(/tmp/TestUnionFsDropDeletionCache909636062/ro))])
11:34:21.621032 Dropping cache for cachingFileSystem(LoopbackFs(/tmp/TestUnionFsDropDeletionCache909636062/ro))
11:34:21.664786 Forced cache drop on UnionFs([LoopbackFs(/tmp/TestUnionFsDropCache864178597/rw) cachingFileSystem(LoopbackFs(/tmp/TestUnionFsDropCache864178597/ro))])
11:34:21.664838 Dropping cache for cachingFileSystem(LoopbackFs(/tmp/TestUnionFsDropCache864178597/ro))
11:34:21.853551 error accessing deletion marker DELETIONS/d41d8cd98f00b204-file2: 20=not a directory
11:34:22.337754 Found linked inode, but Nlink == 1, ino=621924, fullPath="file"
FAIL
FAIL    github.com/hanwen/go-fuse/unionfs       2.895s
ok      github.com/hanwen/go-fuse/zipfs 0.113s

...

navytux avatar Jan 31 '19 08:01 navytux

Tests getting stuck: https://github.com/hanwen/go-fuse/issues/261.

navytux avatar Jan 31 '19 08:01 navytux

Re TestStatFs failure - sadly I no longer can reproduce it. Once I will be able to reproduce it reliably - I will file a new issue.

navytux avatar Jan 31 '19 08:01 navytux

About unionfs flakiness: sadly given we still have issues with TestUnionFsDelete (https://github.com/hanwen/go-fuse/issues/233#issuecomment-459257198) and TestUnionFsRmdirMkdir (https://github.com/hanwen/go-fuse/issues/233#issuecomment-459260389) I think we cannot close this issue yet.

navytux avatar Jan 31 '19 08:01 navytux

StatFS is "intended" to be flaky - it checks StatFS through fuse against the real StatFS. If you have a background process that uses space on the partition it can change StatFS results, which can race with the FUSE test.

hanwen avatar Jan 31 '19 12:01 hanwen

Thanks for clarifying.

navytux avatar Jan 31 '19 15:01 navytux

see f4cec619855dbaa6dd87c0104cbc220ea8650a63

hanwen avatar Apr 09 '23 17:04 hanwen