
testsuite: workaround for statedir being set on test_under_flux line with NFS dir and `full` personality

Open chu11 opened this issue 2 years ago • 1 comments

Not necessarily important to fix or work around, but given how long it took me to figure this out, I figure I should at least document it. I updated a test with:

test_under_flux 2 full -o,-Sstatedir=$(pwd)

After this, the test repeatedly fails with:

rm: cannot remove '/g/g0/achu/chaos/git/flux-framework/flux-core/t/trash-directory.t2805-startlog-cmd': Directory not empty
FATAL: Cannot prepare test area
2022-08-04T17:50:15.253218Z broker.err[0]: rc2.0: sh ./t2805-startlog-cmd.t  --verbose --debug Exited (rc=1) 0.4s
flux-start: 0 (pid 126593) exited with rc=1

After finally figuring out how the internals of test_under_flux work, I added some debug output:

total 140
drwxr-xr-x  3 achu achu  4096 Aug  4 17:50 .
drwxr-xr-x 44 achu achu 65536 Aug  4 17:50 ..
drwx------  3 achu achu  4096 Aug  4 17:50 .local
-rw-r--r--  1 achu achu  1024 Aug  4 17:50 content.sqlite
-rw-r--r--  1 achu achu 55576 Aug  4 17:50 content.sqlite-wal
rm: cannot remove '/g/g0/achu/chaos/git/flux-framework/flux-core/t/trash-directory.t2805-startlog-cmd': Directory not empty
FATAL: Cannot prepare test area
total 136
drwxr-xr-x  2 achu achu  4096 Aug  4 17:50 .
drwxr-xr-x 44 achu achu 65536 Aug  4 17:50 ..
-rw-r--r--  1 achu achu  1024 Aug  4 17:50 .nfs000000005c10f8c200000274
-rw-r--r--  1 achu achu 55576 Aug  4 17:50 .nfs000000005c10f8c300000275
2022-08-04T17:50:15.253218Z broker.err[0]: rc2.0: sh ./t2805-startlog-cmd.t  --verbose --debug Exited (rc=1) 0.4s
flux-start: 0 (pid 126593) exited with rc=1

So the short answer: with the full personality, the content-sqlite files are created in the trash directory when the flux broker is started under test_under_flux. When the test file is run again (under the same process), it tries to clean up the trash path and removes the content-sqlite files. But because we're still inside that process and the files are still open, NFS renames them to .nfsXXX instead of deleting them, so the directory can't be removed, hence the error.
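The NFS behavior can be reproduced outside the testsuite. This is only a minimal sketch, not part of flux-core: the function name `remove_dir_with_open_file` and the `silly-rename-demo` directory prefix are made up for illustration. It deletes a file that is still held open and then tries to remove the directory. On a local filesystem I'd expect it to print "removed"; on an NFS mount I'd expect "busy", with a leftover .nfsXXXX entry listed on stderr.

```shell
# Delete a file while it is still open, then try to remove its directory.
# Prints "removed" if rmdir succeeds right away, or "busy" if it fails
# (expected on NFS, where the client leaves a .nfsXXXX placeholder behind).
# The function name and directory prefix are hypothetical.
remove_dir_with_open_file() {
    parent=${1:-.}
    dir=$(mktemp -d "$parent/silly-rename-demo.XXXXXX") || return 1
    echo data > "$dir/content.sqlite"
    exec 3< "$dir/content.sqlite"   # hold the file open on fd 3
    rm -f "$dir/content.sqlite"     # unlink it while it is still open
    if rmdir "$dir" 2>/dev/null; then
        echo removed
    else
        echo busy
        ls -A "$dir" >&2            # show the leftover names on stderr
    fi
    exec 3<&-                       # close the file
    # Clean up if the directory is still around after the close.
    if [ -d "$dir" ]; then rmdir "$dir"; fi
}
```

Running `remove_dir_with_open_file .` from the `t/` directory of a checkout on NFS should show the same "Directory not empty" situation as above, without involving a broker at all.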

So if statedir is set with test_under_flux, take care to use a non-full personality, where you have full control of what's going on ... or don't use an NFS dir.

chu11 avatar Aug 04 '22 18:08 chu11

The system personality of test_under_flux does the proper workaround of making a "wrapper" trash directory so that this can work:

https://github.com/flux-framework/flux-core/blob/7b8447a2882d2208b3e7bf997ec2ea0bffde61a6/t/sharness.d/flux-sharness.sh#L222-L231

grondo avatar Aug 04 '22 19:08 grondo