Test failures with bcachefs

Open Chwiggy opened this issue 1 year ago • 7 comments

What is the bug?

During the build of GDAL 3.8.5, 30 to 40 pytest tests fail on systems running bcachefs. Most of these are tests concerning checksums on files involved in I/O operations.

This bug is not reproducible on other filesystems, such as ext4.

Steps to reproduce the issue

Tentative: try to build GDAL 3.8.5 on a machine running bcachefs.

Versions and provenance

OS info

- system: `"x86_64-linux"`
- host os: `Linux 6.8.8, NixOS, 24.05 (Uakari), 24.05.20240502.63c3a29`
- multi-user?: `yes`
- sandbox: `yes`
- version: `nix-env (Nix) 2.18.2`
- nixpkgs: `/nix/store/p69bcs7ma6ijj8v9xsrg3nq3nn8ryn95-source`
- filesystem: bcachefs

GDAL version

3.8.5

Provenance

nixpkgs:unstable

Additional context

Relevant issue on nixpkgs: https://github.com/NixOS/nixpkgs/issues/302137

Full build logs: https://gist.github.com/Chwiggy/e18dcb59ae47b78c3edbcd965d09d6ac

Chwiggy avatar May 09 '24 14:05 Chwiggy

couldn't that be a defect (or "feature") of bcachefs itself, given that the tests pass on a variety of more standard filesystems?

rouault avatar May 09 '24 14:05 rouault

Could very well be. It would still be useful for someone with more insight into GDAL's testing setup to figure out whether these tests fail because the test setup makes broken assumptions about how filesystems behave, whether they should be skipped or modified when building on specific filesystems, or whether this is a flaw in bcachefs that needs to be fixed within bcachefs.

Chwiggy avatar May 09 '24 14:05 Chwiggy
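
If tests were ever to be skipped or adjusted per filesystem, the test setup would first need to know which filesystem a directory lives on. The following is a minimal sketch of that check, assuming Linux and a `/proc/self/mounts` file. `parse_mounts` and `fs_type_for` are hypothetical helper names and are not part of GDAL's test suite.

```python
import os


def parse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def fs_type_for(path, entries):
    """Return the fs type of the longest mount point containing `path`.

    Simplifications (assumptions of this sketch): symlinks are not
    resolved, and escaped characters in mount points are not decoded.
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for mount_point, fs_type in entries:
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            # ">=" lets a later mount on the same point shadow an earlier one.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


if __name__ == "__main__" and os.path.exists("/proc/self/mounts"):
    with open("/proc/self/mounts") as f:
        print(fs_type_for(os.getcwd(), parse_mounts(f.read())))
```

A test could then compare the result against `"bcachefs"` before deciding to skip, although whether skipping is the right response is exactly the open question here.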

also notably, this seems to not have been an issue before gdal 3.8.4, and gdal 3.8.4 built with a bit of coercion in the end.

Chwiggy avatar May 09 '24 14:05 Chwiggy

> also notably, this seems to not have been an issue before gdal 3.8.4

could you run a git bisect session then ?

rouault avatar May 09 '24 14:05 rouault
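
A bisect session like the one requested can be automated with `git bisect run`, which treats exit code 0 as good, 125 as "skip this commit", and other codes from 1 to 127 as bad. Below is a rough sketch of a helper script for that purpose. It assumes GDAL has already been rebuilt for the commit being tested (the build step is omitted) and that it is run from the directory containing the tests. The script and the function name `bisect_exit_code` are illustrative, not something GDAL ships.

```python
import subprocess
import sys

# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def bisect_exit_code(pytest_returncode):
    """Translate a pytest exit code into a `git bisect run` verdict."""
    if pytest_returncode == 0:   # all selected tests passed
        return GOOD
    if pytest_returncode == 1:   # at least one test failed
        return BAD
    # Interrupted, internal error, usage error, or no tests collected:
    # the commit cannot be judged, so ask git to skip it.
    return SKIP


def main(test_ids):
    """Run pytest on the given test ids and return the bisect verdict."""
    result = subprocess.run([sys.executable, "-m", "pytest", *test_ids])
    return bisect_exit_code(result.returncode)


# Only run when test ids are passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

Saved under a hypothetical name such as `bisect_check.py`, it could be invoked as `git bisect run python bisect_check.py <test id>` after marking one known-good and one known-bad commit.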

https://github.com/NixOS/nixpkgs/issues/302137#issuecomment-2225820147

=========================== short test summary info ============================
FAILED gcore/vsiaz.py::test_vsiaz_fake_write - TypeError: Failed expected string as 'msg' parameter, got 'int' instead.
FAILED gcore/vsioss.py::test_vsioss_6 - AssertionError: (1, 2)

It would be very surprising if a Python type error confusing string and int were caused by the underlying file system.

@Chwiggy Does NixOS Hydra even use bcachefs?

nh2 avatar Jul 12 '24 15:07 nh2

> FAILED gcore/vsiaz.py::test_vsiaz_fake_write - TypeError: Failed expected string as 'msg' parameter, got 'int' instead.
> FAILED gcore/vsioss.py::test_vsioss_6 - AssertionError: (1, 2)

Some of those vsi tests are known to randomly fail for unknown reasons

rouault avatar Jul 12 '24 15:07 rouault

@rouault Do you have a list of all the ones that are flaky?

We should disable them in NixOS (and all other Linux distributions likely want to do the same).

It might even make sense to disable them in the upstream package, adding a flag or so to enable flaky tests.

nh2 avatar Jul 12 '24 16:07 nh2
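
The idea of disabling flaky tests unless a flag is given could be sketched with pytest's `--deselect` option, as below. The list contains only the two tests quoted in this thread and is not an authoritative list of flaky GDAL tests. The names `FLAKY_TESTS`, `build_pytest_args`, and the `run_flaky` switch are assumptions made for illustration.

```python
# Test ids taken from the failure summary quoted in this thread.
# This is NOT a complete or official list of flaky tests.
FLAKY_TESTS = [
    "gcore/vsiaz.py::test_vsiaz_fake_write",
    "gcore/vsioss.py::test_vsioss_6",
]


def build_pytest_args(run_flaky=False, flaky_tests=FLAKY_TESTS):
    """Return extra pytest arguments that deselect known-flaky tests.

    With run_flaky=True nothing is deselected, so the flaky tests run.
    """
    if run_flaky:
        return []
    args = []
    for test_id in flaky_tests:
        # `--deselect <nodeid>` removes a single test from the run.
        args.extend(["--deselect", test_id])
    return args


if __name__ == "__main__":
    print(" ".join(["pytest"] + build_pytest_args()))
```

A packager could pass the resulting arguments to the test invocation by default and set `run_flaky=True` only when explicitly investigating those tests.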