littlefs icon indicating copy to clipboard operation
littlefs copied to clipboard

Fix synthetic move underflows in lfs_dir_get

Open geky opened this issue 1 year ago • 2 comments

By "luck" the previous code somehow managed to not be broken, though it was possible to traverse the same file twice in lfs_fs_traverse/size (which is not an error).

The problem was an underlying assumption in lfs_dir_get that it would never be called when the requested id is pending removal because of a powerloss. The assumption was either:

  1. lfs_dir_find would need to be called first to find the id, and it would correctly toss out pending-rms with LFS_ERR_NOENT.

  2. lfs_fs_mkconsistent would be implicitly called before any filesystem traversals, cleaning up any pending-rms. This is at least true for allocator scans.

But, as noted by @andriyndev, both lfs_fs_traverse and lfs_fs_size can call lfs_fs_get with a pending-rm id if called in a readonly context.


By "luck" this somehow manages to not break anything:

  1. If the pending-rm id is >0, the id is decremented by 1 in lfs_fs_get, returning the previous file entry during traversal. Worst case, this reports any blocks owned by the previous file entry twice.

    Note this is not an error, lfs_fs_traverse/size may return the same block multiple times due to underlying copy-on-write structures.

  2. More concerning: If the pending-rm id is 0, the id is decremented by 1 in lfs_fs_get and underflows. This underflow propagates into the type field of the tag we are searching for, decrementing it from 0x200 (LFS_TYPE_STRUCT) to 0x1ff (LFS_TYPE_INTERNAL(UNUSED)).

    Fortunately, since this happens to underflow to the INTERNAL tag type, the type intended to never exist on disk, we should never find a matching tag during our lfs_fs_get search. The result? lfs_dir_get returns LFS_ERR_NOENT, which is actually what we want.

Also note that LFS_ERR_NOENT does not terminate the mdir traversal early. If it did we would have missed files instead of duplicating files, which is a slightly worse situation.


The fix is to add an explicit check for pending-rms in lfs_dir_get, just like in lfs_dir_find. This avoids relying on unintended underflow propagation, and should make the internal API behavior more consistent.

This is especially important for potential future gc extensions.

Related issue: https://github.com/littlefs-project/littlefs/issues/920 Found by: @andriyndev

geky avatar Feb 04 '24 21:02 geky

Tests passed ✓, Code: 16980 B (+0.0%), Stack: 1432 B (+0.0%), Structs: 812 B (+0.0%)
Code Stack Structs Coverage
Default 16980 B (+0.0%) 1432 B (+0.0%) 812 B (+0.0%) Lines 2387/2567 lines (-0.0%)
Readonly 6190 B (+0.1%) 448 B (+0.0%) 812 B (+0.0%) Branches 1240/1578 branches (-0.0%)
Threadsafe 17844 B (+0.0%) 1432 B (+0.0%) 820 B (+0.0%) Benchmarks
Multiversion 17044 B (+0.0%) 1432 B (+0.0%) 816 B (+0.0%) Readed 29369693876 B (+0.0%)
Migrate 18672 B (+0.0%) 1736 B (+0.0%) 816 B (+0.0%) Proged 1482874766 B (+0.0%)
Error-asserts 17668 B (+0.0%) 1424 B (+0.0%) 812 B (+0.0%) Erased 1568888832 B (+0.0%)

geky-bot avatar Feb 04 '24 21:02 geky-bot

LGTM

andriyndev avatar Feb 05 '24 11:02 andriyndev

Sorry for the delay, this should be merged now assuming CI passes.

geky avatar Mar 08 '24 22:03 geky