gomft
gomft copied to clipboard
Sparse data runs handling bug
When offsetLength
is zero, it has special semantic meaning (= sparse file): https://github.com/t9t/gomft/blob/f64df3912ca98b2abeaa2f9e79f7c8a59b159ba9/mft/mft.go#L365
(source, see "Example 4 - Sparse, Unfragmented File")
This code decodes a sparse offset as 0, which would incorrectly read a fragment beginning from first cluster of volume (= VBR).
My initial naive idea was to mark sparse runs as having offset -1, but that's incorrect as each run's offset is relative to previous run, i.e. negative values are legal. I solved this issue in my code by making the offset a pointer to an integer, i.e. nil
means it's sparse.
Directly related to this, DataRunsToFragments
needs to take sparse runs into account. Data run offsets are relative previous non-sparse run, not previous run: https://github.com/t9t/gomft/blob/f64df3912ca98b2abeaa2f9e79f7c8a59b159ba9/mft/mft.go#L390
(link to "Example 4 - Sparse, Unfragmented File" explains this)
This assignment can only be done if current run is non-sparse. I suggest renaming the variable to e.g. previousNonSparseOffsetCluster
.
Natural consequence of sparse run support is supporting sparse fragments, which I again marked as *int
to discriminate between sparse and non-sparse runs. I had to re-write fragment.Reader
in my project because alternating between actual disk-backed fragments vs sparse fragments would've been hackish to add without a re-write.