Support sparse files?
I'm copying some sparse files (VMDKs created by VMware). The apparent size of one of the files is 900GB, while the actual size of the file is 434GB. While copying the file with cp (which properly supports copying sparse files), progress reports the size of the file as the apparent size rather than the actual size.
Is proper support for sparse files on the roadmap at all?
This is a good question. I have to see how procfs handles sparse files in the fdinfo "pos" field, but I'm pretty sure it will require building a "map" of the file's allocations and holes in order to compute a real percentage of the copy.
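For illustration, here is a rough sketch of what such a "map" could look like, written in Python rather than in progress's own C. It assumes a Linux system whose filesystem supports `SEEK_DATA`/`SEEK_HOLE`, and it assumes the fdinfo "pos" value is a plain logical offset into the file. The names `data_extents`, `data_bytes_before` and `sparse_progress` are made up for this example and are not part of progress.

```python
import errno
import os


def data_extents(path):
    """Return (start, end) byte ranges that the filesystem reports as data."""
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Jump to the next offset >= `offset` that contains data.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # nothing but a hole between `offset` and EOF
                raise
            # Jump to the next hole; EOF counts as an implicit hole.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return extents


def data_bytes_before(extents, pos):
    """Count the data bytes located before logical offset `pos`."""
    return sum(min(end, pos) - start for start, end in extents if start < pos)


def sparse_progress(extents, pos):
    """Fraction of the allocated data already passed at offset `pos`."""
    total = sum(end - start for start, end in extents)
    return data_bytes_before(extents, pos) / total if total else 1.0
```

With a map like this, the percentage would be computed against the allocated bytes instead of the size from stat(2). Whether the offset of a copying process actually skips over holes depends on how that program reads the file, so this is only a starting point.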
I don't see the point of showing any size other than the actual size. cp will copy 434GB and not 900GB, so the normal behaviour is to show progress based on 434GB. What you call the apparent size has no meaning outside the vmdk format.
Even if we code something to know about the "apparent" size, how would you do the computation? You only copy real bytes; the gap between 900GB and 434GB is virtual, there is no data. So you can't copy it. It makes no sense to want progress on copying non-existent data, and it's impossible to compute.
It's the other way round, @BestPig :) Progress is currently reporting the apparent size.
How is that possible? I don't understand how cv can know the apparent size (900GB). For cv and the file system, a sparse file is a regular file; there is no difference compared to any other file.
The point of sparse files is to simulate a big file while actually allocating only fragments of it. If you ask the system the size of the file (stat(2)), it reports the "simulation" (900GB). That's what progress (and most other commands) is using.
So sparse files are a feature of the file system and not handled by the vmdk format?
@BestPig correct. These files were created by VMware, but they are being stored on an XFS filesystem. If you run "ls -lh" on the file, it reports 900G as the file size. If you run "du -h" on the file, it reports 434G; hence, it's a sparse file.
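The same two numbers can be read programmatically from a single stat(2) call. A minimal sketch in Python, assuming Linux, where `st_blocks` is counted in 512-byte units; the function name `sparse_report` is invented for this example:

```python
import os


def sparse_report(path):
    """Return (apparent_size, allocated_size, looks_sparse) for `path`."""
    st = os.stat(path)
    apparent = st.st_size            # what "ls -l" shows
    allocated = st.st_blocks * 512   # roughly what "du" shows (assumes 512-byte units)
    # Heuristic only: compression or inline data can also make
    # the allocated size smaller than the apparent size.
    return apparent, allocated, allocated < apparent
```

For the file discussed here, the first value would be about 900G and the second about 434G.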