e2fsprogs
e2fsprogs copied to clipboard
mke2fs: the -d option can now handle tarball input
This is an independent implementation that does the same thing as #107. Unfortunately I wasn't aware that e2fsprogs had a github repository (http://e2fsprogs.sourceforge.net/ only mentions the kernel.org repo). Differences of this approach compared to #107 are:
- no hard dependency on libarchive via some
#ifdefs - re-uses the
-doption instead of adding a new option - supports reading the archive from stdin using
- - comes with several test cases checking that
- a roundtrip from tar to ext4 to tar produces bit-by-bit identical results
- extended attributes work
- gnu tar and pax work
- character devices and symlinks work
Just as @russdill I'm looking for comments on this. Thanks!
Use dlopen instead of linking against libarchive to keep runtime dependencies minimal as requested in https://github.com/tytso/e2fsprogs/pull/107#issuecomment-1193042420
Hi, @tytso, could you give some feedback for this pull request or for #107 so that I can work on fixing remaining issues? Thanks!
Sorry for the delay in looking at this pull request. Things have been pretty busy. So some comments about this change.
First of all, when I tried doing a trial merge, I got test failures for the m_rootgnutar test. There are a lot of failures of the form:
dump_file: Operation not permitted while changing ownership of test.img.dir/progs/hold_inode.c dump_file: Operation not permitted while changing ownership of test.img.dir/progs/random_exercise.c ....
Secondly, please test to make sure that (a) e2fsprogs builds with and without libarchive installed. One of the really nasty things about configure scripts is that very often, people are careless about them, and the autoconf scripts slow down the build process (since you have to run configure script, which takes time), but it doesn't buy you anything in terms of portability if the build crashes and burns if the build environment does not exactly match the developer's. If it's going to be non-portable, why waste time with the configure script? Secondly, after fixing the configure script, we then find that "make check" fails all of the three new tests, since m_rootgnutar, m_rootpaxtar, and m_roottar are written assuming that the libarchive functionality is present.
While I do appreciate adding the dlopen() support, one question that comes to mind is how stable is the libarchive ABI. Given that the the SOVERSION is up to 13, it causes me to wonder whether it's like openssl, where the ABI is so badly designed that a SOBUMP is needed at essentially every single release. Also, the man page needs to warn that a particular bit of functionality might not be present --- either because at compile-time, the archive.h header file isn't there, in which case the functionality will never be present, or at run-time, because the shared library for libarchive isn't -present, in which case presumably we should get a nice warning message a file was passed to the -d option, but we can't handle it because libarchive isn't available --- and not a confusing error message such as:
__populate_fs: Not a directory while changing working directory to "test.img.tar"
Sorry for the delay in looking at this pull request. Things have been pretty busy.
No problem, same over here. :)
So some comments about this change.
Thank you for your review!
First of all, when I tried doing a trial merge, I got test failures for the m_rootgnutar test. There are a lot of failures of the form:
dump_file: Operation not permitted while changing ownership of test.img.dir/progs/hold_inode.c dump_file: Operation not permitted while changing ownership of test.img.dir/progs/random_exercise.c ....
This should be fixed now.
Secondly, please test to make sure that (a) e2fsprogs builds with and without libarchive installed. One of the really nasty things about configure scripts is that very often, people are careless about them, and the autoconf scripts slow down the build process (since you have to run configure script, which takes time), but it doesn't buy you anything in terms of portability if the build crashes and burns if the build environment does not exactly match the developer's. If it's going to be non-portable, why waste time with the configure script?
I verified that e2fsprog builds with and without libarchive installed.
Secondly, after fixing the configure script, we then find that "make check" fails all of the three new tests, since m_rootgnutar, m_rootpaxtar, and m_roottar are written assuming that the libarchive functionality is present.
Fixed now as well.
While I do appreciate adding the dlopen() support, one question that comes to mind is how stable is the libarchive ABI. Given that the the SOVERSION is up to 13, it causes me to wonder whether it's like openssl, where the ABI is so badly designed that a SOBUMP is needed at essentially every single release.
The last SOVERSION bump happened 10 years ago. I asked about the ABI stability here but got no reply yet: https://github.com/libarchive/libarchive/issues/1854
Also, the man page needs to warn that a particular bit of functionality might not be present --- either because at compile-time, the archive.h header file isn't there, in which case the functionality will never be present, or at run-time, because the shared library for libarchive isn't -present,
I added this information to the man page.
in which case presumably we should get a nice warning message a file was passed to the -d option, but we can't handle it because libarchive isn't available and not a confusing error message such as:
__populate_fs: Not a directory while changing working directory to "test.img.tar"
I was unable to trigger this error. Do you remember what you did to get it? In any case, I changed some of the code paths surrounding __populate_fs so this might not get generated anymore.
I'm submitted my patch to the linux-ext4 list to get more feedback: https://lore.kernel.org/linux-ext4/[email protected]/
While I do appreciate adding the dlopen() support, one question that comes to mind is how stable is the libarchive ABI. Given that the the SOVERSION is up to 13, it causes me to wonder whether it's like openssl, where the ABI is so badly designed that a SOBUMP is needed at essentially every single release.
In general I think dlopen()ing a specific shared library via its full SONAME from an external project where there is no tight coordination going on is the wrong approach because it's crossing an ABI boundary with no checks in place, which requires manual patching when the SONAME gets bumped even if the used symbols have not seen any ABI or ABI change, and the code will not automatically pick up ABI changes that are API compatible. I think a better solution is to create a shared object "module" or "plugin" exposing the functions that you need that links against the libarchive library as usual, which will mean the external project can always be used safely, or the compiled will just barf, and this module within the project boundaries can be safely dlopen()ed if present. Then when packaging this module can be split into its own package and installed if desired f.ex, which will also make it possible to automatically pick up the required dependencies w/o having to hardcode those as well.
FWIW, I really would love to see this feature merged.
Hello and happy new year! :slightly_smiling_face:
I just rebased this branch on top of master. If there is anything I can do to move this forward, @tytso, please do tell. Thank you!
The github workflow seems to fail for macos and android but the failures do not seem to be due to the changes of this merge request. In fact on android, the relevant tests are successfully skipped:
2024-01-09T11:58:40.7151560Z m_rootgnutar: create fs image from GNU tarball: skipped (no libarchive)
2024-01-09T11:58:40.7473800Z m_roottar: skipped (no libarchive)
2024-01-09T11:58:40.7475090Z m_rootpaxtar: skipped (no libarchive)