
genext2fs is painfully slow for multi-GB input

Open josch opened this issue 2 years ago • 8 comments

Hi,

I'm now using genext2fs with multi-GB tarballs as input. While this works well, it takes several hours on my machine, so I profiled genext2fs: gprof.txt

If I read the profiling output correctly, then most time is spent in the function allocate().

Do you have any ideas how to improve the speed by introducing better data structures?
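To make the data-structure question concrete, here is a minimal sketch of one way an allocator can become the bottleneck. It is an assumption for illustration only and is not taken from the genext2fs source: the names allocate_rescan and CursorAllocator are hypothetical. It compares a free-block search that rescans a bitmap from the start on every call with one that remembers where the previous search ended.

```python
def allocate_rescan(bitmap):
    """Hypothetical allocator: scan from index 0 on every call.

    Returns (allocated index, number of slots inspected).
    """
    for i, used in enumerate(bitmap):
        if not used:
            bitmap[i] = True
            return i, i + 1
    raise RuntimeError("no free block")


class CursorAllocator:
    """Hypothetical allocator that resumes scanning where it last stopped."""

    def __init__(self, size):
        self.bitmap = [False] * size
        self.cursor = 0

    def allocate(self):
        n = len(self.bitmap)
        steps = 0
        for _ in range(n):
            steps += 1
            i = self.cursor
            self.cursor = (self.cursor + 1) % n
            if not self.bitmap[i]:
                self.bitmap[i] = True
                return i, steps
        raise RuntimeError("no free block")


def count_steps(n_blocks):
    """Fill n_blocks slots with each strategy; return total slots inspected."""
    bitmap = [False] * n_blocks
    rescan_total = sum(allocate_rescan(bitmap)[1] for _ in range(n_blocks))
    cursor_alloc = CursorAllocator(n_blocks)
    cursor_total = sum(cursor_alloc.allocate()[1] for _ in range(n_blocks))
    return rescan_total, cursor_total


if __name__ == "__main__":
    for n in (1_000, 2_000, 4_000):
        print(n, count_steps(n))
```

With the rescanning variant, doubling the number of blocks roughly quadruples the total work, while the cursor variant grows linearly. Whether allocate() in genext2fs actually behaves like the first variant would need to be checked against the source and the profile.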

josch avatar Feb 07 '22 22:02 josch

Hi @josch,

indeed I have some ideas to mitigate this; I'm currently a bit short on time but I may try something.
Do you have an easy way of reproducing the problem?

bestouff avatar Feb 08 '22 08:02 bestouff

The "easy" way is just to throw a big tarball at it. :smile:

For example here is a big system image: https://mister-muffin.de/reform/target-userland-full.tar

josch avatar Feb 08 '22 09:02 josch

Any luck looking into this?

I've hit this issue as well. For me, with a ~10 GB tar, it seems to basically never complete (on a very powerful machine), whereas virt-make-fs, for example, takes ~30 min.

gelrom avatar Jan 12 '23 02:01 gelrom

Some quick benchmarks that I did make me think there is something highly nonlinear going on:

- 100 MB: ~1 s
- 500 MB: ~10 s
- 800 MB: ~27 s
- 900 MB: ~71 s
- 1 GB: ~130 s

Note: each of these was done with a tar containing a single file of the given size.
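As a rough check on how nonlinear the timings quoted above are, the sketch below computes the local scaling exponent between consecutive measurements, that is, the slope on a log-log plot. An exponent of 1 would mean linear scaling and 2 would mean quadratic. The function name local_exponents is hypothetical, and treating 1 GB as 1000 MB is an assumption.

```python
import math

# (input size in MB, runtime in seconds), taken from the timings above.
# Assumption: "1gb" is treated as 1000 MB.
measurements = [(100, 1), (500, 10), (800, 27), (900, 71), (1000, 130)]


def local_exponents(points):
    """Return the log-log slope between each pair of consecutive points."""
    exponents = []
    for (size0, time0), (size1, time1) in zip(points, points[1:]):
        exponents.append(math.log(time1 / time0) / math.log(size1 / size0))
    return exponents


if __name__ == "__main__":
    pairs = zip(measurements, measurements[1:])
    for ((size0, _), (size1, _)), k in zip(pairs, local_exponents(measurements)):
        print(f"{size0} MB -> {size1} MB: exponent ~ {k:.2f}")
```

Every interval gives an exponent above 1, and the later intervals are far steeper than the first, which fits the impression that something worse than linear is going on. With only five single-run data points this is an indication, not a measurement of the actual complexity.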

gelrom avatar Jan 12 '23 02:01 gelrom

I observed the same non-linear behavior. Since this is breaking my use-case for genext2fs I instead worked on a patch for e2fsprogs that would allow it to use a tarball as input: https://github.com/tytso/e2fsprogs/pull/118

josch avatar Jan 12 '23 09:01 josch

I'm trying to build an 8G image using genimage, and genext2fs -d ... has been running for at least 30 minutes. I haven't managed to get it to finish yet. I tried using -a rootfs.tar and ran into a locale issue with a downloaded tar and a segfault on a tar I created.

pamolloy avatar Mar 31 '23 14:03 pamolloy

Switched to mke2fs by setting use-mke2fs = "true" in my genimage.cfg, which seems to work without issue and completes in less than a minute.

pamolloy avatar Mar 31 '23 14:03 pamolloy

@pamolloy did the locale issue look something like this:

archive_read_next_header(): Pathname can't be converted from UTF-8 to current locale.

If yes, maybe try out https://github.com/bestouff/genext2fs/pull/30 and tell me if that fixes your issue?

As for the slowness, I do not know how to fix genext2fs but if you want tarball input, then maybe https://github.com/tytso/e2fsprogs/pull/118 is of interest to you?

josch avatar Mar 31 '23 14:03 josch