texture-synthesis
Investigate musl performance issues
There is a huge performance gap between x86_64-unknown-linux-musl and x86_64-unknown-linux-gnu that needs to be investigated, as we use musl for our pre-built Linux binaries.
Doing the same CLI operation shows how severe this performance gap is:
time target/x86_64-unknown-linux-musl/release/texture-synthesis --out out/04-2.png --no-progress transfer-style --style imgs/multiexample/4.jpg --guide imgs/tom.jpg
543.92s user 356.85s system 1808% cpu 49.812 total
time target/release/texture-synthesis --out out/04-2.png --no-progress transfer-style --style imgs/multiexample/4.jpg --guide imgs/tom.jpg
131.32s user 0.25s system 1605% cpu 8.193 total
wow
Interesting. I'm not much help here but subscribing because I was just about to migrate our Rust services to musl...
So, replacing the system allocator with jemalloc basically brings musl performance back in line with glibc, but this doesn't seem right as a "fix", since it would be sweeping a problem under the rug. The root of the problem seems to be heap allocations occurring inside many different threads, which musl does not seem to handle well at all. Reducing, removing, or pre-allocating memory will probably improve performance for all targets, not just musl, so I will investigate that later.
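To get a rough idea of how many heap allocations the worker threads perform, one option is to wrap the system allocator in a counting allocator. The following is only a minimal sketch using the standard library; `CountingAlloc`, `count_allocations`, and the two toy workloads are hypothetical names for illustration and are not taken from texture-synthesis. Swapping in a different allocator such as jemalloc goes through the same `#[global_allocator]` attribute, with the allocator type coming from an external crate instead.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical wrapper: forwards to the system allocator and counts requests.
struct CountingAlloc;

/// Total number of allocation requests seen so far, across all threads.
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every heap allocation in the program now goes through CountingAlloc.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `work` on `threads` threads and returns the number of allocations
/// observed while they ran (thread start-up itself allocates a little too).
fn count_allocations(threads: usize, work: fn()) -> usize {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    let handles: Vec<_> = (0..threads).map(|_| thread::spawn(work)).collect();
    for handle in handles {
        handle.join().unwrap();
    }
    ALLOCATIONS.load(Ordering::Relaxed) - before
}

/// Toy workload: allocates a fresh buffer on every iteration.
fn alloc_per_iteration() {
    for i in 0..1_000usize {
        let buf = vec![i as u8; 64];
        black_box(&buf); // keep the allocation from being optimized away
    }
}

/// Toy workload: allocates one buffer up front and reuses it.
fn reuse_buffer() {
    let mut buf = vec![0u8; 64];
    for i in 0..1_000usize {
        buf.fill(i as u8);
        black_box(&buf);
    }
}
```

Calling `count_allocations(4, alloc_per_iteration)` and `count_allocations(4, reuse_buffer)` and comparing the two numbers shows how much a pre-allocated buffer cuts allocator traffic, which is the kind of reduction that should help on every target regardless of the allocator.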
[Image: musl - master]
[Image: musl - jemalloc]
I might also look into mimalloc, if it still helps performance after I have investigated the possibility of reducing allocations. There are a couple of Rust global allocators for mimalloc we could try.
Just curious, is there an open issue about this over at the musl tracker? Pretty confident they would love to know about this. This is an inexcusable performance hit.
EDIT: TIL Musl doesn't have a dedicated tracker, it seems. They have a mailing list described on their site. Might be worth it for a maintainer to direct them toward this issue.
Is this issue still a thing?
Running time target/x86_64-unknown-linux-musl/release/texture-synthesis --out out/04-2.png --no-progress transfer-style --style imgs/multiexample/4.jpg --guide imgs/tom.jpg
on master gives ~44 and ~42 seconds for musl and glibc respectively, and the flamegraph of the musl version doesn't look that bad on malloc.
Good question, I haven't checked in a long time, thanks for the heads up!