Speed up Nydus-image create
With the default configuration, nydus-image create currently takes about 8s on a standard Linux kernel source tree.
There are 65119 files in this kernel source. Timings from three runs:
# 1st run
$ time ./target-fusedev/release/nydus-image create -B liubo/src.bootstrap -D liubo liubo/kernel_src
[2022-05-19 02:08:28.078232 +08:00] INFO [rafs/src/metadata/md_v5.rs:28] rafs superblock features: COMPRESS_LZ4_BLOCK | DIGESTER_BLAKE3 | EXPLICIT_UID_GID
[2022-05-19 02:08:28.340749 +08:00] INFO [src/bin/nydus-image/main.rs:622] build successfully: BuildOutput { artifacts: [BuildOutputArtifact { bootstrap_name: "", blobs: [] }], blobs: ["95922bb8a3ed6bd320f1c8376c26197ef2107cb8ce6d9094a3a17da0dbe44810"], last_blob_size: Some(301576727), last_bootstrap_name: "" }
real 0m8.099s
user 0m6.667s
sys 0m1.345s
# 2nd run
real 0m7.999s
user 0m6.682s
sys 0m1.315s
# 3rd run
real 0m7.970s
user 0m6.702s
sys 0m1.267s
The above process involves
- iterating directories recursively
- reading, compressing and writing data to the final blob
- writing bootstrap
The bottleneck does not appear to be IO: running the whole process in tmpfs gave only about a 5% improvement.
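A quick sanity check on the timings above, as a rough sketch: when user + sys CPU time is close to wall-clock time, the process is spending nearly all of its time on CPU in roughly one thread, not waiting on IO. The numbers are copied from the three runs; the variable names are only for illustration.

```python
# Timings copied from the three runs above, in seconds: (real, user, sys).
runs = [
    (8.099, 6.667, 1.345),
    (7.999, 6.682, 1.315),
    (7.970, 6.702, 1.267),
]

for real, user, sys_time in runs:
    cpu = user + sys_time
    # A ratio near 1.0 means almost all wall-clock time was CPU time
    # on a single thread, with little time spent blocked on IO.
    print(f"real={real:.3f}s cpu={cpu:.3f}s cpu/real={cpu / real:.2f}")

mean_real = sum(r[0] for r in runs) / len(runs)
files = 65119  # file count quoted above
print(f"mean real: {mean_real:.2f}s, about {files / mean_real:.0f} files/s")
```

With the ratio this close to 1, the build looks CPU-bound on a single core, which is consistent with IO not being the bottleneck and points at parallelism as the main lever.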
IMO there is room to make it faster; the ideal goal is under 1s.
We can use tracing to find where most of the time goes and then optimize. The ideas I can think of so far:
- compute blake3/sha256 digests and lz4/zstd chunk compression concurrently; can we also write blob data concurrently?
- do not generate a tree structure for a single layer, to reduce the overhead of traversing nodes.
- disable rafs format and digest validation for bootstrap checking and parent bootstrap loading.
- use cached mode instead of direct mode to load the bootstrap during merge operations.
- improve tree.apply performance.
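To illustrate the first idea in the list above, here is a minimal sketch in Python of digesting and compressing chunks on a thread pool while a single writer appends results in order. This is not how nydus-image (which is written in Rust) is implemented: `hashlib.sha256` and `zlib` are standard-library stand-ins for blake3/sha256 and lz4/zstd, and `CHUNK_SIZE`, `process_chunk` and `build_blob` are hypothetical names.

```python
import hashlib
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1024 * 1024  # hypothetical chunk size, for illustration only


def process_chunk(chunk: bytes) -> tuple[str, bytes]:
    """Digest and compress one chunk (stand-ins for the real algorithms)."""
    digest = hashlib.sha256(chunk).hexdigest()
    compressed = zlib.compress(chunk, 1)
    return digest, compressed


def build_blob(data: bytes, workers: int = 4) -> tuple[list[str], bytes]:
    """Process chunks concurrently, then append them to the blob in order."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    blob = bytearray()
    digests = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so the single writer below
        # produces the same blob layout regardless of worker count.
        for digest, compressed in pool.map(process_chunk, chunks):
            digests.append(digest)
            blob += compressed
    return digests, bytes(blob)
```

Keeping one ordered writer sidesteps the question of concurrent blob writes: the CPU-heavy steps run in parallel, while the blob layout stays deterministic.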
@yawqi will help profile image conversion with nydus-image using flamegraph, many thanks!
@jiangliu what do you think?
This is a simple flamegraph I generated with flamegraph-rs using the following command. Other use cases still need to be tested.
flamegraph -o ./test.svg -- ./nydus-image create -B blobs-v6/wq.bootstrap -D blobs-v6 -v 6 linux-6.0
As discussed offline, nydus-image spends most of its time in sha256 compression, which is good and matches our expectations.
Many thanks for the efforts, @yawqi. Can you please also do another flamegraph run with a relatively huge parent bootstrap to see the typical bootstrap-loading cost?
flamegraph -o ./parent.svg -- ./nydus-image create -B blobs-v6/wq.bootstrap -D blobs-v6 -v 6 ../workplace/github.com/yawqi/image-service
flamegraph -o ./son.svg -- ./nydus-image create --parent-bootstrap blobs-v6/wq.bootstrap -B blobs-v6-son/wq.bootstrap -D blobs-v6 -v 6 ./linux-6.0
I am not sure whether I am doing this the right way. The first flamegraph is from building the parent bootstrap, and the second is from building a bootstrap whose parent is the previous bootstrap.
The nydus-image version is v2.1.1. The parent bootstrap is 2.2MB and the son bootstrap is 18MB.
It seems we can disable bootstrap/digest validation first, and then improve the speed of tree.apply.
I ran the same operations with a release build of master. Note that master uses zstd and sha256 by default, while v2.1.1 uses lz4 by default.
flamegraph -o ./parent-master.svg -- ./nydus-image-master create -B blobs-v6-master/wq.bootstrap -D blobs-v6-master -v 6 ../workplace/github.com/yawqi/image-service
flamegraph -o ./child-master.svg -- ./nydus-image-master create --parent-bootstrap blobs-v6-master/wq.bootstrap -B blobs-v6-master/wq-child.bootstrap -D blobs-v6-master -v 6 ./linux-6.0
Here are the results; the upper one is the parent, the lower one is the child:
The parent's source (./workplace/github.com/yawqi/image-service) is 4.2GB, and the child's source (linux-6.0) is 1.4GB.

lz4+blake3 is much faster than zstd+sha256 when creating a nydus image from the linux-6.0 repo.
zstd+blake3:

lz4+sha256:

lz4+blake3 is also much faster than zstd+sha256 when creating a nydus image from my nydus repo.
My not-so-accurate tests show that zstd+sha256 is about 3x slower than lz4+blake3.
zstd+blake3:
lz4+sha256:

@yawqi Thanks for the test, how about zstd+blake3?
The summary for your test @yawqi:
| compressor | digester | build time | blob size |
|---|---|---|---|
| zstd | sha256 | 3m40.511s | 1.08GB |
| zstd | blake3 | 1m59.593s | 1.08GB |
| lz4 | sha256 | 1m37.090s | 1.63GB |
| lz4 | blake3 | 1m1.971s | 1.63GB |
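As a small worked example, the relative cost of each combination can be computed from the table above. The figures are copied from the table; `to_seconds` is a hypothetical helper written for this sketch.

```python
# Build times and blob sizes (GB) copied from the summary table above.
results = {
    ("zstd", "sha256"): ("3m40.511s", 1.08),
    ("zstd", "blake3"): ("1m59.593s", 1.08),
    ("lz4", "sha256"): ("1m37.090s", 1.63),
    ("lz4", "blake3"): ("1m1.971s", 1.63),
}


def to_seconds(t: str) -> float:
    """Convert a `time`-style duration such as '3m40.511s' to seconds."""
    minutes, seconds = t.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


baseline = to_seconds(results[("lz4", "blake3")][0])
for (compressor, digester), (build_time, blob_gb) in results.items():
    secs = to_seconds(build_time)
    # Slowdown is relative to the fastest combination, lz4+blake3.
    print(f"{compressor}+{digester}: {secs:.1f}s, "
          f"{secs / baseline:.2f}x lz4+blake3, blob {blob_gb}GB")
```

On these numbers zstd+sha256 takes roughly 3.6x as long as lz4+blake3, while the lz4 blobs are about 1.5x larger (1.63GB vs 1.08GB), so the choice is a trade-off between build time and blob size.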
More benchmark results about nydus-image create based on the master branch:
