horaedb Should optimize the cargo build time

I've run cargo build --release -Ztimings and here is what I get:

Top 10

Unit	Total	Codegen	Features
libgit2-sys v0.13.2+1.4.2 build script (run)	435.1s
grpcio-sys v0.9.1+1.38.0 build script (run)	253.9s		bindgen, boringssl-src, openssl, secure, use-bindgen
librocksdb_sys v0.1.0 build script (run)	227.5s		default, portable
zstd-sys v2.0.1+zstd.1.5.2 build script (run)	168.7s		default, legacy, std, zdict_builder
datafusion v9.0.0	162.9s	156.1s (96%)	crypto_expressions, default, regex_expressions, unicode_expressions
datafusion-physical-expr v9.0.0	160.1s	152.0s (95%)	blake2, blake3, crypto_expressions, default, md-5, regex, regex_expressions, sha2, unicode-segmentation, unicode_expressions
parquet v15.0.0	126.0s	122.0s (97%)	arrow, base64, brotli, default, flate2, lz4, snap, zstd
arrow v15.0.0	124.4s	116.9s (94%)	comfy-table, csv, csv_crate, default, flatbuffers, ipc, prettyprint, rand, test_utils
libz-sys v1.1.8 build script (run)	92.4s		default, libc, static, stock-zlib
bzip2-sys v0.1.11+1.0.8 build script (run)	84.6s		static

Summary
Targets: ceresdb 0.1.0 (lib, bin "ceresdb-server")
Profile: release
Fresh units: 273
Dirty units: 459
Total units: 732
Max concurrency: 10 (jobs=6 ncpu=6)
Build start: 2022-07-26T07:56:01Z
Total time: 607.2s (10m 7.2s)
rustc: rustc 1.59.0-nightly (f1ce0e6a0 2022-01-05)
Host: x86_64-unknown-linux-gnu
Target: x86_64-unknown-linux-gnu
Max (global) rustc threads concurrency: 0

For the release build itself, I think we can do the following to speed it up:

Check whether we need libgit2 at all. One use case I know of is fetching the commit ID as version info. Maybe we can let CI provide this kind of info.
Conditionally compile RocksDB. RocksDB is used as one of the WAL implementations in CeresDB. In the future it should be an optional debugging dependency rather than a requirement.
Use the GitHub Actions cache. In theory, we can compile incrementally in most scenarios. I've configured the cache in CI, but it doesn't seem to be working. There is an issue on this topic: https://github.com/CeresDB/ceresdb/issues/5. I haven't dug into it.
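
To illustrate the first idea (letting CI provide the commit ID so libgit2 is not needed for version info), here is a minimal sketch of helpers a build script could use. The names `CI_COMMIT_ID` and `BUILD_COMMIT_ID` are assumptions made up for this example, not anything the project defines; only the `cargo:rustc-env` and `cargo:rerun-if-env-changed` directives are standard Cargo build-script output.

```rust
// Sketch for a build.rs: take the commit ID from an environment variable
// set by CI instead of reading the repository through libgit2.
// `CI_COMMIT_ID` and `BUILD_COMMIT_ID` are hypothetical names.

/// Returns the commit ID to embed, or "unknown" when CI did not provide one.
fn resolve_commit_id(ci_value: Option<&str>) -> String {
    match ci_value.map(str::trim) {
        Some(id) if !id.is_empty() => id.to_string(),
        _ => "unknown".to_string(),
    }
}

/// Builds the directive a build script prints so that the crate can read
/// the value at compile time with `env!("BUILD_COMMIT_ID")`.
fn rustc_env_directive(commit_id: &str) -> String {
    format!("cargo:rustc-env=BUILD_COMMIT_ID={}", commit_id)
}

// In build.rs, `main` would do roughly this:
//   println!("cargo:rerun-if-env-changed=CI_COMMIT_ID");
//   let id = resolve_commit_id(std::env::var("CI_COMMIT_ID").ok().as_deref());
//   println!("{}", rustc_env_directive(&id));
```

With something like this, a local build without the variable still compiles and simply reports "unknown" as its commit ID.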

Originally posted by @waynexia in https://github.com/CeresDB/ceresdb/issues/148#issuecomment-1195177300

Jul 26 '22 15:07 zwpaper

Additional notes: I think the third one (leverage the cache) is the most feasible and can bring a large improvement (in theory) without touching the code.

Jul 26 '22 16:07 waynexia

Conditionally compile RocksDB. RocksDB is used as one of the WAL implementations in CeresDB. In the future it should be an optional debugging dependency rather than a requirement.

I think it's time to do this. For testing purposes we have an in-memory WAL implementation, so RocksDB is no longer a requirement. I filed https://github.com/CeresDB/ceresdb/issues/206 for this.
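
A rough sketch of what feature-gating the RocksDB WAL could look like. The feature name `wal-rocksdb`, the `Wal` trait, and `MemoryWal` are all made up for illustration and are not the project's real types. It also assumes the RocksDB dependency would be marked `optional = true` in Cargo.toml and enabled by that feature, so a default build never runs the librocksdb_sys build script.

```rust
/// Minimal WAL interface for this sketch (not the project's real trait).
trait Wal {
    /// Appends an entry and returns its 1-based sequence number.
    fn append(&mut self, entry: Vec<u8>) -> u64;
    fn len(&self) -> usize;
}

/// In-memory WAL, always compiled; enough for tests.
#[derive(Default)]
struct MemoryWal {
    entries: Vec<Vec<u8>>,
}

impl Wal for MemoryWal {
    fn append(&mut self, entry: Vec<u8>) -> u64 {
        self.entries.push(entry);
        self.entries.len() as u64
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}

// Compiled only when the hypothetical `wal-rocksdb` feature is enabled.
// A real version would wrap the RocksDB bindings here.
#[cfg(feature = "wal-rocksdb")]
mod rocks {
    pub const NAME: &str = "rocksdb";
}

/// Lists the WAL backends that were compiled into this build.
fn available_wal_backends() -> Vec<&'static str> {
    #[allow(unused_mut)]
    let mut backends = vec!["memory"];
    #[cfg(feature = "wal-rocksdb")]
    backends.push(rocks::NAME);
    backends
}
```

Without the feature, the `rocks` module is skipped entirely, so only the in-memory backend is reported.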

Aug 18 '22 08:08 waynexia

We have replaced grpcio with tonic, so grpcio-sys no longer needs to be built. We will keep optimizing the build time later.

Mar 27 '23 02:03 Rachelint

I guess most of the actions that can be taken have been done.

Nov 03 '23 06:11 ShiKaiWi
