Added mem-dbg across all structs of the crate as an optional feature
Mem-dbg is a crate that computes the memory size of a struct. I have added its derives throughout the crate as an optional feature, so that this implementation can be easily compared with others.
Cheers!
@LucaCappelletti94 thanks for adding the optional mem-dbg feature.
We'd love to have this change in; however, it seems a few size calculation inconsistencies should be addressed first.
For example, if I apply the following diff:
diff --git a/src/estimator.rs b/src/estimator.rs
index 0c7c7fb..cca6bc0 100644
--- a/src/estimator.rs
+++ b/src/estimator.rs
@@ -181,7 +181,9 @@ where
H: Hasher + Default,
{
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
- write!(f, "{:?}", self.representation())
+ use mem_dbg::{MemSize, SizeFlags};
+
+ write!(f, "{:?} mem_dbg = {:?}", self.representation(), self.mem_size(SizeFlags::default()))
}
}
and then run tests with:
cargo test --features mem_dbg test_estimator_p12_w6
I can see quite a few unexplained discrepancies between mem_dbg and the actual size:
---- estimator::tests::test_estimator_p12_w6::_0_expects_representation_small_estimate_0_size_8_avg_err_0_0000_ stdout ----
thread 'estimator::tests::test_estimator_p12_w6::_0_expects_representation_small_estimate_0_size_8_avg_err_0_0000_' panicked at src/estimator.rs:237:5:
assertion `left == right` failed
left: "representation: Small(estimate: 0, size: 8), avg_err: 0.0000"
right: "representation: Small(estimate: 0, size: 8) mem_dbg = 40, avg_err: 0.0000"
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
---- estimator::tests::test_estimator_p12_w6::_128_expects_representation_array_estimate_128_size_520_avg_err_0_0000_ stdout ----
thread 'estimator::tests::test_estimator_p12_w6::_128_expects_representation_array_estimate_128_size_520_avg_err_0_0000_' panicked at src/estimator.rs:237:5:
assertion `left == right` failed
left: "representation: Array(estimate: 128, size: 520), avg_err: 0.0000"
right: "representation: Array(estimate: 128, size: 520) mem_dbg = 40, avg_err: 0.0000"
---- estimator::tests::test_estimator_p12_w6::_1_expects_representation_small_estimate_1_size_8_avg_err_0_0000_ stdout ----
thread 'estimator::tests::test_estimator_p12_w6::_1_expects_representation_small_estimate_1_size_8_avg_err_0_0000_' panicked at src/estimator.rs:237:5:
assertion `left == right` failed
left: "representation: Small(estimate: 1, size: 8), avg_err: 0.0000"
right: "representation: Small(estimate: 1, size: 8) mem_dbg = 40, avg_err: 0.0000"
---- estimator::tests::test_estimator_p12_w6::_16_expects_representation_array_estimate_16_size_72_avg_err_0_0000_ stdout ----
thread 'estimator::tests::test_estimator_p12_w6::_16_expects_representation_array_estimate_16_size_72_avg_err_0_0000_' panicked at src/estimator.rs:237:5:
assertion `left == right` failed
left: "representation: Array(estimate: 16, size: 72), avg_err: 0.0000"
right: "representation: Array(estimate: 16, size: 72) mem_dbg = 40, avg_err: 0.0000"
---- estimator::tests::test_estimator_p12_w6::_1024_expects_representation_hll_estimate_1012_size_3092_avg_err_0_0130_ stdout ----
thread 'estimator::tests::test_estimator_p12_w6::_1024_expects_representation_hll_estimate_1012_size_3092_avg_err_0_0130_' panicked at src/estimator.rs:237:5:
assertion `left == right` failed
left: "representation: Hll(estimate: 1012, size: 3092), avg_err: 0.0130"
right: "representation: Hll(estimate: 1012, size: 3092) mem_dbg = 40, avg_err: 0.0130"
---- estimator::tests::test_estimator_p12_w6::_129_expects_representation_hll_estimate_130_size_3092_avg_err_0_0001_ stdout ----
thread 'estimator::tests::test_estimator_p12_w6::_129_expects_representation_hll_estimate_130_size_3092_avg_err_0_0001_' panicked at src/estimator.rs:237:5:
assertion `left == right` failed
left: "representation: Hll(estimate: 130, size: 3092), avg_err: 0.0001"
right: "representation: Hll(estimate: 130, size: 3092) mem_dbg = 40, avg_err: 0.0001"
---- estimator::tests::test_estimator_p12_w6::_256_expects_representation_hll_estimate_254_size_3092_avg_err_0_0029_ stdout ----
thread 'estimator::tests::test_estimator_p12_w6::_256_expects_representation_hll_estimate_254_size_3092_avg_err_0_0029_' panicked at src/estimator.rs:237:5:
assertion `left == right` failed
left: "representation: Hll(estimate: 254, size: 3092), avg_err: 0.0029"
right: "representation: Hll(estimate: 254, size: 3092) mem_dbg = 40, avg_err: 0.0029"
---- estimator::tests::test_estimator_p12_w6::_2_expects_representation_small_estimate_2_size_8_avg_err_0_0000_ stdout ----
thread 'estimator::tests::test_estimator_p12_w6::_2_expects_representation_small_estimate_2_size_8_avg_err_0_0000_' panicked at src/estimator.rs:237:5:
assertion `left == right` failed
left: "representation: Small(estimate: 2, size: 8), avg_err: 0.0000"
right: "representation: Small(estimate: 2, size: 8) mem_dbg = 40, avg_err: 0.0000"
---- estimator::tests::test_estimator_p12_w6::_3_expects_representation_array_estimate_3_size_24_avg_err_0_0000_ stdout ----
thread 'estimator::tests::test_estimator_p12_w6::_3_expects_representation_array_estimate_3_size_24_avg_err_0_0000_' panicked at src/estimator.rs:237:5:
assertion `left == right` failed
left: "representation: Array(estimate: 3, size: 24), avg_err: 0.0000"
right: "representation: Array(estimate: 3, size: 24) mem_dbg = 40, avg_err: 0.0000"
---- estimator::tests::test_estimator_p12_w6::_32_expects_representation_array_estimate_32_size_136_avg_err_0_0000_ stdout ----
thread 'estimator::tests::test_estimator_p12_w6::_32_expects_representation_array_estimate_32_size_136_avg_err_0_0000_' panicked at src/estimator.rs:237:5:
assertion `left == right` failed
left: "representation: Array(estimate: 32, size: 136), avg_err: 0.0000"
right: "representation: Array(estimate: 32, size: 136) mem_dbg = 40, avg_err: 0.0000"
---- estimator::tests::test_estimator_p12_w6::_4_expects_representation_array_estimate_4_size_24_avg_err_0_0000_ stdout ----
thread 'estimator::tests::test_estimator_p12_w6::_4_expects_representation_array_estimate_4_size_24_avg_err_0_0000_' panicked at src/estimator.rs:237:5:
assertion `left == right` failed
left: "representation: Array(estimate: 4, size: 24), avg_err: 0.0000"
right: "representation: Array(estimate: 4, size: 24) mem_dbg = 40, avg_err: 0.0000"
---- estimator::tests::test_estimator_p12_w6::_8_expects_representation_array_estimate_8_size_40_avg_err_0_0000_ stdout ----
thread 'estimator::tests::test_estimator_p12_w6::_8_expects_representation_array_estimate_8_size_40_avg_err_0_0000_' panicked at src/estimator.rs:237:5:
assertion `left == right` failed
left: "representation: Array(estimate: 8, size: 40), avg_err: 0.0000"
right: "representation: Array(estimate: 8, size: 40) mem_dbg = 40, avg_err: 0.0000"
---- estimator::tests::test_estimator_p12_w6::_64_expects_representation_array_estimate_64_size_264_avg_err_0_0000_ stdout ----
thread 'estimator::tests::test_estimator_p12_w6::_64_expects_representation_array_estimate_64_size_264_avg_err_0_0000_' panicked at src/estimator.rs:237:5:
assertion `left == right` failed
left: "representation: Array(estimate: 64, size: 264), avg_err: 0.0000"
right: "representation: Array(estimate: 64, size: 264) mem_dbg = 40, avg_err: 0.0000"
---- estimator::tests::test_estimator_p12_w6::_512_expects_representation_hll_estimate_498_size_3092_avg_err_0_0068_ stdout ----
thread 'estimator::tests::test_estimator_p12_w6::_512_expects_representation_hll_estimate_498_size_3092_avg_err_0_0068_' panicked at src/estimator.rs:237:5:
assertion `left == right` failed
left: "representation: Hll(estimate: 498, size: 3092), avg_err: 0.0068"
right: "representation: Hll(estimate: 498, size: 3092) mem_dbg = 40, avg_err: 0.0068"
---- estimator::tests::test_estimator_p12_w6::_4096_expects_representation_hll_estimate_4105_size_3092_avg_err_0_0089_ stdout ----
thread 'estimator::tests::test_estimator_p12_w6::_4096_expects_representation_hll_estimate_4105_size_3092_avg_err_0_0089_' panicked at src/estimator.rs:237:5:
assertion `left == right` failed
left: "representation: Hll(estimate: 4105, size: 3092), avg_err: 0.0089"
right: "representation: Hll(estimate: 4105, size: 3092) mem_dbg = 40, avg_err: 0.0089"
---- estimator::tests::test_estimator_p12_w6::_10_000_expects_representation_hll_estimate_10068_size_3092_avg_err_0_0087_ stdout ----
thread 'estimator::tests::test_estimator_p12_w6::_10_000_expects_representation_hll_estimate_10068_size_3092_avg_err_0_0087_' panicked at src/estimator.rs:237:5:
assertion `left == right` failed
left: "representation: Hll(estimate: 10068, size: 3092), avg_err: 0.0087"
right: "representation: Hll(estimate: 10068, size: 3092) mem_dbg = 40, avg_err: 0.0087"
---- estimator::tests::test_estimator_p12_w6::_100_000_expects_representation_hll_estimate_95628_size_3092_avg_err_0_0182_ stdout ----
thread 'estimator::tests::test_estimator_p12_w6::_100_000_expects_representation_hll_estimate_95628_size_3092_avg_err_0_0182_' panicked at src/estimator.rs:237:5:
assertion `left == right` failed
left: "representation: Hll(estimate: 95628, size: 3092), avg_err: 0.0182"
right: "representation: Hll(estimate: 95628, size: 3092) mem_dbg = 40, avg_err: 0.0182"
Perhaps the way the cardinality-estimator crate computes mem_size should be adjusted, or something inside the mem-dbg crate should be changed.
In most cases the derive is enough to cover everything, and honestly, when I opened the PR I had done just that, thinking it would suffice. Afterwards, as I started benchmarking, I realized that because of the pointer trick used to handle the dynamic size, the derive reported the struct as only 8 bytes or so. I therefore quickly wrote the trait implementations by hand, but I only covered the inner representation. I will try to fix it shortly.
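To illustrate why a purely field-based size computation under-reports in this situation, here is a minimal std-only sketch. It does not use the mem-dbg API, and it is not the crate's actual representation: `Sketch`, `DeepSize` and `shallow_size` are hypothetical names made up for the example, and a `Box<Vec<u32>>` stands in for the pointer trick.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for an estimator that keeps its dynamic
/// storage behind a single pointer-sized field.
struct Sketch {
    // Thin pointer: the heap data it refers to is invisible to `size_of`.
    registers: Box<Vec<u32>>,
}

/// Hypothetical trait (not part of mem-dbg) for a size that follows pointers.
trait DeepSize {
    fn deep_size(&self) -> usize;
}

impl DeepSize for Sketch {
    fn deep_size(&self) -> usize {
        // The struct itself, plus the boxed Vec header, plus the Vec's buffer.
        size_of::<Self>()
            + size_of::<Vec<u32>>()
            + self.registers.capacity() * size_of::<u32>()
    }
}

/// What a field-by-field computation sees without following pointers.
fn shallow_size<T>(_value: &T) -> usize {
    size_of::<T>()
}

fn main() {
    let sketch = Sketch {
        registers: Box::new(vec![0u32; 128]),
    };
    // The shallow size is one pointer; the deep size also counts the heap data.
    println!(
        "shallow = {} bytes, deep = {} bytes",
        shallow_size(&sketch),
        sketch.deep_size()
    );
}
```

The shallow figure stays constant no matter how many elements are stored, which is the same symptom as a constant mem_dbg value across the Small, Array and Hll representations. A hand-written size implementation has to follow the pointer explicitly, as `deep_size` does here.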
Small update: I fixed the size estimation errors on your crate's side, but I also identified an error on the mem-dbg side. Working on that now.
Now, if you rerun the same test command as above, you will find that all estimates match.