externalsortinginjava

StringSizeEstimator

Open j-joker opened this issue 8 years ago • 3 comments

Do you consider padding when you calculate the size of a string? The size of an object is a multiple of 8 bytes.

j-joker avatar Oct 04 '16 08:10 j-joker

@j-joker

This seems to be a valid issue. I think we should round up to the word size (64 bits on a 64-bit machine and 32 bits on a 32-bit machine).
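A minimal sketch of the rounding step, assuming an 8-byte alignment (the usual default on a 64-bit HotSpot JVM, but not guaranteed everywhere). The class name `PaddedSizeSketch` and the method `roundUp` are hypothetical and not part of the library; the bit trick only works when the alignment is a power of two.

```java
final class PaddedSizeSketch {
    // Assumption: objects are padded to 8-byte boundaries.
    // A 32-bit JVM or a non-default alignment setting would need another value.
    private static final int ALIGNMENT = 8;

    /** Rounds a raw size in bytes up to the next multiple of ALIGNMENT (a power of two). */
    static long roundUp(long size) {
        // Adding ALIGNMENT - 1 and clearing the low bits rounds up without a branch.
        return (size + ALIGNMENT - 1) & ~(long) (ALIGNMENT - 1);
    }

    public static void main(String[] args) {
        // A raw estimate of 41 bytes occupies 48 bytes once padded.
        System.out.println(roundUp(41));
        // A size that is already aligned is left unchanged.
        System.out.println(roundUp(40));
    }
}
```

The mask costs one addition and one AND, so it should add very little to a function that is called for every string.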

Care to issue a PR? This way you'd get credit for the (small) change.

lemire avatar Oct 06 '16 02:10 lemire

(I am repeating this comment from the PR to make sure it does not get lost:)

Please merge the following commit https://github.com/lemire/externalsortinginjava/commit/a5886f7e94b930b0cea260d26b41d412a28cc81c

and run mvn test before and after your change. It will measure in a rough but sufficiently accurate manner the running time of the string estimation.

We want to make sure that we do not degrade the performance since this function is called repeatedly, possibly millions of times. It also does something that is relatively unimportant (produce a memory usage estimation) so we do not ever want it to have an impact on performance.

Here is what it might look like...

$ mvn test
(...)
Running com.google.code.externalsorting.ExternalSortTest
#ignore = 67412000
[performance] String size estimator uses 1.116796875 ns per string
#ignore = 67412000
[performance] String size estimator uses 1.120703125 ns per string
#ignore = 67412000
[performance] String size estimator uses 1.1216796875 ns per string
#ignore = 67412000
[performance] String size estimator uses 1.116796875 ns per string
#ignore = 67412000
[performance] String size estimator uses 1.1138671875 ns per string
#ignore = 67412000
[performance] String size estimator uses 1.116796875 ns per string
#ignore = 67412000
[performance] String size estimator uses 1.1197265625 ns per string
#ignore = 67412000
[performance] String size estimator uses 1.116796875 ns per string
#ignore = 67412000
[performance] String size estimator uses 1.112890625 ns per string
#ignore = 67412000
[performance] String size estimator uses 1.116796875 ns per string
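
For readers who want to reproduce this kind of measurement outside the test suite, here is a rough sketch of a timing loop. It is not the test from the commit above: the class `EstimatorTimingSketch`, the method `nanosPerString`, and the stand-in estimator lambda are all hypothetical. The accumulated sum is printed so that the JIT cannot discard the calls as dead code.

```java
import java.util.function.ToLongFunction;

final class EstimatorTimingSketch {
    /** Returns the average time in nanoseconds per estimator call. */
    static double nanosPerString(ToLongFunction<String> estimator, String[] inputs, int rounds) {
        long sink = 0; // accumulates results so the calls have an observable effect
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (String s : inputs) {
                sink += estimator.applyAsLong(s);
            }
        }
        long elapsed = System.nanoTime() - start;
        // Printing the sum keeps the loop from being optimized away.
        System.out.println("#ignore = " + sink);
        return (double) elapsed / ((double) rounds * inputs.length);
    }

    public static void main(String[] args) {
        String[] inputs = new String[1024];
        for (int i = 0; i < inputs.length; i++) {
            inputs[i] = "line-" + i;
        }
        // Hypothetical stand-in: a fixed overhead plus two bytes per character.
        // Swap in the real estimator to compare before and after a change.
        ToLongFunction<String> estimator = s -> 40L + 2L * s.length();
        // Repeat the measurement; the first runs include JIT warm-up.
        for (int i = 0; i < 5; i++) {
            double ns = nanosPerString(estimator, inputs, 1000);
            System.out.println("[performance] estimator uses " + ns + " ns per string");
        }
    }
}
```

A loop like this is only a rough indicator; a harness such as JMH would give more trustworthy numbers if the difference turns out to be small.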

lemire avatar Oct 06 '16 14:10 lemire

This remains unresolved: we may underestimate the memory usage. Some analysis is needed.

lemire avatar Oct 31 '17 02:10 lemire