0 Initalized packed array buffer
I have 0 initialized the PackedArray buffer. I was valrgrind'ing around in my application and was getting Conditional jump or move depends on uninitialised value(s) when using a 1- bit packed array buffer and was getting incorrect results. This fix resolves the issue. I reran the tests and everything is passing. This is a fix for #6
Bug can be reproduced in this commit with this input aliquot/_pom_yang_rewrite/ make && valgrind ./main --bound=$((10**6)) --seg_len=$((5 * $((10**3)))) --writebuf_len=1000000 --preimage_count_bits=1 --num_threads=12
Hello @guinn8 👋
I likely won't accept this PR. At that point, it seems that all it does is working around a problem in your own code.
If you really want to 0 initialize the buffer, imho a proper way is to redefined PACKEDARRAY_MALLOC to use calloc() instead of malloc().
As far as Valgrind goes, I get no issue when running tests
$ make -C _gnu-make/
make: Entering directory '/data/Projects/PackedArray/_gnu-make'
mkdir -p /data/Projects/PackedArray/bin/linux-x86_64
cc -o /data/Projects/PackedArray/bin/linux-x86_64/PackedArraySelfTest -DPACKEDARRAY_SELF_TEST -std=c99 -pedantic -O2 -g /data/Projects/PackedArray/PackedArray.c
mkdir -p /data/Projects/PackedArray/bin/linux-x86_64
cc -o /data/Projects/PackedArray/bin/linux-x86_64/PackedArraySelfBench -DPACKEDARRAY_SELF_BENCH -DNDEBUG -std=c99 -pedantic -O2 -g /data/Projects/PackedArray/PackedArray.c
mkdir -p /data/Projects/PackedArray/bin/linux-x86_64
cc -o /data/Projects/PackedArray/bin/linux-x86_64/PackedArraySIMDSelfTest -DPACKEDARRAY_SELF_TEST -std=c99 -pedantic -O2 -g /data/Projects/PackedArray/PackedArraySIMD.c
mkdir -p /data/Projects/PackedArray/bin/linux-x86_64
cc -o /data/Projects/PackedArray/bin/linux-x86_64/PackedArraySIMDSelfBench -DPACKEDARRAY_SELF_BENCH -DNDEBUG -std=c99 -pedantic -O2 -g /data/Projects/PackedArray/PackedArraySIMD.c
make: Leaving directory '/data/Projects/PackedArray/_gnu-make'
$ valgrind ./bin/linux-x86_64/PackedArraySelfTest
==241616== Memcheck, a memory error detector
==241616== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==241616== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==241616== Command: ./bin/linux-x86_64/PackedArraySelfTest
==241616==
-- PackedArray self test -------------------------------------------------------
sizeof(PackedArray) = 16
1 by 1 packing / unpacking:
1 bits per item -- success.
2 bits per item -- success.
3 bits per item -- success.
4 bits per item -- success.
5 bits per item -- success.
6 bits per item -- success.
7 bits per item -- success.
8 bits per item -- success.
9 bits per item -- success.
10 bits per item -- success.
11 bits per item -- success.
12 bits per item -- success.
13 bits per item -- success.
14 bits per item -- success.
15 bits per item -- success.
16 bits per item -- success.
17 bits per item -- success.
18 bits per item -- success.
19 bits per item -- success.
20 bits per item -- success.
21 bits per item -- success.
22 bits per item -- success.
23 bits per item -- success.
24 bits per item -- success.
25 bits per item -- success.
26 bits per item -- success.
27 bits per item -- success.
28 bits per item -- success.
29 bits per item -- success.
30 bits per item -- success.
31 bits per item -- success.
32 bits per item -- success.
bulk packing / unpacking:
1 bits per item -- success.
2 bits per item -- success.
3 bits per item -- success.
4 bits per item -- success.
5 bits per item -- success.
6 bits per item -- success.
7 bits per item -- success.
8 bits per item -- success.
9 bits per item -- success.
10 bits per item -- success.
11 bits per item -- success.
12 bits per item -- success.
13 bits per item -- success.
14 bits per item -- success.
15 bits per item -- success.
16 bits per item -- success.
17 bits per item -- success.
18 bits per item -- success.
19 bits per item -- success.
20 bits per item -- success.
21 bits per item -- success.
22 bits per item -- success.
23 bits per item -- success.
24 bits per item -- success.
25 bits per item -- success.
26 bits per item -- success.
27 bits per item -- success.
28 bits per item -- success.
29 bits per item -- success.
30 bits per item -- success.
31 bits per item -- success.
32 bits per item -- success.
==241616==
==241616== HEAP SUMMARY:
==241616== in use at exit: 0 bytes in 0 blocks
==241616== total heap usage: 81,921 allocs, 81,921 frees, 73,788,928 bytes allocated
==241616==
==241616== All heap blocks were freed -- no leaks are possible
==241616==
==241616== For lists of detected and suppressed errors, rerun with: -s
==241616== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Fair enough to no accept the PR as is. Redefining PACKEDARRAY_MALLOC to use calloc is not entirely straight forward as you would need to handle the additional argument, that's why I went for memset. I suppose the issue is that I assumed PackedArray would be zero initialized as a complex data structure. Would you accept a PR that makes the readme and other docs very clear that memory is left uninitialized?
You are correct about it being a flaw in my application, my code assumed zero initialization. Looked like a bug from my perspective as the error was only observed on some inputs. Super cool library, reduced the memory consumption of my application by a factor of 8 (from 480gb to 60gb)!
Redefining
PACKEDARRAY_MALLOCto usecallocis not entirely straight forward as you would need to handle the additional argument, that's why I went formemset
You would do
#define PACKEDARRAY_MALLOC(size) calloc(1, size)
Also the benefit of calloc() is that can leverages the kernel filling pages with 0 depending on the OS. Which means it has the potentiel of being faster than malloc() then memset()