
Valgrind does not understand musl's malloc?

Open DavidEGrayson opened this issue 7 years ago • 7 comments

When I run valgrind on a program compiled for i686 Linux with Nixcrpkgs, I get lots of errors that look like this:

==12155== Conditional jump or move depends on uninitialised value(s)
==12155==    at 0x8157EB4: __malloc0 (in /home/david/tic/buildn/ticcmd)
==12155==    by 0x807DE9C: udevw_create_context (in /home/david/tic/buildn/ticcmd)
==12155==    by 0x807CCE5: libusbp_list_connected_devices (in /home/david/tic/buildn/ticcmd)
==12155==    by 0x805417A: tic_list_connected_devices (tic_device.c:33)
==12155==    by 0x804D716: tic::list_connected_devices() (tic.hpp:562)
==12155==    by 0x804E0A9: device_selector::list_devices() (device_selector.h:23)
==12155==    by 0x804A1EB: print_list(device_selector&) (cli.cpp:456)
==12155==    by 0x804B0F0: run(arguments const&) (cli.cpp:608)
==12155==    by 0x804B874: main (cli.cpp:749)
==12155==  Uninitialised value was created
==12155==    at 0x8156669: ??? (in /home/david/tic/buildn/ticcmd)

The command I am using is:

valgrind --track-origins=yes --log-fd=1 ./ticcmd --list | less

I think these errors are coming from musl and I should try compiling it with debug symbols enabled to get more information.

DavidEGrayson, Sep 01 '17 22:09

It seems like Valgrind doesn't really understand musl's malloc. It fails to detect the memory leak in a trivial program.

$ cat test.c
#include <stdlib.h>

int main()
{
  malloc(20);
  return 0;
}

$ ./result/bin/i686-linux-musleabi-gcc ./test.c -g -o test

$ valgrind ./test 
==12394== Memcheck, a memory error detector
==12394== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==12394== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==12394== Command: ./test
==12394== 
==12394== 
==12394== HEAP SUMMARY:
==12394==     in use at exit: 0 bytes in 0 blocks
==12394==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==12394== 
==12394== All heap blocks were freed -- no leaks are possible
==12394== 
==12394== For counts of detected and suppressed errors, rerun with: -v
==12394== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

I think the root cause might be that Valgrind simply doesn't understand musl's malloc: it raises a bunch of errors because it thinks malloc is doing invalid things, and it does not recognize the memory allocated by that malloc as leaked at the end of the program.

DavidEGrayson, Sep 01 '17 22:09

Interesting. Don't have a solution but some thoughts and tidbits that might be useful:

  1. The malloc(20) testcase reports the leak for me when linking against musl dynamically, but not statically, so if you aren't linking dynamically and that's an option, you might try it. Related-ish mailing list thread.

  2. While valgrind detects the leak when using dynamically linked musl, it complains about invalid free/delete/delete[]/realloc. Probably fixable with the right suppression bits (as are used for glibc, as I understand it).

  3. When using Valgrind tools like Massif, the option --pages-as-heap=yes gets it working on static binaries.

dtzWill, Sep 02 '17 13:09

Hey, sorry, I haven't gotten around to looking into these suggestions. It's nice to have someone of your caliber helping out though!

DavidEGrayson, Oct 27 '17 20:10

I will have a look at compiling statically and then running Valgrind. I am experiencing the same issue when I run make check on https://github.com/ElementsProject/lightning under Alpine Linux (a musl-based distro).

jsarenik, Feb 27 '18 19:02

Hi @jsarenik. In case it wasn't clear, I was compiling everything statically and encountering the error. From what Will is saying, it sounds like Valgrind might work better if musl is linked dynamically instead, but I have not tried that.

DavidEGrayson, Feb 27 '18 20:02

@DavidEGrayson - thank you. Yes, I misunderstood this bit.

jsarenik, Feb 28 '18 11:02

> It seems like Valgrind doesn't really understand musl's malloc. It fails to detect the memory leak in a trivial program.
>
> $ cat test.c
> #include <stdlib.h>
>
> int main()
> {
>   malloc(20);
>   return 0;
> }
>
> $ ./result/bin/i686-linux-musleabi-gcc ./test.c -g -o test

I think it is more likely that your C compiler is optimizing out the unused call to malloc(). (I see the same behavior with Valgrind on a normal Linux/glibc computer.) What happens if you add -O0 to the arguments for gcc?

calebsander, Sep 13 '20 22:09