valgrind icon indicating copy to clipboard operation
valgrind copied to clipboard

Merging PMDebugger into Pmemcheck

Open dibang2008 opened this issue 4 years ago • 11 comments

This change is Reviewable

dibang2008 avatar May 11 '21 09:05 dibang2008

@krzycz this is context of https://dl.acm.org/doi/abs/10.1145/3445814.3446744?sid=SCITRUS @dibang2008 has agreed to upstream those changes back. But yes, I agree with your points. But instead of squashing the commits, I would recommend simply improving the commit message to accurately reflect the why and how of the change.

pbalcer avatar May 11 '21 11:05 pbalcer

I have done following things according to your comments:

  1. format my codes according to existing codes;
  2. add a document (named PMDebuggerToPmemcheck.md) to simply illustrate what has changed and why;
  3. move is_delete to the end of the structure;
  4. change 100000000UL to 1000UL to save memory.

For unit tests, I have used the existing tests in Pmemcheck to verify my code (perl tests/vg_regtest pmemcheck) and I am not sure that is suitable for you. If you want more comprehensive unit tests, please give me more details (I am not familiar with this area).

Thanks

dibang2008 avatar May 12 '21 08:05 dibang2008

OK, I will try to solve these problems next week, because I have to prepare my dissertation oral defense this week

dibang2008 avatar May 19 '21 02:05 dibang2008

No worries, take your time. Good luck with your dissertation!

pbalcer avatar May 19 '21 06:05 pbalcer

Thanks for cleaning up this code. Looks much nicer now, but there are still some stylistic changes that make it a bit hard to read. I've pointed out a few in my comments.

I'm still a bit worried that this change is going to make pmemcheck a bit harder to maintain - so I'll try to think of a way to mitigate that.

pbalcer avatar Jun 01 '21 13:06 pbalcer

I have fixed these issues. Thanks for your valuable suggestions!

dibang2008 avatar Jun 05 '21 02:06 dibang2008

Yes, your understanding is correct and we can make our structures more obvious. But there is one problem: a dynamic structure (such as Oset in Valgrind) may introduce extra overheads because a dynamic structure requires reallocations, deletions, and rebalancing. These overheads can be very large as the number of instructions increase.

We can try a dynamic structure but we need to find a more lightweight structure supported by Valgrind (The Oset structure is not a good choice). Any suggestion? Thanks.

dibang2008 avatar Jun 21 '21 07:06 dibang2008

Well, you are already using a reallocating array - so my suggestion would be to wrap it in a small abstraction called "Ovec" or something like that. EDIT: This already exists in valgrind and is called XArray.

Btw, we've run regression testing over PMDK using valgrind from this patch and many of our tests have failed with this error:

[line 55881] obj_basic_integration/TEST1 pmemcheck1.log valgrind: m_mallocfree.c:305 (get_bszB_as_is): Assertion 'bszB_lo == bszB_hi' failed.
[line 55882] obj_basic_integration/TEST1 pmemcheck1.log valgrind: Heap block lo/hi size mismatch: lo = 48032, hi = 8.
[line 55883] obj_basic_integration/TEST1 pmemcheck1.log This is probably caused by your program erroneously writing past the
[line 55884] obj_basic_integration/TEST1 pmemcheck1.log end of a heap block and corrupting heap metadata.  If you fix any
[line 55885] obj_basic_integration/TEST1 pmemcheck1.log invalid writes reported by Memcheck, this assertion failure will
[line 55886] obj_basic_integration/TEST1 pmemcheck1.log probably go away.  Please try that before reporting this as a bug.
[line 55887] obj_basic_integration/TEST1 pmemcheck1.log
[line 55888] obj_basic_integration/TEST1 pmemcheck1.log
[line 55889] obj_basic_integration/TEST1 pmemcheck1.log host stacktrace:
[line 55890] obj_basic_integration/TEST1 pmemcheck1.log ==1747976==    at 0x58008F28: show_sched_status_wrk (in /usr/local/lib/valgrind/pmemcheck-amd64-linux)
[line 55891] obj_basic_integration/TEST1 pmemcheck1.log ==1747976==    by 0x58009037: report_and_quit (in /usr/local/lib/valgrind/pmemcheck-amd64-linux)
[line 55892] obj_basic_integration/TEST1 pmemcheck1.log ==1747976==    by 0x580091BE: vgPlain_assert_fail (in /usr/local/lib/valgrind/pmemcheck-amd64-linux)
[line 55893] obj_basic_integration/TEST1 pmemcheck1.log ==1747976==    by 0x5800FA01: blockSane (in /usr/local/lib/valgrind/pmemcheck-amd64-linux)
[line 55894] obj_basic_integration/TEST1 pmemcheck1.log ==1747976==    by 0x580136A6: vgPlain_arena_realloc (in /usr/local/lib/valgrind/pmemcheck-amd64-linux)
[line 55895] obj_basic_integration/TEST1 pmemcheck1.log ==1747976==    by 0x580018C3: pInfo_AllocNodeInArray (in /usr/local/lib/valgrind/pmemcheck-amd64-linux)
[line 55896] obj_basic_integration/TEST1 pmemcheck1.log ==1747976==    by 0x58002954: trace_pmem_store (in /usr/local/lib/valgrind/pmemcheck-amd64-linux)
[line 55897] obj_basic_integration/TEST1 pmemcheck1.log ==1747976==    by 0x1002A1D107: ???
[line 55898] obj_basic_integration/TEST1 pmemcheck1.log ==1747976==    by 0x100288DF2F: ???
[line 55899] obj_basic_integration/TEST1 pmemcheck1.log ==1747976==    by 0xB7DA: ???

Can you reproduce this issue on your end?

pbalcer avatar Jun 29 '21 12:06 pbalcer

Can you send me the regression testing? It can help me reproduce this bug. I am not sure if the regression testing can be open-source. If not, can you give some hints of input data? I will try to reproduce this bug. Thanks :).

dibang2008 avatar Jul 02 '21 01:07 dibang2008

We've simply ran the entire PMDK test suite under this version of pmemcheck. The specific output example was from obj_basic_integration TEST1. You can run it by going into the directory of that test (src/test/obj_basic_integration) and then ./TESTS.py -u 1.

pbalcer avatar Jul 02 '21 11:07 pbalcer

I also have a bug for this test but it is not the same as yours. It is as follow:

obj_basic_integration/TEST1: SETUP	(medium/pmem/debug/pmemcheck)

Error 1
Last 0 lines of /home/aim/bangdi/pmdk/src/test/obj_basic_integration/err1.log below (whole file has 0 lines):
Last 2 lines of /home/aim/bangdi/pmdk/src/test/obj_basic_integration/out1.log below (whole file has 2 lines):
obj_basic_integration/TEST1: START: obj_basic_integration
 /home/aim/bangdi/pmdk/src/test/obj_basic_integration/obj_basic_integration /mnt/dbpmemfs/obj_basic_integration_1/testfile1
Last 2 lines of /home/aim/bangdi/pmdk/src/test/obj_basic_integration/trace1.log below (whole file has 2 lines):
{obj_basic_integration.c:654 main} obj_basic_integration/TEST1: START: obj_basic_integration
 /home/aim/bangdi/pmdk/src/test/obj_basic_integration/obj_basic_integration /mnt/dbpmemfs/obj_basic_integration_1/testfile1
Last 30 lines of /home/aim/bangdi/pmdk/src/test/obj_basic_integration/pmemcheck1.log below (whole file has 54 lines):
==6698==    by 0x1002A8DF17: ???
==6698==    by 0x1002A8DF2F: ???
==6698==    by 0x1002A8DF3F: ???

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 6698)
==6698==    at 0x4C81A94: ulog_construct (ulog.c:178)
==6698==    by 0x4C5DA64: lane_init_data (lane.c:345)
==6698==    by 0x4C6A7CE: obj_descr_create (obj.c:905)
==6698==    by 0x4C6C3BD: pmemobj_createU (obj.c:1411)
==6698==    by 0x4C6C57F: pmemobj_create (obj.c:1452)
==6698==    by 0x10FC27: main (obj_basic_integration.c:666)
client stack range: [0x1FFEFFB000 0x1FFF000FFF] client SP: 0x1FFEFFFA30
valgrind stack range: [0x100298E000 0x1002A8DFFF] top usage: 7088 of 1048576


Note: see also the FAQ in the source distribution.
It contains workarounds to several common problems.
In particular, if Valgrind aborted or crashed after
identifying problems in your program, there's a good chance
that fixing those problems will prevent Valgrind aborting or
crashing, especially if it happened in m_mallocfree.c.

If that doesn't help, please report this bug to: www.valgrind.org

In the bug report, send all the above text, the valgrind
version, and what OS and version you are using.  Thanks.

This bug has been fixed by aligning struct arr_md in pmc_include.h. You can try my new commit. If bugs still exist, I think your testconfig.py may help us reproduce your bugs.

dibang2008 avatar Jul 06 '21 03:07 dibang2008