
Add valgrind as a PR-checker

Open kparichay opened this issue 6 years ago • 10 comments

I would like opinions on adding valgrind as a PR-checker. Valgrind is known to produce false positives, but GLib maintains a suppression file that silences those arising from its own code. Additional suppression files can also be taken as input from the user.

If you think it's a good idea, I will open a PR.
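
As a rough illustration of the proposal above, the sketch below assembles a valgrind command line that always passes GLib's suppression file plus any user-supplied ones. The function name `build_valgrind_cmd` is hypothetical, and the path in `GLIB_SUPP` is an assumption about where GLib's development package installs the file on many Linux distributions; `--suppressions` and `--error-exitcode` are standard valgrind options.

```python
# Assumed install location of GLib's suppression file; adjust for your system.
GLIB_SUPP = "/usr/share/glib-2.0/valgrind/glib.supp"


def build_valgrind_cmd(test_cmd, extra_suppressions=(), error_exitcode=1):
    """Return an argv list that runs test_cmd under valgrind.

    test_cmd           -- the test binary and its arguments, as a list
    extra_suppressions -- user-supplied suppression files (hypothetical input)
    error_exitcode     -- exit code valgrind returns if it reports any error
    """
    cmd = ["valgrind", f"--error-exitcode={error_exitcode}"]
    # GLib's suppressions come first, then anything the user added.
    for supp in (GLIB_SUPP, *extra_suppressions):
        cmd.append(f"--suppressions={supp}")
    cmd.extend(test_cmd)
    return cmd
```

The resulting list can be handed to `subprocess.run`; a non-zero exit code then marks the PR check as failed.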

kparichay avatar Mar 21 '19 10:03 kparichay

:octocat: cibot: Thank you for posting issue #489. The person in charge will reply soon.

taos-ci avatar Mar 21 '19 10:03 taos-ci

:1st_place_medal: Please contribute the PR. :)

[FYI] How to create a new module:

  • https://github.com/nnsuite/TAOS-CI/wiki#demo-youtube-case-study
  • https://github.com/nnsuite/TAOS-CI/blob/master/ci/doc/how-to-use-taos-ci-module.md

leemgs avatar Mar 22 '19 01:03 leemgs

If you can produce machine-readable results with valgrind, and the results are good enough to be enforced on every PR, that would be great!
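
One possible route to machine-readable results is valgrind's XML output (`--xml=yes` together with `--xml-file=<path>`). The sketch below counts the reported errors by their `<kind>` element; the function names are hypothetical, and it is only an assumption that a per-kind count is the summary a PR checker would want.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def summarize_valgrind_xml(xml_text):
    """Count <error> entries in a valgrind XML report, grouped by <kind>."""
    root = ET.fromstring(xml_text)
    # Each reported problem is an <error> element with a <kind> child,
    # for example InvalidRead or Leak_DefinitelyLost.
    return Counter(
        err.findtext("kind", default="Unknown") for err in root.iter("error")
    )


def has_errors(counts):
    """Return True if the summary contains at least one error."""
    return sum(counts.values()) > 0
```

A checker could read the file written by `--xml-file`, call `summarize_valgrind_xml` on its contents, and fail the PR when `has_errors` returns True.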

myungjoo avatar Mar 25 '19 01:03 myungjoo

Supporting valgrind without --leak-check=full seems feasible with the current test cases I tried (unittest_common, plugins, repo, sink and src_iio). However, running under valgrind exposes a deadlock inside GStreamer for both unittest_sink and unittest_src. The issue arises from GStreamer internals. For src_iio, the deadlock happens during the transition from READY to PAUSED: gst_base_src_start_complete() waits for a lock that is never released. The filtered debug log is:

0:00:15.393980575 18575      0x82a2c20 DEBUG             GST_STATES gstelement.c:2561:gst_element_set_state_func:<pipeline2> current READY, old_pending VOID_PENDING, next VOID_PENDING, old return SUCCESS
0:00:15.394381284 18575      0x82a2c20 DEBUG             GST_STATES gstbin.c:2664:gst_bin_change_state_func:<pipeline2> changing state of children from READY to PAUSED
0:00:15.400072879 18575      0x82a2c20 DEBUG             GST_STATES gstelement.c:2561:gst_element_set_state_func:<tensorsrciio4> current READY, old_pending VOID_PENDING, next VOID_PENDING, old return SUCCESS
0:00:15.400192833 18575      0x82a2c20 DEBUG             GST_STATES gstelement.c:2595:gst_element_set_state_func:<tensorsrciio4> final: setting state from READY to PAUSED
0:00:15.439533208 18575      0x88a82d0 DEBUG                basesrc gstbasesrc.c:523:gst_base_src_wait_playing:<tensorsrciio4> live source waiting for running state

More information from gdb with valgrind:

(gdb) bt full
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
No locals.
#1  0x00000000060e3e42 in __GI___pthread_mutex_lock (mutex=0xa7e7680) at ../nptl/pthread_mutex_lock.c:115
        id = 101622381
        __PRETTY_FUNCTION__ = "__pthread_mutex_lock"
        type = 1
        id = <optimized out>
#2  0x00000000066f42d3 in gst_base_src_start_complete () from /usr/lib/x86_64-linux-gnu/libgstbase-1.0.so.0
No symbol table info available.
#3  0x00000000066f4a1b in ?? () from /usr/lib/x86_64-linux-gnu/libgstbase-1.0.so.0
No symbol table info available.
#4  0x00000000066f4d68 in ?? () from /usr/lib/x86_64-linux-gnu/libgstbase-1.0.so.0
No symbol table info available.
#5  0x0000000005418537 in ?? () from /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
No symbol table info available.
#6  0x0000000005418d0c in gst_pad_set_active () from /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
No symbol table info available.
#7  0x00000000053fadbd in ?? () from /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
No symbol table info available.
#8  0x000000000540b22c in gst_iterator_fold () from /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
No symbol table info available.
#9  0x00000000053fb21a in ?? () from /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
No symbol table info available.
#10 0x00000000053fcfaf in ?? () from /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
No symbol table info available.
#11 0x00000000053fd247 in ?? () from /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
No symbol table info available.
#12 0x00000000066f2bbd in ?? () from /usr/lib/x86_64-linux-gnu/libgstbase-1.0.so.0
No symbol table info available.
#13 0x00000000053ff1ce in gst_element_change_state () from /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
No symbol table info available.
#14 0x00000000053ff947 in ?? () from /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
No symbol table info available.
#15 0x00000000053de365 in ?? () from /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
No symbol table info available.
#16 0x00000000053ff1ce in gst_element_change_state () from /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
No symbol table info available.
#17 0x00000000053ff947 in ?? () from /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
No symbol table info available.
#18 0x0000000000412497 in test_tensor_src_iio_data_verify_no_trigger_bits32_alternate2_Test::TestBody (this=0xa7d2c00)
    at ../tests/nnstreamer_source/unittest_src_iio.cpp:1145
        dev0 = 0xa7d0140
        src_iio_pipeline = 0xa7fa8f0
        status = GST_STATE_CHANGE_SUCCESS
        state = GST_STATE_VOID_PENDING
        parse_launch = 0xa7e5690 "tensor_src_iio device=test-device-1 silent=FALSE ! multifilesink location=/tmp/nnst-src-7PVQZZ/temp.log"
        data_value = 98
        samp_freq = 1000
        data_bits = 32
        fd = 15
        bytes_to_read = 0
        bytes_read = 0
        data_buffer = 0x0
        expect_val = 3.40016132e-35
        actual_val = 0
        expect_val_mask = 13
        expect_val_char = 0x5b <error: Cannot access memory at address 0x5b>
        actual_val_char = 0xd <error: Cannot access memory at address 0xd>
        stat_buf = {st_dev = 4098, st_ino = 99359050, st_nlink = 21474836486, st_mode = 2419617024, st_uid = 96960311, 
          st_gid = 4, __pad0 = 0, st_rdev = 96867266, st_size = 4427248, st_blksize = 416441367174606080, 
          st_blocks = 175982624, st_atim = {tv_sec = 68702694848, tv_nsec = 4375412}, st_mtim = {tv_sec = 0, 
            tv_nsec = 17}, st_ctim = {tv_sec = 96868073, tv_nsec = 175982416}, __glibc_reserved = {4429819, 0, 
            4427672}}
        stat_ret = -16781680
        __FUNCTION__ = "TestBody"
#19 0x0000000000443852 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (
    object=0xa7d2c00, method=&virtual testing::Test::TestBody(), location=0x451e3b "the test body")
    at /usr/src/gtest/src/gtest.cc:2078
No locals.
#20 0x000000000043ea75 in testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (
    object=0xa7d2c00, method=&virtual testing::Test::TestBody(), location=0x451e3b "the test body")
    at /usr/src/gtest/src/gtest.cc:2114
No locals.
#21 0x00000000004248b8 in testing::Test::Run (this=0xa7d2c00) at /usr/src/gtest/src/gtest.cc:2151
        impl = 0x828f6d0
#22 0x00000000004250ea in testing::TestInfo::Run (this=0x8292450) at /usr/src/gtest/src/gtest.cc:2326
        impl = 0x828f6d0
        repeater = 0x828f920
        start = 1554712029886
        test = 0xa7d2c00
#23 0x0000000000425751 in testing::TestCase::Run (this=0x828fec0) at /usr/src/gtest/src/gtest.cc:2444
        i = 7
        impl = 0x828f6d0
        repeater = 0x828f920
        start = 1554712006878
#24 0x000000000042c62c in testing::internal::UnitTestImpl::RunAllTests (this=0x828f6d0)
    at /usr/src/gtest/src/gtest.cc:4315
        test_index = 0
        start = 1554712006851
        i = 0
        in_subprocess_for_death_test = false
        should_shard = false
        has_tests_to_run = true
        failed = false
        repeater = 0x828f920
        repeat = 1
        forever = false
#25 0x00000000004447a5 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x828f6d0, 
    method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x42c374 <testing::internal::UnitTestImpl::RunAllTests()>, location=0x4526d0 "auxiliary test code (environments or event listeners)")
    at /usr/src/gtest/src/gtest.cc:2078
No locals.
#26 0x000000000043f8b5 in testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>
    (object=0x828f6d0, 
    method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x42c374 <testing::internal::UnitTestImpl::RunAllTests()>, location=0x4526d0 "auxiliary test code (environments or event listeners)")
    at /usr/src/gtest/src/gtest.cc:2114
No locals.
#27 0x000000000042b224 in testing::UnitTest::Run (this=0x66fd00 <testing::UnitTest::GetInstance()::instance>)
    at /usr/src/gtest/src/gtest.cc:3926
        in_death_test_child_process = false
        premature_exit_file = {premature_exit_filepath_ = 0x0}
#28 0x000000000041aab8 in RUN_ALL_TESTS () at /usr/include/gtest/gtest.h:2288
No locals.
#29 0x000000000041a3e8 in main (argc=1, argv=0xffefff278) at ../tests/nnstreamer_source/unittest_src_iio.cpp:1253
No locals.
(gdb) info threads
  Id   Target Id         Frame 
* 1    Thread 20202 (tid 1 VgTs_Runnable) __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
  2    Thread 20456 (tid 2 VgTs_WaitSys tensorsrciio7:sr) syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38

However, I was unable to reproduce the issue without valgrind over 12 hours of running on my PC. Note that tensor_src_iio itself uses no locks and relies on gstbasesrc (and its parent classes) to handle locking correctly.

A similar issue (a deadlock under valgrind) has been reported before (http://gstreamer-devel.966125.n4.nabble.com/Deadlock-under-valgrind-tsdemux-fault-td4663281.html). However, I cannot find the corresponding issue on bugzilla that the follow-up reply to that post mentions.
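
Since a hang like this would block a PR checker indefinitely, one option is to run each test under a timeout and report a hang separately from an ordinary failure. The helper below is a minimal sketch under that assumption; `run_with_timeout` is a hypothetical name, and the timeout value would need tuning because tests run much slower under valgrind.

```python
import subprocess


def run_with_timeout(cmd, timeout_sec):
    """Run cmd and classify the outcome.

    Returns (status, returncode) where status is "ok", "failed" or "timeout".
    On timeout, subprocess.run kills the child before raising.
    """
    try:
        proc = subprocess.run(cmd, timeout=timeout_sec)
    except subprocess.TimeoutExpired:
        # A probable deadlock or hang: there is no exit code to report.
        return ("timeout", None)
    return ("ok" if proc.returncode == 0 else "failed", proc.returncode)
```

Reporting "timeout" as its own status keeps deadlocks visible without conflating them with memory errors reported by valgrind.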

kparichay avatar Apr 08 '19 08:04 kparichay

This issue has been resolved with nnsuite/nnstreamer#1359

kparichay avatar Apr 12 '19 02:04 kparichay

The issue with unittest_sink is resolved with corrected paths and nnsuite/nnstreamer#1361.

kparichay avatar Apr 12 '19 06:04 kparichay

Both Tizen-OBS and Ubuntu-PPA should now support the updated SSAT with the option -vg. Please try them.

myungjoo avatar Apr 15 '19 13:04 myungjoo

  • [x] Add valgrind to ssat
  • [ ] Solve issues shown by valgrind
    • [x] tensor_src_iio nnsuite/nnstreamer#1359
    • [ ] tensor_sink nnsuite/nnstreamer#1369
    • [ ] tensor_decoder/tensor_converter nnsuite/nnstreamer#1367

kparichay avatar Apr 16 '19 05:04 kparichay

> Both Tizen-OBS and Ubuntu-PPA should now support the updated SSAT with the option -vg. Please try them.

Verified that it works.

kparichay avatar Apr 17 '19 03:04 kparichay

How to use Valgrind on GStreamer/NNStreamer

  • with gst-tracers

    • Tracking memory leaks, GStreamer Conference 2016 - Berlin
      • Download: https://gstreamer.freedesktop.org/data/events/gstreamer-conference/2016/Guillaume%20Desmottes%20-%20Tracking%20Memory%20Leaks.pdf
      • GST_DEBUG="GST_TRACER:7" GST_TRACERS="leaks" gst-launch-1.0 videotestsrc num-buffers=10 ! fakesink
  • with valgrind

    • export G_SLICE=always-malloc
    • https://developer.gnome.org/gstreamer/stable/gst-running.html
    • https://sourceforge.net/p/gstreamer/mailman/message/15426184/
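
The two approaches listed above differ mainly in the environment variables they set. The sketch below builds each environment as a dictionary that could be passed to `subprocess.run(..., env=...)`; both function names are hypothetical, and the variable values are the ones given in the list.

```python
import os


def valgrind_glib_env(base_env=None):
    """Environment for running a GLib/GStreamer program under valgrind."""
    env = dict(os.environ if base_env is None else base_env)
    # Make GLib's slice allocator fall back to plain malloc so that
    # valgrind can track each allocation individually.
    env["G_SLICE"] = "always-malloc"
    return env


def gst_leaks_tracer_env(base_env=None):
    """Environment for GStreamer's built-in leaks tracer (no valgrind)."""
    env = dict(os.environ if base_env is None else base_env)
    env["GST_TRACERS"] = "leaks"
    env["GST_DEBUG"] = "GST_TRACER:7"
    return env
```

Both helpers copy the base environment instead of modifying it, so the caller's own process environment is left untouched.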

leemgs avatar Jun 26 '20 05:06 leemgs