C++ SIGSEGV in

Open · jhh opened this issue 8 years ago · 1 comment

I'm seeing a reproducible crash that looks to be happening in libnavx_frc_cpp.a.

Here is the test program: https://github.com/strykeforce/roborio/blob/master/src/navx_test.cc. To reproduce:

  1. Stop the NI auto-run runtime (we routinely do this when running test robot code from the command line): /etc/init.d/nilvrt stop
  2. Run test program on roboRIO in gdb
  3. Enable in driver station
  4. Disable in driver station
  5. SIGSEGV

And here is the stack trace:

Program received signal SIGSEGV, Segmentation fault.
[Switching to LWP 5748]
0x00025678 in OffsetTracker::UpdateHistory (this=0x0, curr_value=-3.28999996) at ..\src/OffsetTracker.cpp:21
21      ..\src/OffsetTracker.cpp: No such file or directory.
(gdb) bt
#0  0x00025678 in OffsetTracker::UpdateHistory (this=0x0, curr_value=-3.28999996) at ..\src/OffsetTracker.cpp:21
#1  0x00022600 in AHRSInternal::SetAHRSData (this=0x8d778, ahrs_update=..., sensor_timestamp=787208)
    at ..\src/AHRS.cpp:150
#2  0x00026600 in RegisterIO::GetCurrentData (this=0x8f370) at ..\src/RegisterIO.cpp:187
#3  0x00025d90 in RegisterIO::Run (this=0x8f370) at ..\src/RegisterIO.cpp:86
#4  0x00021964 in AHRS::ThreadFunc (io_provider=0x8f370) at ..\src/AHRS.cpp:1305
#5  0x00024bf8 in std::_Bind_simple<int (*(IIOProvider*))(IIOProvider*)>::_M_invoke<0u>(std::_Index_tuple<0u>) (
    this=0x8f248) at c:\frc\arm-frc-linux-gnueabi\include\c++\4.9.3/functional:1700
#6  0x00024a70 in std::_Bind_simple<int (*(IIOProvider*))(IIOProvider*)>::operator()() (this=0x8f248)
    at c:\frc\arm-frc-linux-gnueabi\include\c++\4.9.3/functional:1688
#7  0x000249d4 in std::thread::_Impl<std::_Bind_simple<int (*(IIOProvider*))(IIOProvider*)> >::_M_run() (
    this=0x8f23c) at c:\frc\arm-frc-linux-gnueabi\include\c++\4.9.3/thread:115
#8  0xb6e5a748 in ?? () from /usr/lib/libstdc++.so.6
#9  0xb6e8ee50 in ?? () from /lib/libpthread.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

jhh, Jan 18 '17 22:01

The root cause of the crash in this case is that the AHRS class destructor does not properly shut down the IO thread. The thread keeps running after the AHRS object is destroyed, and the next time data is read, the internal notifier accesses a null pointer.

To resolve this, the AHRS class destructor should stop the IO thread.
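
The general pattern can be sketched as follows. This is a minimal illustration, not the navX library's code: `Sensor`, `Tracker`, `Run`, and `UpdateCount` are hypothetical names, and the stop flag plus join in the destructor is only one assumed way to shut the thread down before the members it uses are destroyed.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <thread>

// Hypothetical stand-in for state that the IO thread updates.
class Tracker {
public:
    void Update(float value) { last_ = value; ++count_; }
    int count() const { return count_.load(); }
private:
    float last_ = 0.0f;          // only touched by the IO thread
    std::atomic<int> count_{0};  // readable from other threads
};

// Hypothetical sensor wrapper that owns a background IO thread.
class Sensor {
public:
    Sensor()
        : tracker_(new Tracker()),
          stop_(false),
          io_thread_(&Sensor::Run, this) {}

    ~Sensor() {
        // Ask the IO thread to exit, then wait for it to finish.
        stop_.store(true);
        if (io_thread_.joinable()) io_thread_.join();
        // tracker_ is released only after this body returns, when no
        // thread can still be calling into it.
    }

    Sensor(const Sensor&) = delete;
    Sensor& operator=(const Sensor&) = delete;

    int UpdateCount() const { return tracker_->count(); }

private:
    void Run() {
        while (!stop_.load()) {
            assert(tracker_ != nullptr);
            tracker_->Update(-3.29f);
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::unique_ptr<Tracker> tracker_;
    std::atomic<bool> stop_;
    std::thread io_thread_;  // declared last so it starts after the other members
};
```

With this shape, a `Sensor` created in a local scope can be destroyed safely: the destructor returns only once `Run` has exited, so the thread never sees a freed or null `tracker_`.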

Priority is currently believed to be low, because in the typical use case the AHRS object lives for the entire robot application. If this priority assignment appears incorrect, please update the issue with the rationale.

kauailabs, Jan 18 '17 22:01