depthai-ros
Segfault on mobile publisher
ROS2 Galactic running on RPI4B. Same issue with both OAK-D-LITE and OAK-D-PRO.
[mobilenet_node-9] Stack trace (most recent call last) in thread 5512:
[mobilenet_node-9] #5 Object "[", at 0, in nil
[mobilenet_node-9] #4 Object "linux-vdso.so.1", at 0xffff8eed45bf, in
[mobilenet_node-9] #3 Object "/usr/local/lib/libdepthai-core.so", at 0xffff8e01dbf3, in backward::SignalHandling::sig_handler(int, siginfo_t*, void*)
[mobilenet_node-9] #2 Object "/usr/local/lib/libdepthai-core.so", at 0xffff8e01db33, in backward::SignalHandling::handleSignal(int, siginfo_t*, void*)
[mobilenet_node-9] #1 Object "/usr/local/lib/libdepthai-core.so", at 0xffff8e01b387, in backward::StackTraceImpl<backward::system_tag::linux_tag>::load_here(unsigned long, void*, void*)
[mobilenet_node-9] #0 Object "/usr/local/lib/libdepthai-core.so", at 0xffff8e01e077, in unsigned long backward::details::unwind<backward::StackTraceImpl<backward::system_tag::linux_tag>::callback>(backward::StackTraceImpl<backward::system_tag::linux_tag>::callback, unsigned long)
[mobilenet_node-9] Segmentation fault (Address not mapped to object [(nil)])
[ERROR] [mobilenet_node-9]: process has died ...
and
[mobilenet_node-7] Stack trace (most recent call last) in thread 8077:
[mobilenet_node-7] #17 Object "[0xffffffffffffffff]", at 0xffffffffffffffff, in
[mobilenet_node-7] #16 Object "/lib/aarch64-linux-gnu/libc.so.6", at 0xffff8861f67b, in
[mobilenet_node-7] #15 Object "/lib/aarch64-linux-gnu/libpthread.so.0", at 0xffff884d64fb, in
[mobilenet_node-7] #14 Object "/lib/aarch64-linux-gnu/libstdc++.so.6", at 0xffff887aefab, in
[mobilenet_node-7] #13 Object "/usr/local/lib/libdepthai-core.so", at 0xffff88f7fe5b, in
[mobilenet_node-7] #12 Object "/usr/local/lib/libdepthai-core.so", at 0xffff88f7ff7f, in
[mobilenet_node-7] #11 Object "/usr/local/lib/libdepthai-core.so", at 0xffff88f8004b, in
[mobilenet_node-7] #10 Object "/usr/local/lib/libdepthai-core.so", at 0xffff88f80297, in
[mobilenet_node-7] #9 Object "/usr/local/lib/libdepthai-core.so", at 0xffff88f8041b, in
[mobilenet_node-7] #8 Object "/usr/local/lib/libdepthai-core.so", at 0xffff88f7b4ab, in
[mobilenet_node-7] #7 Object "/usr/local/lib/libdepthai-core.so", at 0xffff891872b3, in dai::XLinkStream::write(std::vector<unsigned char, std::allocator<unsigned char> > const&)
[mobilenet_node-7] #6 Object "/usr/local/lib/libdepthai-core.so", at 0xffff8918720f, in dai::XLinkStream::write(unsigned char const*, unsigned long)
[mobilenet_node-7] #5 Object "/usr/local/lib/libdepthai-core.so", at 0xffff89187d2b, in dai::XLinkWriteError::XLinkWriteError(XLinkError_t, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
[mobilenet_node-7] #4 Object "/usr/local/lib/libdepthai-core.so", at 0xffff891e6303, in
[mobilenet_node-7] #3 Object "/usr/local/lib/libdepthai-core.so", at 0xffff89204ab7, in
[mobilenet_node-7] #2 Object "/usr/local/lib/libdepthai-core.so", at 0xffff8920442b, in
[mobilenet_node-7] #1 Object "/usr/local/lib/libdepthai-core.so", at 0xffff891ce6ab, in
[mobilenet_node-7] #0 Object "/lib/aarch64-linux-gnu/libc.so.6", at 0xffff885d31c4, in
[mobilenet_node-7] Segmentation fault (Address not mapped to object [(nil)])
Seems to occur if I launch rviz2 after the mobile publisher, but sometimes it occurs on its own.
Looks like the same error as #64. Trying to debug; will get back to you on this.
@roni-kreinin can you share which version of depthai-core you are using?
cc: @themarpe in case you have an idea on this.
For OAK-D-LITE it is the main branch, and for OAK-D-PRO the oak-d-pro_develop branch.
Got it. Was the crash on both devices? And do other examples work fine for you?
I think I may have installed the oakd-pro drivers incorrectly so I will re-test that. I have tested the stereo node and that works without segfaulting.
Tested again with oak-d-pro drivers installed correctly and using rqt_image_view I didn't have any issues. I have an RPLIDAR A1 also connected to the RPI4 and it seems that there are issues when I try to run both at the same time. Either the RPLIDAR node will fail or the mobile publisher node will segfault.
If we can't reproduce on our end, a capture with the rr debugger would also work.
(https://docs.google.com/document/d/1YRmwZP3gjcHY3UUO06LAh421Ea6gY4eKBCqJwHvcaIs/edit)
But we should preferably compile with RelWithDebInfo instead, to aid in later debugging.
(Will check the final size increase of that option, and whether it makes sense for it to be the default.)
I can try to get a capture but in the meantime here are some of my findings:
If the OAK-D is the only device connected to the RPI then the mobilenet node runs fine for a while, although it will sometimes freeze and stop publishing images without showing any warning or error messages in the output. I notice that during this time the mobilenet node will use up 100% CPU on one core of the RPI.
When it is working normally the node uses about 25% CPU split among the 4 cores.
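To make the "100% on one core" versus "about 25% spread over 4 cores" observation easier to capture and share, per-core load can be computed from two snapshots of /proc/stat. The following is only a minimal sketch, assuming a Linux host such as the RPI. The function names (parse_proc_stat, core_busy_percent) are hypothetical helpers written for illustration and are not part of depthai-ros or depthai-core.

```python
from typing import Dict, List


def parse_proc_stat(text: str) -> Dict[str, List[int]]:
    """Parse /proc/stat text into {"cpu0": [user, nice, system, idle, ...], ...}."""
    cores: Dict[str, List[int]] = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-core lines start with "cpu0", "cpu1", ...; skip the aggregate "cpu" line.
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            cores[fields[0]] = [int(value) for value in fields[1:]]
    return cores


def core_busy_percent(
    before: Dict[str, List[int]], after: Dict[str, List[int]]
) -> Dict[str, float]:
    """Return the busy percentage of each core between two snapshots."""
    usage: Dict[str, float] = {}
    for name, new in after.items():
        old = before.get(name)
        if old is None:
            continue
        deltas = [n - o for n, o in zip(new, old)]
        # Fields 0..7 are user, nice, system, idle, iowait, irq, softirq, steal.
        # Guest time (fields 8 and 9) is already included in user/nice.
        total = sum(deltas[:8])
        # Count idle and iowait as "not busy".
        idle = deltas[3] + (deltas[4] if len(deltas) > 4 else 0)
        usage[name] = 100.0 * (total - idle) / total if total > 0 else 0.0
    return usage
```

To use it, read /proc/stat twice about a second apart, pass each read through parse_proc_stat, and hand both results to core_busy_percent. A frozen node pinning one core should then show up as a single entry near 100 while the others stay low.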
I am able to reproduce this issue by connecting my RPLIDAR to the PI while the mobilenet node is running (OAK-D is on USB 3.0 port, RPLIDAR on 2.0). I do not get the segfault in this case either. If I restart the mobilenet node, it will run fine for some time but will eventually segfault. If I launch the RPLIDAR node while mobilenet is working, the RPLIDAR node will fail and mobilenet will freeze again with the 100% CPU usage issue.
Running something like rgb_stereo_node.launch.py seems to work without crashing but the RPI struggles to publish the images at a decent rate and there is a significant delay in the images being updated. I also tested stereo.launch.py without the metric converter, point cloud, and rviz nodes. I get pretty good fps (~15-20) but there is still about a 0.5s delay in the image being updated.
I don't have any issues using both the OAK-D and RPLIDAR with the stereo or rgb_stereo nodes, so it seems like something in the mobilenet node is causing issues with the USB devices.
Also, I don't think I am able to use rr on an RPI4.
Is the CPU consumption the same with rgb_stereo_node.launch.py and stereo.launch.py? And is this on the main branch or OAK-D-PRO-galactic?
Those both use about 30% CPU. And this is on the main branch with OAK-D-LITE.
Also, I don't think I am able to use rr on an RPI4.
True, didn't realize you are running on RPi - rr's aarch64 support isn't yet very good, and the non-64-bit RPi OS runs the chip in ARMv7 mode anyway.
On my host I'm not seeing much CPU usage in Foxy and Noetic, but Galactic seems to take more resources. I will test on a Raspberry Pi over the weekend and get back to you.