Do filter classes not need an aligned operator new?

Open themightyoarfish opened this issue 3 years ago • 7 comments

https://github.com/PointCloudLibrary/pcl/blob/c51f7a2dfc2097e74957143e29a0310f3c1a5514/filters/include/pcl/filters/crop_box.h#L179

I'm (again…) debugging a problem with PCL where Eigen::internal::handmade_aligned_free ultimately raises SIGSEGV during shared_ptr deletion of point clouds whenever I use a class which uses CropBox on (different) point clouds. Without this processing, there is no problem. Now I realise the problem might be somewhere completely different, but it got me looking at the CropBox code, and I'm not seeing PCL_MAKE_ALIGNED_OPERATOR_NEW or anything similar in there, even though the class has Eigen types as members, which would seem to require it.

Is this an oversight or not needed? (For the record, my problem did not go away when adding it.)

themightyoarfish avatar Dec 16 '21 22:12 themightyoarfish

my problem did not go away when adding it

Perhaps, a stack trace would be nice.

Which version of C++ are you using? Since C++17, operator new respects over-aligned types, so there aren't any alignment issues. Switching off SSE/AVX/NEON achieves the same. Could you also try doing any one of these with:

  • PCL only
  • Your user code (if possible)

This would help pin the blame firmly on PCL.

But yeah, in a more general sense, the macro is needed.
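For readers unfamiliar with the macro, here is a rough sketch of the idea behind an "aligned operator new". Before C++17, a plain new expression does not honour extended alignment, so a class with over-aligned members can land on a misaligned heap address unless it supplies its own operator new/delete. Everything named below (SKETCH_MAKE_ALIGNED_OPERATOR_NEW, BoxFilterSketch, heap_instance_is_aligned) is hypothetical and only for illustration; this is not the actual definition of PCL_MAKE_ALIGNED_OPERATOR_NEW, and it assumes a POSIX system for posix_memalign.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <stdlib.h>  // posix_memalign, free (POSIX; an assumption of this sketch)

// Hypothetical macro: gives a class its own operator new/delete so that heap
// instances are aligned to ALIGN bytes (a power of two, multiple of sizeof(void*)).
#define SKETCH_MAKE_ALIGNED_OPERATOR_NEW(ALIGN)                             \
  static void* operator new(std::size_t size) {                             \
    void* ptr = nullptr;                                                    \
    if (posix_memalign(&ptr, (ALIGN), size) != 0) throw std::bad_alloc();   \
    return ptr;                                                             \
  }                                                                         \
  static void operator delete(void* ptr) noexcept { free(ptr); }

// Stand-in for a filter class holding fixed-size, vectorizable members.
struct alignas(32) BoxFilterSketch {
  float min_pt[4];
  float max_pt[4];
  SKETCH_MAKE_ALIGNED_OPERATOR_NEW(32)
};

static_assert(alignof(BoxFilterSketch) == 32, "sketch expects 32-byte alignment");

// Allocates one instance on the heap and reports whether its address
// satisfies the alignment the type asks for.
inline bool heap_instance_is_aligned() {
  BoxFilterSketch* filter = new BoxFilterSketch();
  assert(filter != nullptr);
  const bool ok =
      reinterpret_cast<std::uintptr_t>(filter) % alignof(BoxFilterSketch) == 0;
  delete filter;
  return ok;
}
```

The sketch leaves out the array forms (operator new[] / operator delete[]) for brevity. The important property is that the class-level operator delete releases memory with the counterpart of whatever the class-level operator new used.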

kunaltyagi avatar Dec 17 '21 04:12 kunaltyagi

PR #4962 will add the macro PCL_MAKE_ALIGNED_OPERATOR_NEW to CropBox (will be merged after PCL 1.12.1 is released)

mvieth avatar Dec 17 '21 11:12 mvieth

For what it's worth, I noticed that the PCL 1.12 compile flags were not propagated to my project using it, and also that PCL was no longer compiling with -march=native, which 1.11.1 still does. Manually adding everything which PCL 1.12 prints as build flags did not fix the problem, but downgrading to 1.11.1 makes it go away. The OS is Ubuntu 16.04 with C++14. Just putting this out here in case anyone else has this issue; unfortunately I can't really contribute more right now.

The backtrace with 1.12 is this:

(lldb) bt
error: autocrane-core_main 0x0bfb0e89: DW_TAG_member 'data' refers to type 0x0bfb06ef which extends beyond the bounds of 0x0bfb0e50
error: autocrane-core_main 0x0bfb0ec5: DW_TAG_member 'data_c' refers to type 0x0bfb06ef which extends beyond the bounds of 0x0bfb0e9d
error: autocrane-core_main 0x011d4d2f: DW_TAG_member '_M_local_buf' refers to type 0x011e8d93 which extends beyond the bounds of 0x011d4d27
error: autocrane-core_main 0x011e85a4: DW_TAG_member '__size' refers to type 0x011e85bb which extends beyond the bounds of 0x011e851f
* thread #1: tid = 32290, 0x00007fffdd622562 libc.so.6`__GI___libc_free(mem=0x0000000000002311) + 34 at malloc.c:2958, name = 'autocrane-core_', stop reason = signal SIGSEGV: invalid address (fault address: 0x2309)
  * frame #0: 0x00007fffdd622562 libc.so.6`__GI___libc_free(mem=0x0000000000002311) + 34 at malloc.c:2958
    frame #1: 0x0000000000513047 autocrane-core_main`Eigen::internal::handmade_aligned_free(ptr=0x00000000028ae0d0) + 38 at Memory.h:118
    frame #2: 0x00000000005130a8 autocrane-core_main`Eigen::internal::aligned_free(ptr=0x00000000028ae0d0) + 24 at Memory.h:206
    frame #3: 0x000000000052a87c autocrane-core_main`Eigen::aligned_allocator<pcl::PointXYZI>::deallocate(this=0x0000000002315cd0, p=0x00000000028ae0d0, (null)=280) + 32 at Memory.h:897
    frame #4: 0x000000000052468a autocrane-core_main`std::allocator_traits<Eigen::aligned_allocator<pcl::PointXYZI> >::deallocate(__a=0x0000000002315cd0, __p=0x00000000028ae0d0, __n=280) + 43 at alloc_traits.h:386
    frame #5: 0x000000000051e070 autocrane-core_main`std::_Vector_base<pcl::PointXYZI, Eigen::aligned_allocator<pcl::PointXYZI> >::_M_deallocate(this=0x0000000002315cd0, __p=0x00000000028ae0d0, __n=280) + 50 at stl_vector.h:178
    frame #6: 0x000000000051de05 autocrane-core_main`std::_Vector_base<pcl::PointXYZI, Eigen::aligned_allocator<pcl::PointXYZI> >::~_Vector_base(this=0x0000000002315cd0) + 65 at stl_vector.h:160
    frame #7: 0x0000000000517b1d autocrane-core_main`std::vector<pcl::PointXYZI, Eigen::aligned_allocator<pcl::PointXYZI> >::~vector(this=0x0000000002315cd0) + 65 at stl_vector.h:425
    frame #8: 0x0000000000514510 autocrane-core_main`pcl::PointCloud<pcl::PointXYZI>::~PointCloud(this=0x0000000002315ca0) + 28 at point_cloud.h:172
    frame #9: 0x0000000000578e26 autocrane-core_main`std::_Sp_counted_ptr<pcl::PointCloud<pcl::PointXYZI>*, (__gnu_cxx::_Lock_policy)2>::_M_dispose(this=0x0000000002315d40) + 34 at shared_ptr_base.h:374
    frame #10: 0x0000000000455eae autocrane-core_main`std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release(this=0x0000000002315d40) + 66 at shared_ptr_base.h:150
    frame #11: 0x00000000004535af autocrane-core_main`std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count(this=0x0000000001f7cbd8) + 39 at shared_ptr_base.h:659
    frame #12: 0x00000000005142fc autocrane-core_main`std::__shared_ptr<pcl::PointCloud<pcl::PointXYZI> const, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr(this=0x0000000001f7cbd0) + 28 at shared_ptr_base.h:925
    frame #13: 0x0000000000514318 autocrane-core_main`std::shared_ptr<pcl::PointCloud<pcl::PointXYZI> const>::~shared_ptr(this=0x0000000001f7cbd0) + 24 at shared_ptr.h:93

<Destructor of object with Cloud::Ptr as a member>

Reliably reproducible during destructor after CropBox was used on something.
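To make frames #0 to #3 easier to read, here is a minimal sketch of the "hand-made" aligned allocation scheme that the function name handmade_aligned_free refers to: the allocating side over-allocates, rounds the address up, and stores malloc's original pointer just in front of the aligned block; the freeing side reads that stored pointer back and passes it to free. All names below are hypothetical stand-ins, not Eigen's or PCL's code. In the trace, free receives mem=0x2311, which does not look like a heap address. That would be consistent with (though not proof of) the freeing side reading a slot that the allocating side never wrote, for example if the two sides were compiled with different alignment settings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <vector>

// Over-allocate, round up to `alignment` (a power of two >= sizeof(void*)),
// and remember malloc's pointer in the slot right before the aligned block.
inline void* handmade_alloc_sketch(std::size_t size, std::size_t alignment) {
  assert(alignment >= sizeof(void*) && (alignment & (alignment - 1)) == 0);
  void* original = std::malloc(size + alignment);
  if (original == nullptr) throw std::bad_alloc();
  const std::uintptr_t raw = reinterpret_cast<std::uintptr_t>(original);
  const std::uintptr_t aligned =
      (raw & ~(static_cast<std::uintptr_t>(alignment) - 1)) + alignment;
  reinterpret_cast<void**>(aligned)[-1] = original;
  return reinterpret_cast<void*>(aligned);
}

// Reads the remembered pointer back. If `ptr` was produced by any other
// allocator, the slot in front of it holds unrelated data and free() is
// handed a bogus address.
inline void handmade_free_sketch(void* ptr) noexcept {
  if (ptr != nullptr) std::free(static_cast<void**>(ptr)[-1]);
}

// Minimal allocator pairing the two functions, in the spirit of an aligned
// allocator used as the second template argument of std::vector.
template <typename T, std::size_t Alignment = 32>
struct AlignedAllocatorSketch {
  using value_type = T;
  template <typename U>
  struct rebind { using other = AlignedAllocatorSketch<U, Alignment>; };

  AlignedAllocatorSketch() = default;
  template <typename U>
  AlignedAllocatorSketch(const AlignedAllocatorSketch<U, Alignment>&) noexcept {}

  T* allocate(std::size_t n) {
    return static_cast<T*>(handmade_alloc_sketch(n * sizeof(T), Alignment));
  }
  void deallocate(T* p, std::size_t) noexcept { handmade_free_sketch(p); }
};

template <typename T, typename U, std::size_t A>
bool operator==(const AlignedAllocatorSketch<T, A>&,
                const AlignedAllocatorSketch<U, A>&) noexcept { return true; }
template <typename T, typename U, std::size_t A>
bool operator!=(const AlignedAllocatorSketch<T, A>&,
                const AlignedAllocatorSketch<U, A>&) noexcept { return false; }

// Hypothetical point type standing in for a PCL point.
struct PointSketch { float data[4]; float intensity; };

// Builds a vector with the sketch allocator and checks the buffer alignment.
inline bool vector_storage_is_aligned(std::size_t n) {
  std::vector<PointSketch, AlignedAllocatorSketch<PointSketch>> points(n);
  return reinterpret_cast<std::uintptr_t>(points.data()) % 32 == 0;
}
```

As long as allocate and deallocate come from the same build of the same scheme, the stored pointer is always valid. The failure mode only appears when one translation unit or library allocates with one scheme and another frees with a different one.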

themightyoarfish avatar Dec 17 '21 20:12 themightyoarfish

Do you supply any CMAKE_CXX_FLAGS on the cmake call? If so, PCL won't try to use -march=native. If you supply default flags to PCL, it should try to use -march=native.

Flags passed via CMAKE_CXX_FLAGS on the cmake call are, as far as I can see, not passed on to downstream projects.
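One way to check whether the effective flags really differ between builds is to look at the SIMD feature macros the compiler predefines, since options like -march=native change them. The helper below is a hypothetical sketch (enabled_simd_features is not a PCL function) and relies on the macro names used by GCC and Clang; compiling it once with the flags PCL reports and once with the downstream project's flags, then comparing the two results, would show any mismatch.

```cpp
#include <cassert>
#include <string>

// Returns a space-separated list of the SIMD feature macros that are defined
// for this translation unit, or "none" if none of the checked ones are set.
inline std::string enabled_simd_features() {
  std::string out;
#if defined(__SSE2__)
  out += "SSE2 ";
#endif
#if defined(__SSE4_2__)
  out += "SSE4.2 ";
#endif
#if defined(__AVX__)
  out += "AVX ";
#endif
#if defined(__AVX2__)
  out += "AVX2 ";
#endif
#if defined(__AVX512F__)
  out += "AVX512F ";
#endif
#if defined(__ARM_NEON)
  out += "NEON ";
#endif
  return out.empty() ? std::string("none") : out;
}
```

Printing the result from a small main in each build (for example with std::cout) is enough for the comparison. The list of macros checked here is not exhaustive.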

larshg avatar Dec 17 '21 22:12 larshg

The issue seems to be pcl::PointCloud<pcl::PointXYZI>, not the CropBox.

Could you try adding PCL_MAKE_ALIGNED_OPERATOR_NEW to the struct PointXYZI on line 509 in common/include/impl/point_types.hpp?

kunaltyagi avatar Dec 18 '21 04:12 kunaltyagi

downgrading to 1.11.1 makes the problem go away.

The thing is, there isn't any code change in the point types that should result in a discrepancy between 1.11.0 and current master. I think the issue could be due to our different handling of CMake flags.

kunaltyagi avatar Dec 18 '21 04:12 kunaltyagi

Good to know, but I am not passing anything manually. However, I did not fully clean the build between switching tags and reinstalling, so maybe old flags stuck around somehow.

themightyoarfish avatar Dec 18 '21 12:12 themightyoarfish