[features] Fix ShapeContext3DEstimation
Currently, the x-axis (and consequently also the y-axis) is chosen randomly, which makes this feature useless without further processing. The paper by Frome et al. says in section 2.1:
We have a degree of freedom in the azimuth direction that we must remove in order to compare shape contexts calculated in different coordinate systems. To account for this, we choose some direction to be Φ0 in an initial shape context, and then rotate the shape context about its north pole into L positions, such that each Φl division is located at the original 0 position in one of the rotations. For descriptor data sets derived from our reference scans, L rotations for each basis point are included, whereas in the query data sets, we include only one position per basis point
It seems like the PCL implementation is missing this additional step. There is a so far unimplemented function, seemingly for this purpose: https://github.com/PointCloudLibrary/pcl/blob/master/features/include/pcl/features/3dsc.h#L224 The task is to implement this function and call it in computeFeature, after computePoint.
When the implementation is fixed, the test in test_shot_estimation.cpp probably has to be updated. Ideally, a new test should be added where this feature is computed twice on the same cloud, and the outputs should be very similar.
The task is to implement this function and call it in computeFeature, after computePoint.
computeFeature is expected to return a PointCloud<pcl::ShapeContext1980>, but if we use shiftAlongAzimuth to do the rotation, then, as described in its comment, desc will be a vector of length descriptor_length_ * azimuth_bins_ (1980 * 12 = 23760 by default).
As a result, every point in the output cloud, which is supposed to store a feature of size 1980, may overflow. So how should we deal with this?
Maybe shiftAlongAzimuth was originally intended to be used in a different way. I think the size should be unchanged (1980).
In this commit, shiftAlongAzimuth was first introduced and called in computeFeature. The final descriptor was accepted by either a pcl::SHOT or a pcl::ShapeContext, which has now been deleted and replaced (see this). In this commit, shiftAlongAzimuth was deleted due to some bugs and is no longer called from computeFeature.
The pcl::SHOT or pcl::ShapeContext struct had a descriptor field of type std::vector, whereas pcl::ShapeContext1980 only contains a fixed-size array.
A possible solution is to add a new point type that stores the point-wise features (1980x12) for pcl::ShapeContext1980, acting as a descriptor dataset. But this does not seem like the best solution.
Ideally, a new test should be added where this feature is computed twice on the same cloud, and the outputs should be very similar.
Even if we implement the shift feature, determining if two features are very similar is challenging since rotation does not guarantee the order of L descriptors. Matching a query to an existing dataset requires comparing all L descriptors. One approach to measure similarity is by calculating the distance between two features; for example, we could calculate the minimum distances from a query to features from two runs and check if they are close enough.
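To make the distance idea concrete, here is a minimal sketch of a rotation-tolerant comparison. It is not PCL code: minShiftedSquaredDistance is a hypothetical helper, the descriptors are plain std::vector<float> rather than pcl::ShapeContext1980, and it assumes the azimuth index is the slowest-varying one, so that each azimuth sector is one contiguous block of the descriptor.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not part of PCL): smallest squared Euclidean distance
// between `a` and every azimuth-shifted version of `b`.
// Assumption: the descriptor is stored azimuth-major, i.e. it consists of
// `azimuth_bins` contiguous blocks of equal size.
float minShiftedSquaredDistance(const std::vector<float>& a,
                                const std::vector<float>& b,
                                std::size_t azimuth_bins)
{
  const std::size_t n = a.size();
  const float inf = std::numeric_limits<float>::infinity();
  if (n == 0 || n != b.size() || azimuth_bins == 0 || n % azimuth_bins != 0)
    return inf; // incompatible inputs

  const std::size_t block = n / azimuth_bins; // values per azimuth sector
  float best = inf;
  for (std::size_t s = 0; s < azimuth_bins; ++s) {
    float d = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
      // Compare a[i] with b rotated by s azimuth sectors.
      const float diff = a[i] - b[(i + s * block) % n];
      d += diff * diff;
    }
    if (d < best)
      best = d;
  }
  return best;
}
```

In a test, two runs on the same cloud could then be compared point by point by checking that this value stays below some tolerance. Note that this costs azimuth_bins full comparisons per pair and is not a plain Euclidean distance between two stored descriptors.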
Okay, this seems to be not as straightforward as I initially thought. I also had not checked how shiftAlongAzimuth was implemented and used in the past.
There are some assumptions that feature implementations have to meet in PCL to work well with other PCL classes (e.g. registration methods):
- feature similarity is usually measured by (Euclidean) distance (as you also mentioned). This is especially the case when kd-trees are used to find features similar to a query feature. So a new 1980x12 feature type with a custom similarity function is not really possible.
- many classes assume a one-to-one correspondence between points in the cloud and computed features. That means we can't make computeFeature return 12 times pcl::ShapeContext1980 (different rotations) for every input point.
- there are probably more, but these are the two I could think of that are most relevant for this problem.
By the way, the 3D Shape Context feature has an extension, the Unique Shape Context (USC, https://dl.acm.org/doi/10.1145/1877808.1877821). That one does not use random x- and y-axes, but computes a reference frame instead.
I think the best solution for the 3D Shape Context would be to deviate slightly from the paper: instead of doing L rotations, do just one specific rotation on the result of computePoint. That rotation should be chosen so that the feature has a certain property, which would result in a small Euclidean distance between two features computed on the same point with two different random x-axes. One idea could be to rotate such that the bin with the highest value in the feature is always in a specific azimuth direction. Then the initial, random x-axis would not matter any more.
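A rough sketch of that "rotate the highest bin to a fixed azimuth" idea could look like the following. This is only an illustration under assumptions: canonicalizeAzimuth is a hypothetical name, the descriptor is a plain std::vector<float> instead of the fixed-size array in pcl::ShapeContext1980, and the layout is assumed to be azimuth-major (each azimuth sector is one contiguous block).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of PCL): cyclically shift the descriptor so
// that the azimuth sector containing the largest bin becomes sector 0.
// Assumption: the descriptor consists of `azimuth_bins` contiguous blocks of
// equal size, one per azimuth sector.
std::vector<float> canonicalizeAzimuth(const std::vector<float>& desc,
                                       std::size_t azimuth_bins)
{
  if (desc.empty() || azimuth_bins == 0 || desc.size() % azimuth_bins != 0)
    return desc; // nothing sensible to do, return input unchanged

  const std::size_t block = desc.size() / azimuth_bins; // values per sector

  // Locate the bin with the highest value (first one wins on ties).
  const auto max_it = std::max_element(desc.begin(), desc.end());
  const std::size_t max_sector =
      static_cast<std::size_t>(max_it - desc.begin()) / block;

  // Rotate whole sectors so that `max_sector` moves to the front.
  std::vector<float> out(desc);
  std::rotate(out.begin(), out.begin() + max_sector * block, out.end());
  return out;
}
```

The descriptor length stays unchanged, so the one-to-one correspondence between points and features is preserved. One caveat: if two bins in different sectors have equal or nearly equal maxima, small noise can flip the chosen sector, so a more stable criterion (for example the sector with the largest summed weight) might be worth trying.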