CudaSift
Suggestions for future versions?
After the latest commits I'm running out of ideas for what to improve, and I would like to hear whether anyone has suggestions for future versions. For further speed improvements, I can see the possibility of supporting the upload of images that are not already in float format, using half-precision floats for storage and matching of SIFT vectors, and projecting the vectors to a lower dimension, similar to PCA-SIFT. In most practical scenarios, though, gaining a fraction of a millisecond doesn't help much, since much of what surrounds the feature extraction matters more. The nature of the end application thus becomes more important than the actual feature extraction code.
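To make the dimensionality-reduction idea concrete, here is a minimal host-side sketch of projecting one descriptor onto a smaller basis, in the spirit of PCA-SIFT. The function name `projectDescriptor`, the row-major basis layout, and the final renormalisation are assumptions for illustration only; nothing here is part of CudaSift, and the mean and basis would have to be learned offline from training descriptors.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of CudaSift: projects one descriptor onto
// k basis vectors stored row-major as a k x dim matrix.
std::vector<float> projectDescriptor(const std::vector<float>& desc,
                                     const std::vector<float>& mean,
                                     const std::vector<float>& basis,
                                     std::size_t k) {
    const std::size_t dim = desc.size();
    assert(mean.size() == dim && basis.size() == k * dim);

    std::vector<float> out(k, 0.0f);
    float norm2 = 0.0f;
    for (std::size_t r = 0; r < k; ++r) {
        float acc = 0.0f;
        // Dot product of basis row r with the mean-centred descriptor.
        for (std::size_t c = 0; c < dim; ++c)
            acc += basis[r * dim + c] * (desc[c] - mean[c]);
        out[r] = acc;
        norm2 += acc * acc;
    }

    // Renormalise to unit length so dot products between reduced
    // vectors stay comparable (an assumption of this sketch).
    if (norm2 > 0.0f) {
        const float inv = 1.0f / std::sqrt(norm2);
        for (float& v : out) v *= inv;
    }
    return out;
}
```

For 128-dimensional SIFT vectors, `basis` would hold `k * 128` floats. Whether the shorter vectors keep enough discriminative power would need to be measured on real matching data.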
Maybe cuda affine sift?
You mean [Yu & Morel, 2011], not [Mikolajczyk & Schmid, 2002], right? Do you have a feeling for how many angles you need to test in practice?
First question: yes. Second question:

int ml = 6; // max; 3 works fine for most pictures
for (int tl = 1; tl < ml; tl++) {
    double t = pow(2, 0.5 * tl);
    // phi is a double so the fractional step 72.0 / t accumulates without truncation
    for (double phi = 0; phi < 180; phi += 72.0 / t) {
        // extract sift points...
    }
}

41 angles if ml=6. I restrict the total number of SIFT points to 16000, otherwise matching costs too much time.
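The sampling loop in the comment above can be wrapped into a small self-contained function that lists the (tilt, rotation) pairs to simulate. This is only a sketch: the `View` struct and the `enumerateViews` name are made up for illustration, the angle is kept as a `double` so the fractional step is not truncated, and the untilted image (t = 1) is not included. The exact total for a given `maxLevel` depends on how the rotation step is rounded and on whether the untilted view is counted.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical record for one simulated view.
struct View {
    double tilt;    // t = 2^(tl / 2)
    double phiDeg;  // in-plane rotation in degrees, in [0, 180)
};

// Enumerates (tilt, rotation) pairs; maxLevel plays the role of `ml`.
std::vector<View> enumerateViews(int maxLevel) {
    assert(maxLevel >= 1);
    std::vector<View> views;
    for (int tl = 1; tl < maxLevel; ++tl) {
        const double t = std::pow(2.0, 0.5 * tl);
        const double step = 72.0 / t;  // finer angular sampling at larger tilts
        for (double phi = 0.0; phi < 180.0; phi += step)
            views.push_back({t, phi});
    }
    return views;
}
```

Each returned pair would then drive one rotate-and-compress warp of the input image before SIFT extraction, so the list length gives a direct estimate of how many extraction passes a frame needs.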
It's really a brute force method and I'm not sure it's the most effective one. The DoG responses might be more invariant to rotations than the descriptors, which means that just to detect points, you could use a sparser sampling of angles. Once you have detected a DoG point, you then do some further sampling of angles for the descriptor. How much overlap do you have of points from different angles, but at exactly the same location in the original image? Do you prune these points or do you keep them all through the matching and then possibly prune afterwards? On the other hand, even with the current version, a brute force method might be sufficient in most cases, in particular if some optimisations are done for image rotation and buffering of feature data. Getting fast enough matching should become a greater concern and for that some kind of pruning might become important.
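One simple way to prune points that different views detect at the same original-image location is to bucket them on a coarse grid and keep only the strongest per cell. The sketch below assumes the points have already been mapped back to original-image coordinates. The `MappedPoint` struct, the `pruneByCell` name, and the use of a single score are hypothetical, and scale and orientation are ignored for brevity.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical record: a keypoint mapped back to original-image coordinates.
struct MappedPoint {
    float x, y;   // position in the original image, in pixels
    float score;  // strength, e.g. magnitude of the DoG response
    int viewId;   // index of the simulated view that produced it
};

// Keeps only the strongest point per grid cell of side `cellSize` pixels.
// Points that straddle a cell border are not merged in this sketch.
std::vector<MappedPoint> pruneByCell(const std::vector<MappedPoint>& pts,
                                     float cellSize) {
    assert(cellSize > 0.0f);
    std::unordered_map<std::uint64_t, MappedPoint> best;
    for (const MappedPoint& p : pts) {
        const std::int32_t cx = static_cast<std::int32_t>(std::floor(p.x / cellSize));
        const std::int32_t cy = static_cast<std::int32_t>(std::floor(p.y / cellSize));
        // Pack both cell indices into one 64-bit key.
        const std::uint64_t key =
            (static_cast<std::uint64_t>(static_cast<std::uint32_t>(cx)) << 32) |
            static_cast<std::uint32_t>(cy);
        auto it = best.find(key);
        if (it == best.end()) {
            best.emplace(key, p);
        } else if (p.score > it->second.score) {
            it->second = p;
        }
    }
    std::vector<MappedPoint> kept;
    kept.reserve(best.size());
    for (const auto& kv : best) kept.push_back(kv.second);
    return kept;  // order is unspecified
}
```

Pruning before matching reduces the number of descriptors to compare. Pruning after matching, as also raised above, would instead keep every view's descriptor and discard redundant correspondences.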
CUDA stream support?
What about OpenCV interoperability?