Use FLANN

Open lindahua opened this issue 11 years ago • 8 comments

FLANN (http://www.cs.ubc.ca/research/flann/) is one of the most widely used libraries for approximate nearest neighbor search.

It is fast and reliable, is available in Linux distributions and Homebrew, and has a C interface.

lindahua avatar Jan 05 '14 16:01 lindahua

Yes, we should definitely use FLANN.

johnmyleswhite avatar Jan 05 '14 16:01 johnmyleswhite

There are also a few other libraries we will want to look into at some point: http://radimrehurek.com/2013/11/performance-shootout-of-nearest-neighbours-intro/

johnmyleswhite avatar Jan 05 '14 16:01 johnmyleswhite

This post is actually more informative: http://radimrehurek.com/2013/12/performance-shootout-of-nearest-neighbours-contestants/

johnmyleswhite avatar Jan 05 '14 16:01 johnmyleswhite

From this post, it appears to me that FLANN is the most reasonable choice at this point.

lindahua avatar Jan 05 '14 17:01 lindahua

I would suggest having a separate package (say FLANN.jl) as a wrapper, and having this package depend on it.

lindahua avatar Jan 05 '14 17:01 lindahua

Yes, I think that's the right approach.

johnmyleswhite avatar Jan 05 '14 19:01 johnmyleswhite

Using FLANN requires manual memory management because it maintains an in-memory index. How does that fit into the proposed workflow of creating a model and using it for multiple predictions? It would require either freeing the index's resources once the model is no longer in use or rebuilding the index on every search.

wildart avatar Jun 06 '14 22:06 wildart

Just treat the FLANN index like we treat other libraries that hold external resources (e.g. database connections).

We require the user to free the index when they have finished using it.

lindahua avatar Jun 06 '14 23:06 lindahua
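
The build-once, query-many, free-explicitly workflow discussed above could look roughly like the sketch below. It is a language-neutral illustration written in Python, not a FLANN binding: `MockIndex`, `nearest`, `close`, and `predict_all` are hypothetical names, and exact brute-force search stands in for FLANN's approximate search. The only point is the lifetime of the index, which is handled the same way a database connection would be.

```python
import math


class MockIndex:
    """Hypothetical stand-in for an externally allocated search index."""

    def __init__(self, points):
        # A real wrapper would hand the data to the native library here.
        self._points = [tuple(p) for p in points]
        self._closed = False

    def nearest(self, query, k=1):
        """Return the positions of the k stored points closest to `query`."""
        if self._closed:
            raise RuntimeError("index has already been freed")
        order = sorted(
            range(len(self._points)),
            key=lambda i: math.dist(self._points[i], query),
        )
        return order[:k]

    def close(self):
        # A real wrapper would release the native index here.
        # Calling close() more than once is harmless.
        self._points = None
        self._closed = True

    # Context-manager support frees the index even if a query raises.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False


def predict_all(train, queries, k=1):
    """Build the index once, answer every query, then free it."""
    with MockIndex(train) as index:
        return [index.nearest(q, k) for q in queries]
```

With this shape the index is built once per model rather than once per search, and the user decides when it is freed, either by calling `close()` directly or by scoping the index in a `with` block.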