UpdatableSearcher cannot update reference vectors on the fly

Open openwzdh opened this issue 13 years ago • 2 comments

Adding new reference vectors to a searcher while it is running searches throws a ConcurrentModificationException (stack trace below). ConcurrentLinkedDeque<Vector> is a thread-safe alternative to the ArrayList<Vector>.

java.util.ConcurrentModificationException: null
    at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:819) ~[na:1.7.0_09]
    at java.util.ArrayList$Itr.next(ArrayList.java:791) ~[na:1.7.0_09]
    at org.apache.mahout.knn.search.FastProjectionSearch.reindex(FastProjectionSearch.java:180) ~[knn-0.1-SNAPSHOT.jar:na]
    at org.apache.mahout.knn.search.FastProjectionSearch.search(FastProjectionSearch.java:111) ~[knn-0.1-SNAPSHOT.jar:na]

openwzdh, Dec 16 '12 10:12
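
To make the failure mode in the report concrete, here is a minimal single-threaded sketch. It is not taken from the knn code: `IterationDemo` and `failsWhenModifiedDuringIteration` are hypothetical names, and plain `double[]` arrays stand in for Mahout's `Vector`. It shows only the iterator behaviour: an `ArrayList` iterator is fail-fast and throws when the list is modified mid-iteration, while a `ConcurrentLinkedDeque` iterator is weakly consistent and does not.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.ConcurrentModificationException;
import java.util.concurrent.ConcurrentLinkedDeque;

class IterationDemo {
    // Iterates over the collection and adds one element midway, mimicking an
    // update that arrives while a search is scanning the reference vectors.
    // Returns true if iteration failed with ConcurrentModificationException.
    static boolean failsWhenModifiedDuringIteration(Collection<double[]> vectors) {
        vectors.add(new double[] {1.0, 2.0});
        vectors.add(new double[] {3.0, 4.0});
        boolean added = false;
        try {
            for (double[] ignored : vectors) {
                if (!added) {
                    // Structural modification while an iterator is live.
                    vectors.add(new double[] {5.0, 6.0});
                    added = true;
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("ArrayList fails: "
                + failsWhenModifiedDuringIteration(new ArrayList<double[]>()));
        System.out.println("ConcurrentLinkedDeque fails: "
                + failsWhenModifiedDuringIteration(new ConcurrentLinkedDeque<double[]>()));
    }
}
```

Swapping the collection type removes the exception, but the sketch does not show whether that alone would make the searcher's index consistent under concurrent updates.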

Hi there!

Currently we're not using concurrent data structures, to keep the overhead as low as possible. There's a thread on the Mahout mailing list [1] about this very issue.

But you're certainly welcome to modify the code to get it to do what you want. :)

Also, please track my branch [2]. It's a bit more up to date and is where the work is happening.

[1] http://mail-archives.apache.org/mod_mbox/mahout-user/201212.mbox/%3CCALzSx%2BzOMYBod%3DspWgrsf4Cenqzv%3DnSnsALUP%3DRt%3DXQe6e6SVQ%40mail.gmail.com%3E
[2] https://github.com/dfilimon/knn

dfilimon, Dec 18 '12 10:12

Thank you for your suggestions, Filimon! Concurrently updating the index is more difficult to implement than expected, so we have temporarily settled on a workaround. When samples are added, a background thread builds a new searcher to replace the old one. It is costly, but it works. We will continue to develop the concurrent version and benchmark it.

openwzdh, Jan 10 '13 02:01
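
The rebuild-and-replace workaround described in the last comment could look roughly like the sketch below. This is an illustration under stated assumptions, not the poster's actual code: `SnapshotSearcher` and `SwappingSearcher` are hypothetical classes, the brute-force search is a stand-in for the real index, and `double[]` replaces Mahout's `Vector`. Searches read an immutable snapshot through an `AtomicReference`, while a single background thread builds each replacement and swaps it in.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a searcher: an immutable snapshot, searched by brute force.
final class SnapshotSearcher {
    private final List<double[]> vectors;

    SnapshotSearcher(List<double[]> vectors) {
        // Copy so later additions never touch a snapshot that is being searched.
        this.vectors = Collections.unmodifiableList(new ArrayList<double[]>(vectors));
    }

    int size() {
        return vectors.size();
    }

    // Nearest vector by squared Euclidean distance, or null if the snapshot is empty.
    // Assumes every stored vector has at least query.length components.
    double[] nearest(double[] query) {
        double[] best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (double[] v : vectors) {
            double d = 0.0;
            for (int i = 0; i < query.length; i++) {
                double diff = v[i] - query[i];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = v;
            }
        }
        return best;
    }
}

final class SwappingSearcher {
    // Only ever touched on the single rebuilder thread, so it needs no lock.
    private final List<double[]> all = new ArrayList<double[]>();
    private final AtomicReference<SnapshotSearcher> current =
            new AtomicReference<SnapshotSearcher>(new SnapshotSearcher(all));
    private final ExecutorService rebuilder = Executors.newSingleThreadExecutor();

    // Queues the vector; the background thread rebuilds and swaps in a new searcher.
    Future<?> add(final double[] vector) {
        return rebuilder.submit(new Runnable() {
            @Override
            public void run() {
                all.add(vector);
                current.set(new SnapshotSearcher(all)); // full rebuild: costly but simple
            }
        });
    }

    // Searches always run against whichever snapshot is current when they start.
    double[] nearest(double[] query) {
        return current.get().nearest(query);
    }

    int size() {
        return current.get().size();
    }

    void shutdown() {
        rebuilder.shutdown();
    }
}
```

Each `add` here triggers a full rebuild, which matches the "it costs" caveat; batching several additions per rebuild would reduce that cost. A search that is already running keeps using the old snapshot, so new vectors become visible only after the swap.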