Leonid Boytsov
Hi, thank you for the great repository! However, I found that for QA type classification, the code includes the sub-type in the training/testing data. Needless to say it's a perfect...
The function checkAndAddResults should treat ties, i.e., points at equal distances from the query, more carefully. Currently, if you have the k-th element at distance d, and you call CheckAndAddResult(...,d)...
Instead of
```
#if defined(__GNUC__)
#define PORTABLE_ALIGN16 __attribute__((aligned(16)))
#else
#define PORTABLE_ALIGN16 __declspec(align(16))
#endif
```
use [alignas](http://en.cppreference.com/w/cpp/language/alignas)

**More info about alignment can be found here**: https://thenewcpp.wordpress.com/2012/11/02/alignment-support/
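A minimal sketch of what the `alignas` replacement could look like in C++11 and later. The names `Float4Buffer`, `isAligned16`, and `sum4` are hypothetical and are not taken from the code base; they only illustrate the specifier on a type and on a local array.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type: alignas(16) plays the role of PORTABLE_ALIGN16.
struct alignas(16) Float4Buffer {
    float v[4];
};

// The requested alignment becomes part of the type and can be checked at compile time.
static_assert(alignof(Float4Buffer) == 16, "Float4Buffer must be 16-byte aligned");

// Returns true if p lies on a 16-byte boundary.
inline bool isAligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Sums four floats through a 16-byte-aligned temporary,
// the kind of buffer an aligned SIMD store would write into.
inline float sum4(const float* src) {
    alignas(16) float tmp[4];  // replaces: float PORTABLE_ALIGN16 tmp[4];
    for (int i = 0; i < 4; ++i) tmp[i] = src[i];
    assert(isAligned16(tmp));  // debug-only sanity check
    return (tmp[0] + tmp[1]) + (tmp[2] + tmp[3]);
}
```

Because `alignas` is a standard specifier, the compiler-specific `#if` branch is no longer needed.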
For short vectors our SIMD implementations aren't optimal, in particular, because we use dim % 4 scalar operations to compute the tail. This alone requires an expensive integer division....
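One possible way to split a vector into a 4-wide main part and a scalar tail without an explicit remainder operation is a bit mask. The sketch below is an assumption about how this could be done, not the library's implementation: `l2SqrUnrolled` is a hypothetical name, and the four scalar accumulators stand in for one 4-float SIMD register so that the example stays portable.

```cpp
#include <cassert>
#include <cstddef>

// Squared L2 distance between a and b (hypothetical helper, plain C++ instead of intrinsics).
inline float l2SqrUnrolled(const float* a, const float* b, std::size_t dim) {
    assert(a != nullptr && b != nullptr);

    // Largest multiple of 4 that is <= dim, computed with a mask instead of dim % 4.
    const std::size_t dim4 = dim & ~static_cast<std::size_t>(3);

    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    std::size_t i = 0;

    // Main loop: four independent lanes per iteration.
    for (; i < dim4; i += 4) {
        const float d0 = a[i]     - b[i];
        const float d1 = a[i + 1] - b[i + 1];
        const float d2 = a[i + 2] - b[i + 2];
        const float d3 = a[i + 3] - b[i + 3];
        s0 += d0 * d0;
        s1 += d1 * d1;
        s2 += d2 * d2;
        s3 += d3 * d3;
    }

    float sum = (s0 + s1) + (s2 + s3);

    // Tail: at most three scalar iterations.
    for (; i < dim; ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}
```

For very short vectors the tail can still dominate the cost, so padding the data to a multiple of four is another option worth measuring.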
1. Implement a better batch-based balanced multi-threaded brute-force search
2. For contiguous data use a more efficient approach to access data in an array-like fashion (rather than through Object)
Quite surprisingly, the following lines in the method baseSearchAlgorithmV1Merge cause a stable but mysterious crash under Linux/icc:
```
_mm_prefetch((char *)(*iter)->getData(), _MM_HINT_T0);
_mm_prefetch((char *)(massVisited + curId), _MM_HINT_T0);
```
Any slight modifications of...
HNSW violates C++ alignment rules when building an optimized index. This could cause crashes in some cases.
```
/home/leo/SourceTreeGit/nmslib.dev/similarity_search/include/method/hnsw.h:333:15: warning: cast from 'char *' to 'int *' increases required...
```
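A common way to avoid this class of warning is to copy the bytes with `std::memcpy` instead of dereferencing a cast pointer. The sketch below illustrates the technique only; `loadIntUnaligned` and `storeIntUnaligned` are hypothetical helpers, not functions from hnsw.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Reads an int stored at an arbitrary byte offset of a raw buffer.
// std::memcpy has no alignment requirement, unlike *(int*)(buf + byteOffset).
inline int loadIntUnaligned(const char* buf, std::size_t byteOffset) {
    assert(buf != nullptr);
    int value;
    std::memcpy(&value, buf + byteOffset, sizeof(value));
    return value;
}

// Writes an int at an arbitrary byte offset of a raw buffer.
inline void storeIntUnaligned(char* buf, std::size_t byteOffset, int value) {
    assert(buf != nullptr);
    std::memcpy(buf + byteOffset, &value, sizeof(value));
}
```

Compilers typically turn such fixed-size `memcpy` calls into a single load or store on platforms that permit unaligned access, so the change is usually cheap.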
Try to create for 100K records and then run with 10K.