
Help needed: how to parallelize RcppAnnoy

Open jianshu93 opened this issue 3 years ago • 1 comments

Hello uwot developer,

This is the example I am using to test Annoy:

library(RcppAnnoy)
set.seed(123)

mnist <- snedata::download_mnist()
mnist_df <- mnist[,1:784]
mnist_df_train <- mnist_df[1:60000,]
mnist_df_search <- mnist_df[60001:70000,]

f <- 784
a <- new(AnnoyEuclidean, f)
nrow(mnist_df_train)
for (i in 1:nrow(mnist_df_train)) {
  a$addItem(i-1, as.numeric(as.matrix(mnist_df_train[i,])))
}
a$build(100)
for (i in 1:nrow(mnist_df_search)) {
  NN_out_i <- a$getNNsByVectorList(as.numeric(as.matrix(mnist_df_search[i,])), 5, -1, TRUE)
}

How can I parallelize the build and search steps? RcppAnnoy provides limited information on this.

Thanks,

Jianshu

jianshu93 avatar Feb 06 '22 03:02 jianshu93

The RcppAnnoy issue tracker would be a better place to ask for help about RcppAnnoy.

As far as I remember, while multi-threading is available for building indices (but not searching) in the Annoy C++ library itself, it is not exposed by RcppAnnoy. uwot does not use the multi-threaded index building. It uses C++ to do the search in parallel.
https://github.com/jlmelville/uwot/blob/master/src/nn_parallel.h and https://github.com/jlmelville/uwot/blob/master/src/nn_parallel.cpp may be of use to you if you want to write C++.

jlmelville avatar Feb 06 '22 06:02 jlmelville
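
A minimal C++ sketch of the general pattern the reply describes: build the index once, then let several threads search it, each handling its own block of queries. This is not uwot's code and does not use the Annoy API. `BruteForceIndex` and `parallel_search` are hypothetical names, and the brute-force search is only a stand-in so that the example compiles without external libraries. It assumes the index is never modified while the search is running and that every query has the same number of dimensions as the indexed items.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a built, read-only index (NOT the Annoy API).
// It performs an exact brute-force search so the sketch needs no extra library.
struct BruteForceIndex {
  std::vector<std::vector<float>> items;

  // Ids of the k items closest to `query` by squared Euclidean distance.
  // The method is const, so concurrent calls are safe while `items` is unchanged.
  std::vector<int> nearest(const std::vector<float>& query, std::size_t k) const {
    std::vector<std::pair<float, int>> dist;
    dist.reserve(items.size());
    for (std::size_t i = 0; i < items.size(); ++i) {
      float d = 0.0f;
      for (std::size_t j = 0; j < query.size(); ++j) {
        const float diff = items[i][j] - query[j];
        d += diff * diff;
      }
      dist.emplace_back(d, static_cast<int>(i));
    }
    k = std::min(k, dist.size());
    // Only the first k entries need to be in sorted order.
    std::partial_sort(dist.begin(), dist.begin() + k, dist.end());
    std::vector<int> ids(k);
    for (std::size_t i = 0; i < k; ++i) ids[i] = dist[i].second;
    return ids;
  }
};

// Search every query, splitting the queries into contiguous blocks,
// one block per worker thread.
std::vector<std::vector<int>> parallel_search(
    const BruteForceIndex& index,
    const std::vector<std::vector<float>>& queries,
    std::size_t k, std::size_t n_threads) {
  const std::size_t n = queries.size();
  std::vector<std::vector<int>> results(n);
  // Use at least one thread and no more threads than queries.
  n_threads = std::max<std::size_t>(1, std::min(n_threads, n));
  const std::size_t chunk = (n + n_threads - 1) / n_threads;

  std::vector<std::thread> workers;
  for (std::size_t t = 0; t < n_threads; ++t) {
    const std::size_t begin = t * chunk;
    const std::size_t end = std::min(n, begin + chunk);
    if (begin >= end) break;
    // Each worker writes only to its own slots of `results`, so no lock is needed.
    workers.emplace_back([&index, &queries, &results, k, begin, end] {
      for (std::size_t i = begin; i < end; ++i) {
        results[i] = index.nearest(queries[i], k);
      }
    });
  }
  for (auto& w : workers) w.join();
  return results;
}
```

Calling `parallel_search(index, queries, 5, 4)` would return, for each query, the ids of its 5 nearest items using up to 4 threads. Because each thread owns a disjoint range of the output, the result is the same as a serial loop over the queries, which makes the parallel version easy to check against the single-threaded one.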