hnswlib icon indicating copy to clipboard operation
hnswlib copied to clipboard

Question--time complexity for high LID dataset

Open jianshu93 opened this issue 2 years ago • 1 comments

Hello hnswlib Team,

I am wondering for high Local Intrinsic dimension dataset, complexity for database building is still O(N*log(N))? Another question I have is, from a HNSW graph structure, we can extract top k best neighbors for each database element very efficiently right (from the bottom layer)? and compare to k NN graph (O(dn^t), 1<t<2)?

Thanks,

Jianshu

jianshu93 avatar Jul 03 '22 17:07 jianshu93

Hi @jianshu93,

  1. This is discussed in HNSW paper. The O(N*log(N)) is asymptotic scaling under (IMO) reasonable assumptions. That means that to observe this scaling on high LID datasets one might have to increase N to astronomical values to observe the expected scaling, similar to other algorithms with logarithmic scalability (e.g. kd-tree).
  2. Yeah. HNSW has an approximation of relative neighborhood graph which is very similar to the kNN graph (and better for routing), so extracting nearest neighbor should be a cheap operation there.

yurymalkov avatar Jul 05 '22 20:07 yurymalkov