annoy-java

C# (.NET) port

Open pengowray opened this issue 8 years ago • 2 comments

Sorry, I'm not sure where to put this, but I thought I should mention that I wrote a C#/.NET port of this as part of my fork of Word2vec.Tools.

Pros / features:

  • appears to work correctly
  • no 2GB limit (tested with 5GB index file)
  • search_k support

To do / cons:

  • needs to be optimized (running gensim within a Docker image appears to be much faster)
  • doesn't yet use a Memory Mapped File
  • needs unit tests to verify correct results
  • needs performance tests (and compare to C and Java versions)
  • can only read an index; cannot create one (same as the Java version)
  • some messy "scaffolding" comments left over from the porting process need to be cleaned up / deleted
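For anyone picking up the memory-mapping item: a single Java `MappedByteBuffer` cannot cover more than `Integer.MAX_VALUE` bytes, so one common way to get both a memory-mapped file and no 2GB limit is to map the index as several fixed-size chunks. The following is only a minimal sketch of that idea in Java, not the code from the C# port or from annoy-java. The class name `ChunkedMappedFile`, the chunk size, and the little-endian byte order are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: read-only, chunked memory mapping of a file that may exceed 2 GB. */
final class ChunkedMappedFile {
    private final MappedByteBuffer[] chunks;
    private final long chunkSize;
    private final long length;

    ChunkedMappedFile(Path path, long chunkSize) throws IOException {
        // A chunk size that is a multiple of 4 means a 4-byte-aligned float never straddles two chunks.
        if (chunkSize <= 0 || chunkSize > Integer.MAX_VALUE || chunkSize % Float.BYTES != 0) {
            throw new IllegalArgumentException("chunkSize must be a positive multiple of 4 that fits in an int");
        }
        this.chunkSize = chunkSize;
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            this.length = channel.size();
            int count = (int) ((length + chunkSize - 1) / chunkSize);
            this.chunks = new MappedByteBuffer[count];
            for (int i = 0; i < count; i++) {
                long start = i * chunkSize;
                long size = Math.min(chunkSize, length - start);
                chunks[i] = channel.map(FileChannel.MapMode.READ_ONLY, start, size);
                // Assumption: the index was written on a little-endian machine.
                chunks[i].order(ByteOrder.LITTLE_ENDIAN);
            }
        } // The mappings stay valid after the channel is closed.
    }

    /** Reads a float at an absolute byte offset; the offset is assumed to be 4-byte aligned. */
    float getFloat(long offset) {
        if (offset < 0 || offset + Float.BYTES > length) {
            throw new IndexOutOfBoundsException("offset " + offset);
        }
        int chunk = (int) (offset / chunkSize);
        int within = (int) (offset % chunkSize);
        return chunks[chunk].getFloat(within);
    }

    long length() {
        return length;
    }
}
```

With something like `new ChunkedMappedFile(path, 1L << 30)`, a 5GB file would be covered by five mappings, and the 64-bit offset is split into a chunk number and a position inside that chunk on every read. In C#, `MemoryMappedFile` with 64-bit view offsets would be the place to start instead.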

I haven't had time to work on it for a while, so I thought I'd mention it in case anyone wants to pick up the project, is looking for a starting point for their own C# port, or wants a reference for backporting features to the Java version.

https://github.com/quole/Word2vec.Tools/blob/master/Word2vec.Tools/AnnoyIndex.cs

pengowray, Oct 24 '17 13:10

Is it binary compatible? I mean, can you load an index generated in annoy / annoy-java?

erikbern, Oct 24 '17 14:10

Yes, it can load an index from annoy. It can't create its own (same as annoy-java).

pengowray, Oct 25 '17 01:10