M. Yusuf Sarıgöz
Thank you @lucidrains, kudos to you for the awesome work. Unlike Faiss, which works as a library, Qdrant is a complete vector DB with support for distributed mode, filterable payload of...
Yeah, fair enough. We can implement support for them on top of that interface. Btw, I'm not a licensing expert but Qdrant's Apache 2.0 license should also be fine in...
Running computation with the Android NN API requires a compute backend for it. It's possible to devote some effort to developing such a backend if enough interest in adoption...
@ggerganov Would it be of interest to introduce Android NN API as a new backend? To be on the same page:
- [Android NN API](https://developer.android.com/ndk/guides/neuralnetworks) is a C library...
> It's definitely of interest.

Great. I'll give it a test drive this week.

> Best option would be if the Android API allows implementation of custom kernels,

Its support...
@BarfingLemurs I dug into the NPU specs, but it turned out that the NPU's support for custom ops is limited. It supports a set of pre-defined quantization / dequantization ops...
Hey @dfiru, thanks for reaching out. Great to have you here from Quadric! I believe that GPNPU's approach is the right one, and I'm a volunteer and definitely willing to explore...
Yes, Vulkan is the recommended approach: https://developer.android.com/ndk/guides/graphics/getting-started Apparently, the Nomic team implemented a Vulkan backend for gpt4all, but I'm not sure about the compatibility of their custom license.
Yes, there was a bug in batch inference. I'll expose it to Python once I'm sure that it's completely fixed. We can also support writing directly to a NumPy...
> # embedding as an object that has save_to_npy() and to_list() methods

Sounds reasonable. Let me have a look.

> also let me know if you're planning something else in...
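For reference, the interface quoted above (an embedding object with `save_to_npy()` and `to_list()`) could look roughly like the sketch below. This is only an illustration under my own assumptions: the `Embedding` class name, the float32 storage, and the constructor signature are hypothetical and not the library's actual API; only the two method names come from the quoted suggestion.

```python
import os
import tempfile

import numpy as np


class Embedding:
    """Hypothetical wrapper around a single embedding vector (not a real library API)."""

    def __init__(self, values):
        # Assumption: embeddings are stored as a contiguous float32 array.
        self._values = np.ascontiguousarray(values, dtype=np.float32)

    def to_list(self):
        """Return the vector as a plain Python list of floats."""
        return self._values.tolist()

    def save_to_npy(self, path):
        """Write the vector to `path` in NumPy's .npy format."""
        np.save(path, self._values)


# Usage: build an embedding, convert it to a list, and round-trip it through a .npy file.
emb = Embedding([0.5, 1.5, -2.0])
as_list = emb.to_list()
npy_path = os.path.join(tempfile.mkdtemp(), "embedding.npy")
emb.save_to_npy(npy_path)
restored = np.load(npy_path)
```

Keeping the NumPy array private and exposing only conversion methods would leave room to swap the backing storage later without changing callers.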