NNext
What's NNext
(pronounced “NextDB”)
NNext is a blazingly fast ⚡️ neural search engine 🧠 to power your AI/ML apps 🦾. Deploy NNext on premise or in the cloud in minutes.
Installation | Contributing | Getting Started | Connecting To NNext
Installation
For detailed installation instructions, please see the Installation guide.
NNext is supported on:

| Method | Notes |
| --- | --- |
| Docker | By far the easiest way to get up and running, assuming you have Docker installed. Pull the `nnext:latest` image from Docker Hub. |
| Build from Source | Build and install NNext from source using CMake and gcc/g++. Please follow the Compilation guide. |
| Debian/Ubuntu | Install NNext on Ubuntu using the Debian package manager. |
| macOS | Install via Homebrew. |
| Kubernetes | 🚧 WIP 🚧 Create an NNext service on a Kubernetes cluster. |
| Terraform + Kubernetes | 🚧 WIP 🚧 Create an NNext service via Terraform on a Kubernetes cluster. |
| Terraform + NNext Cloud | 🚧 WIP 🚧 Provision a cluster on NNext's cloud via Terraform. |
| Windows | 🚧 WIP 🚧 Not really supported; for development purposes only. |
Quick Start
Here's a quick example showcasing how you can create an index, insert vectors/documents, and search it with NNext.
Let's begin by installing the NNext server using Docker.
docker run -p 6040:6040 -v /tmp/nnext/data:/data nnext/nnext:latest --data-dir /data
You should see output like this:
...
...
[2022-04-27 13:02:10.029] [info] 🏁 Started NNext at ▸ 127.0.0.1:6040
Install the Python client for NNext:
pip install nnext
We can now initialize the client and connect to the NNext server:
import nnext
import numpy as np
from nnext import _and, _eq, _gte, _in
# Create and initialize the vector client.
nnclient = nnext.Client(
    nodes=[
        {'host': 'localhost', 'port': '6040'}
    ])
Broadly speaking, you can create two types of indices:
1. Simple indices
n_dim = 768
# Create a vector index.
nnindex = nnclient.index.create(
    d=n_dim,
    name='test_ANN')
n_vecs = 1000
k = 5
n_queries = 10
vectors = np.random.rand(n_vecs, n_dim)
# Insert vectors into the index.
nnindex.add(vectors)
# Create a query vector set.
q_vectors = np.random.rand(n_queries, n_dim)
# Now search the vectors.
_idx, _res = nnindex.search(q_vectors, k)
# The search operation returns a tuple of vectors and optionally the data
# associated with the vectors.
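To make the return values concrete, here is a hypothetical brute-force version of the same search in plain NumPy (an illustration, not NNext's actual implementation): it shows what the tuple of neighbor indices and their distances corresponds to.

```python
import numpy as np

def brute_force_search(index_vecs, query_vecs, k):
    """Exact k-nearest-neighbor search by squared L2 distance.

    Returns (indices, distances), analogous to the tuple an
    ANN index returns, but computed by exhaustive comparison.
    """
    # Pairwise squared L2 distances, shape (n_queries, n_vecs).
    diffs = query_vecs[:, None, :] - index_vecs[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)
    # Keep the k smallest distances per query, in ascending order.
    idx = np.argsort(dists, axis=1)[:, :k]
    return idx, np.take_along_axis(dists, idx, axis=1)

vectors = np.random.rand(1000, 64)
queries = np.random.rand(10, 64)
idx, dist = brute_force_search(vectors, queries, k=5)
```

An approximate index trades the exhaustive comparison above for a sublinear search structure, which is why it scales to millions of vectors.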
Documentation
All NNext Server and Client documentation, including pynext integration articles and helpful recipes, can be found at:
🚧 WIP 🚧
https://nnext.ai/docs/
FAQs
How does this differ from Faiss, ScaNN and Annoy?
First of all, NNext uses Faiss under the hood. The main thing to note about these libraries is that they come as Python packages installable via pip or Conda. They are very easy to use, from installation to the API. However, while they let you get started quickly, they don't provide persistence, index growth, or high availability. If your application goes down for whatever reason, so do your search indices and data.
How does this differ from Milvus?
Milvus is a large piece of software that takes a non-trivial amount of effort to set up, administer, scale, and fine-tune. It offers a few thousand configuration parameters for reaching your ideal configuration. It is therefore better suited for large teams that have the bandwidth to get it production-ready, monitor it regularly, and scale it, especially when they need to store billions of documents and petabytes of data (e.g. logs).
NNext is built specifically to decrease the "time to market" for a delightful nearest-neighbor search experience. It is a lightweight yet powerful and scalable alternative that focuses on developer happiness and experience, with a clean, well-documented API, clear semantics, and smart defaults, so it just works well out of the box without you having to turn many knobs.
See a side-by-side feature comparison here.
How does this differ from other fully managed solutions like Pinecone?
In brief: **no vendor lock-in**. Tired of using NNext Cloud? Pack up your vectors and go. Obviously we don't want you to go, but if you have to, NNext Cloud lets you download a compressed zip file containing the latest backup of your vectors. These vectors can then be used with another installation of NNext on premise or on another cloud provider.
Pinecone is a proprietary, hosted, nearest-neighbor search-as-a-service product that works well when cost is not an issue. However, fast-growing applications will quickly run into search and indexing limits, accompanied by expensive plan upgrades as they scale.
NNext, on the other hand, is an open-source product that you can run on your own infrastructure, or you can use our managed SaaS offering, NNext Cloud. The open-source version is free to use (besides, of course, your own infra costs). With NNext Cloud we do not charge by records or search operations. Instead, you get a dedicated cluster, and you can throw as much data and traffic at it as it can handle. You only pay a fixed hourly cost plus bandwidth charges, depending on the configuration you choose, similar to most modern cloud platforms.
From a product perspective, NNext is closer in spirit to Jina.ai than Pinecone.
See a side-by-side feature comparison here.
Why the Elastic License 2.0?
NNext Server is **source-available** **server software**, and we expect users to typically run it as a separate daemon rather than integrate it with their own code. The Elastic License 2.0 (EL2) covers and allows for this use case **generously**. We aim to set the minimum limitations necessary to strike a fair balance between the freedom to use, share, and change the software, and preventing actions that would harm the community.
If a licensing issue prevents you from using NNext, we're happy to explore the topic further with you. Please reach out to us at [email protected].
I heard Elasticsearch and OpenSearch were planning to implement ANN search?
Fundamentally, Elasticsearch and its variants run on the JVM, which by itself can take quite an effort to tune for optimal performance. NNext, on the other hand, is a single lightweight, self-contained native binary, so it's simple to set up and operate. Furthermore, ANN search on Elasticsearch runs as a secondary process, a sidecar, which is not natively supported by the main indexing engine.
Who is NNext?
NNext builds open-source ML-Ops software to help make development and deployment of machine learning applications painless.
Contributing
Introduction
First off, 🙏🏾 thank you for considering contributing to NNext. We value community contributions!
How can you help?
You may already know what you want to contribute: a fix for a bug you encountered, or a new feature your team wants to use.
If you don't know what to contribute, keep an open mind! Here are some examples of helpful contributions that mean less work for you:
- Improving documentation
- Bug triaging
- Writing tutorials
Check out the guide to contributing to learn more.