

What's NNext

[Say: “NextDB”]

NNextDB is a blazingly fast ⚡️ neural search engine 🧠 to power your AI/ML apps 🦾. Deploy NNextDB on-premises or in the cloud in minutes.

Installation | Contributing | Getting Started | Connecting To NNext

Installation

For detailed installation instructions, please see the Installation guide.

NNext is supported on:

  • Docker: By far the easiest way to get up and running, assuming you have Docker installed. Pull the nnext:latest image from Docker Hub.
  • Build from Source: Build and install nnext from source using cmake & gcc/g++. Please follow the Compilation guide.
  • Debian/Ubuntu: Install NNext on Ubuntu using the Debian package manager.
  • MacOS: Install via Homebrew.
  • Kubernetes 🚧 WIP 🚧: Create a NNext service on a Kubernetes cluster.
  • Terraform + Kubernetes 🚧 WIP 🚧: Create a NNext service via Terraform on a Kubernetes cluster.
  • Terraform + NNext Cloud 🚧 WIP 🚧: Provision a cluster on NNext's cloud via Terraform.
  • Windows 🚧 WIP 🚧: Not really supported; for development purposes only.

Quick Start

Here's a quick example showcasing how you can create an index, insert vectors/documents into it, and search them on NNext.

Let's begin by installing the NNext server using docker.

docker run -p 6040:6040 -v /tmp/nnext/data:/data nnext/nnext:latest --data-dir /data

You should see output like this:

...
...
[2022-04-27 13:02:10.029] [info] 🏁 Started NNext at ▸ 127.0.0.1:6040
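
Before installing the client, you can optionally sanity-check that the server is listening on port 6040. This is a minimal sketch using only the Python standard library (a plain TCP connect, not an NNext API call):

import socket

# Simple reachability check against the default NNext port.
with socket.create_connection(('127.0.0.1', 6040), timeout=2):
    print('NNext is accepting connections on 127.0.0.1:6040')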

Install the Python client for NNext:

pip install nnext

We can now initialize the client and create an index:

import nnext
import numpy as np
from nnext import _and, _eq, _gte, _in  # not used in this quick example

# Create and initialize the vector client.
nnclient = nnext.Client(
    nodes=[
        {'host': 'localhost', 'port': '6040'}
    ])
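
If you are running more than one NNext node, the same nodes list can presumably take additional entries in the same format; a minimal sketch (the second host below is a placeholder, not a documented endpoint):

# Hypothetical multi-node configuration; the second host/port is a
# placeholder used only for illustration.
nnclient = nnext.Client(
    nodes=[
        {'host': 'localhost', 'port': '6040'},
        {'host': '10.0.0.2', 'port': '6040'}
    ])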

Broadly speaking, you can create two types of indices:

1. Simple indices

n_dim = 768

# Create a vector index.
nnindex = nnclient.index.create(
    d=n_dim,
    name='test_ANN')

n_vecs = 1000
k = 5
n_queries = 10
vectors = np.random.rand(n_vecs, n_dim)

# Insert vectors into the index.
nnindex.add(vectors)

# Create a query vector set.
q_vectors = np.random.rand(n_queries, n_dim)

# Now search the vectors.
_idx, _res = nnindex.search(q_vectors, k)  # search

# The search operation returns a tuple of vectors and optionally the data
# associated with the vectors.
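
To make the result shapes concrete, here is a minimal sketch of inspecting the output, assuming _idx holds integer row indices into the inserted vectors array, one row of k neighbors per query:

# Assumption: _idx is an integer array of shape (n_queries, k) whose
# entries index into `vectors` in insertion order.
print(_idx.shape)          # expected: (10, 5) for n_queries=10, k=5

# Look up the nearest stored vectors for the first query.
nearest = vectors[_idx[0]]
print(nearest.shape)       # expected: (5, 768)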

Documentation

All NNext Server and Client documentation, including pynext integration articles and helpful recipes, can be found at:

🚧 WIP 🚧
https://nnext.ai/docs/

FAQs

How does this differ from Faiss, ScaNN and Annoy?

First of all, NNext uses Faiss under the hood. The main thing to note is that these libraries ship as Python packages installable via pip or Conda. They are very easy to use, from installation to the API. However, while they let you get started quickly, they don't provide persistence, index growth or high availability. If your application goes down for whatever reason, so do your search indices and data.

How does this differ from Milvus?

Milvus is a large piece of software that takes a non-trivial amount of effort to set up, administer, scale and fine-tune. It offers a few thousand configuration parameters for reaching your ideal configuration. So it's better suited to large teams who have the bandwidth to get it production-ready, regularly monitor it and scale it, especially when they need to store billions of documents and petabytes of data (e.g. logs).

NNext is built specifically to decrease the "time to market" for a delightful nearest-neighbor search experience. It is a lightweight yet powerful & scalable alternative that focuses on Developer Happiness and Experience, with a clean, well-documented API, clear semantics and smart defaults, so it just works well out-of-the-box without you having to turn many knobs.

See a side-by-side feature comparison here.

How does this differ from other fully managed solutions like Pinecone?

In brief - **no vendor lock-in**. Tired of using NNext Cloud? Pack up your vectors and go. Obviously we don't want you to go, but if you have to, NNext Cloud allows you to download a compressed zip file containing the latest backup of your vectors to your machine. These vectors can then be used with another installation of NNext on-premises or on another cloud provider.

Pinecone is a proprietary, hosted, nearest-neighbour search-as-a-service product that works well when cost is not an issue. However, fast-growing applications will quickly run into search & indexing limits, accompanied by expensive plan upgrades as they scale.

NNext, on the other hand, is an open-source product that you can run on your own infrastructure or use through our managed SaaS offering, NNext Cloud. The open-source version is free to use (besides, of course, your own infra costs). With NNext Cloud we do not charge by records or search operations. Instead, you get a dedicated cluster and you can throw as much data and traffic at it as it can handle. You only pay a fixed hourly cost & bandwidth charges for it, depending on the configuration you choose, similar to most modern cloud platforms.

From a product perspective, NNext is closer in spirit to Jina.ai than Pinecone.

See a side-by-side feature comparison here.

Why the Elastic License 2.0?

NNext Server is **source available**, **server software**, and we expect users to typically run it as a separate daemon rather than integrate it with their own code. The Elastic License 2.0 (ELv2) covers and allows for this use case **generously**. We aim to set the minimum limitations necessary to strike a fair balance between freedom to use, share and change the software, and preventing actions that will harm the community.

If you have specifics that prevent you from using NNext due to a licensing issue, we're happy to explore this topic further with you. Please reach out to us at [email protected].

I heard Elasticsearch and OpenSearch were planning on implementing ANN Search?

Fundamentally, Elasticsearch and its variants run on the JVM, which by itself can take quite an effort to tune to run optimally. NNext, on the other hand, is a single lightweight, self-contained native binary, so it's simple to set up and operate. Furthermore, ANN search on Elasticsearch runs as a secondary process, a sidecar, which is not natively supported by the main indexing engine.

Who is NNext?

NNext builds open-source ML-Ops software to help make development and deployment of machine learning applications painless.

Contributing

Introduction

First off, 🙏🏾 thank you for considering contributing to nnext. We value community contributions!

How can you help?

You may already know what you want to contribute -- a fix for a bug you encountered, or a new feature your team wants to use.

If you don't know what to contribute, keep an open mind! Here are some examples of helpful contributions that mean less work for you:

  • Improving documentation
  • Triaging bugs
  • Writing tutorials

Check out the guide to contributing to learn more.