weaviate icon indicating copy to clipboard operation
weaviate copied to clipboard

Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a c...

Weaviate Weaviate logo

The ML-first vector search engine

Build Status Go Report Card Coverage Status Slack Newsletter

Description

Weaviate in a nutshell: Weaviate is a vector search engine and vector database. Weaviate uses machine learning to vectorize and store data, and to find answers to natural language queries. With Weaviate you can also bring your custom ML models to production scale.

Weaviate in detail: Weaviate is a low-latency vector search engine with out-of-the-box support for different media types (text, images, etc.). It offers Semantic Search, Question-Answer-Extraction, Classification, Customizable Models (PyTorch/TensorFlow/Keras), and more. Built from scratch in Go, Weaviate stores both objects and vectors, allowing for combining vector search with structured filtering with the fault-tolerance of a cloud-native database, all accessible through GraphQL, REST, and various language clients.

Weaviate helps ...

  1. Software Engineers (docs) - Who use Weaviate as an ML-first database for your applications.

    • Out-of-the-box modules for: NLP/semantic search, automatic classification and image similarity search.
    • Easy to integrate into your current architecture, with full CRUD support like you're used to from other OSS databases.
    • Cloud-native, distributed, runs well on Kubernetes and scales with your workloads.
  2. Data Engineers (docs) - Who use Weaviate as a vector database that is built up from the ground with ANN at its core, and with the same UX they love from Lucene-based search engines.

    • Weaviate has a modular setup that allows you to use your ML models inside Weaviate, but you can also use out-of-the-box ML models (e.g., SBERT, ResNet, fasttext, etc).
    • Weaviate takes care of the scalability, so that you don't have to.
    • Deploy and maintain ML models in production reliably and efficiently.
  3. Data Scientists (docs) - Who use Weaviate for a seamless handover of their Machine Learning models to MLOps.

    • Deploy and maintain your ML models in production reliably and efficiently.
    • Weaviate's modular design allows you to easily package any custom trained model you want.
    • Smooth and accelerated handover of your Machine Learning models to engineers.

GraphQL interface demo

Demo of Weaviate

Weaviate GraphQL demo on news article dataset containing: Transformers module, GraphQL usage, semantic search, _additional{} features, Q&A, and Aggregate{} function. You can the demo on this dataset in the GUI here: semantic search, Q&A, Aggregate.

Features

Weaviate makes it easy to use state-of-the-art ML models while giving you the scalability, ease of use, safety and cost-effectiveness of a purpose-built vector database. Most notably:

  • Fast queries
    Weaviate typically performs a 10-NN neighbor search out of millions of objects in considerably less than 100ms.

  • Any media type with Weaviate Modules
    Use State-of-the-Art ML model inference (e.g. Transformers) for Text, Images, etc. at search and query time to let Weaviate manage the process of vectorizing your data for you - or import your own vectors.

  • Combine vector and scalar search
    Weaviate allows for efficient combined vector and scalar searches, e.g “articles related to the COVID 19 pandemic published within the past 7 days”. Weaviate stores both your objects and the vectors and make sure the retrieval of both is always efficient. There is no need for third party object storage.

  • Real-time and persistent
    Weaviate lets you search through your data even if it’s currently being imported or updated. In addition, every write is written to a Write-Ahead-Log (WAL) for immediately persisted writes - even when a crash occurs.

  • Horizontal Scalability
    Scale Weaviate for your exact needs, e.g. High-Availability, maximum ingestion, largest possible dataset size, maximum queries per second, etc. (Multi-Node sharding since v1.8.0, Replication under development)

  • Cost-Effectiveness
    Very large datasets do not need to be kept entirely in memory in Weaviate. At the same time available memory can be used to increase the speed of queries. This allows for a conscious speed/cost trade-off to suit every use case.

  • Graph-like connections between objects
    Make arbitrary connections between your objects in a graph-like fashion to resemble real-life connections between your data points. Traverse those connections using GraphQL.

Documentation

You can find detailed documentation in the developers section of our website or directly go to one of the docs using the links in the list below.

Additional material

Video

Reading

Examples

You can find code examples here

Support

Contributing