PyBHV

Boolean Hypervectors with various operators for experiments in hyperdimensional computing (HDC).

Python Boolean Hyper-Vectors

A rich research framework for hyperdimensional computing on large boolean vectors, supporting program transformation and multiple backends for computation (plain Python, C++, NumPy, PyTorch). Many metrics and utility functions aim to aid the intuitive understanding of this new paradigm, and multiple levels of functionality are available, from data marshalling and the basic (XOR, MAJ, PERMUTE)-algebra to cryptography support. All vector operations are implemented in C(++) and make use of bit-packing and SIMD. Subprograms can be optimized and compiled to these operations in Python or C. Parallelization and pipelining are planned.
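To make the (XOR, MAJ, PERMUTE)-algebra concrete, here is a minimal sketch that uses plain Python integers as bit-packed vectors. It does not use the bhv API; the helper names (`rand`, `xor`, `maj3`, `permute`, `hamming`) are hypothetical and chosen for illustration only, and the permutation is assumed to be a simple cyclic roll.

```python
import random

DIMENSION = 8192                 # the default width mentioned under Installation
MASK = (1 << DIMENSION) - 1      # keeps results within DIMENSION bits

def rand(rng):
    """A uniformly random vector, stored as one big integer."""
    return rng.getrandbits(DIMENSION)

def xor(a, b):
    """Binding: XOR is its own inverse, so xor(xor(a, b), b) == a."""
    return a ^ b

def maj3(a, b, c):
    """Bundling: bitwise majority of three vectors."""
    return (a & b) | (a & c) | (b & c)

def permute(a, shift=1):
    """One simple permutation: a cyclic roll by `shift` positions."""
    shift %= DIMENSION
    return ((a << shift) | (a >> (DIMENSION - shift))) & MASK

def hamming(a, b):
    """Number of positions in which a and b differ."""
    return bin(a ^ b).count("1")

rng = random.Random(0)
a, b, c = rand(rng), rand(rng), rand(rng)

bound = xor(a, b)          # unrelated to both a and b
bundle = maj3(a, b, c)     # stays close to each of a, b and c
rolled = permute(a, 3)     # unrelated to a, but exactly invertible

print(hamming(a, b))        # roughly DIMENSION / 2 for independent vectors
print(hamming(bundle, a))   # roughly DIMENSION / 4
```

The majority differs from `a` only where `b` and `c` agree with each other and disagree with `a`, which happens for about a quarter of the bits; that is why the bundle remains recognisably related to each of its inputs.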

If your application is a relatively direct pipeline, take a look at HDCC. If you want a more stable library, or want to work with another base field than the booleans, use torch-hd. For C/C++, please see CBHV.

Overview

The fundamental research includes finding algebras with interesting properties on top of large boolean vectors. To this end, the library has laws used for testing and an expansive set of operators, including:

  • Multiple types of fast random vector generation
  • Random and indexed select between vectors
  • Ability to slightly modify a vector, for example by flipping a fraction of its bits
  • Permutation, roll, and swapping with multiple interfaces
  • Hashing and encoding
  • Majority with multiple implementations
  • Sample, a cheap alternative to Majority
  • AND, OR, XOR, and NOT operators
  • Composite operations like SELECT (or MUX) and FLIP-FRAC (flipping a fraction of the bits)
  • Hamming, jaccard, cosine, bit-error-rate, tversky, and mutual-information metrics
  • A system for relatedness, unrelatedness, and standard deviations apart
  • zscore and pvalue
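A few of the operators and metrics listed above can be sketched in plain Python, again on integers rather than through the bhv API. The function names and the sign convention of `std_apart` are assumptions made for this illustration; the library's own definitions may differ.

```python
import math
import random

DIMENSION = 8192

def hamming(a, b):
    """Number of differing bits."""
    return bin(a ^ b).count("1")

def bit_error_rate(a, b):
    """Hamming distance normalised to the range [0, 1]."""
    return hamming(a, b) / DIMENSION

def std_apart(a, b):
    """Deviation of the Hamming distance from the DIMENSION / 2 expected for
    independent random vectors, in units of the binomial standard deviation
    sqrt(DIMENSION) / 2. Strongly negative values mean 'related'."""
    return (hamming(a, b) - DIMENSION / 2) / (math.sqrt(DIMENSION) / 2)

def flip_frac(a, frac, rng):
    """Flip exactly round(frac * DIMENSION) randomly chosen bits of a."""
    mask = 0
    for position in rng.sample(range(DIMENSION), round(frac * DIMENSION)):
        mask |= 1 << position
    return a ^ mask

def select(cond, when1, when0):
    """MUX: take bits from when1 where cond is 1, and from when0 elsewhere."""
    return (cond & when1) | (~cond & when0)

rng = random.Random(42)
a = rng.getrandbits(DIMENSION)
b = rng.getrandbits(DIMENSION)
noisy = flip_frac(a, 0.1, rng)

print(bit_error_rate(a, noisy))   # about 0.1 by construction
print(std_apart(a, noisy))        # dozens of standard deviations below chance
print(std_apart(a, b))            # within a few standard deviations of 0
```

Because the distance between independent vectors concentrates tightly around DIMENSION / 2, even a vector with a tenth of its bits flipped is unmistakably related to the original.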

Additionally, the library provides:

  • Convenient utilities for probabilistic distances
  • A symbolic implementation with simplification, analysis, plotting and pretty printing
  • A native C++ implementation
  • Law and unification backed expression simplification
  • Compilation to operation sequences (circuits)
  • Efficient bit-packed representation (saves 8x memory compared to the traditional NumPy and PyTorch bool!)
  • Three redundant implementations on NumPy for performance and correctness
  • A (performant) plain Python implementation
  • A minimal abstraction for permutations with caching and composition
  • Very basic embeddings for other datatypes (more to come)
  • Graph visualization of distances in hyperdimensional space (see example)
  • Boolean expression and network synthesis (e.g. Cellular Automata and perfectly random functions)
  • Visualization and storage via pbm (e.g. Cellular Automata)
  • A normal form and conversions between different implementations and storage methods
  • Linear and adiabatic variants and example
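The 8x figure for the bit-packed representation follows from storing eight bits per byte instead of one boolean per byte. The sketch below shows the idea with the standard library only; `pack` and `unpack` are hypothetical helpers, not the library's storage routines, and the bit order within a byte is an arbitrary choice made here.

```python
import random

DIMENSION = 8192

def pack(bits):
    """Pack a sequence of 0/1 values (one per byte) into len(bits) // 8 bytes."""
    out = bytearray(len(bits) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)

def unpack(packed):
    """Inverse of pack: expand every byte back into eight 0/1 values."""
    return bytes((byte >> j) & 1 for byte in packed for j in range(8))

rng = random.Random(7)
unpacked = bytes(rng.getrandbits(1) for _ in range(DIMENSION))  # one byte per bit
packed = pack(unpacked)

print(len(unpacked), len(packed))   # 8192 1024
```

Packing also lets bitwise operators process many vector elements per machine word, which is where most of the speed of a packed representation comes from.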

Installation

Make sure you have a recent Python version; 3.10 is recommended.

pip install bhv

If you want a specific definition of "hyper" (the default is 8192) you can specify that as follows:

DIMENSION=512 pip install bhv

Note: use multiples of 512, preferably powers of 2.
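If you compute the dimension in a script before installing, a small check like the following can catch unsuitable values early. This is a hypothetical helper that only encodes the rule of thumb stated above; it is not part of the bhv package.

```python
def check_dimension(d):
    """Return (is_multiple_of_512, is_power_of_two) for a candidate dimension."""
    is_multiple_of_512 = d > 0 and d % 512 == 0
    is_power_of_two = d > 0 and (d & (d - 1)) == 0
    return is_multiple_of_512, is_power_of_two

print(check_dimension(8192))   # (True, True): the default
print(check_dimension(1536))   # (True, False): allowed, but not preferred
print(check_dimension(1000))   # (False, False): not a multiple of 512
```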

If you only want to work with plain Python, you're good to go with from bhv.vanilla import VanillaBHV as BHV.

For the native option, you need a modern C++ compiler and use from bhv.native import NativePackedBHV as BHV. The setup process should attempt to install this by default.

For interop with (the Python interface of) NumPy and PyTorch, you'll need pip install numpy or pip install torch, with from bhv.np import NumPyPacked64BHV as BHV or from bhv.pytorch import TorchBoolBHV as BHV respectively.
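Since the backends share the BHV alias, one way to write backend-agnostic scripts is to try the imports in order of preference and fall back when one is unavailable. The sketch below uses only the module and class names given above; the preference order, the `pick_backend` helper, and the assumption that a missing backend surfaces as an ImportError are all illustrative and not documented behaviour.

```python
import importlib

# (module, class) pairs from the installation notes, in an assumed order of preference.
CANDIDATES = [
    ("bhv.native", "NativePackedBHV"),
    ("bhv.np", "NumPyPacked64BHV"),
    ("bhv.vanilla", "VanillaBHV"),
]

def pick_backend(candidates=CANDIDATES):
    """Return the first class that can be imported, or None if none can."""
    for module_name, class_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # backend (or one of its dependencies) is not installed
        cls = getattr(module, class_name, None)
        if cls is not None:
            return cls
    return None

BHV = pick_backend()
print(BHV.__name__ if BHV is not None else "no bhv backend available")
```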

Getting started

Here are some resources to get started with the library. If you're looking for a broader intro, please take a look at hd-computing.com.

New to Hyperdimensional Computing

Basic uses (in the context of neo-GOFAI) are given in my presentation with an installation-free notebook.

The fundamental angle is to start with Kanerva's initial paper together with the library. For that, multiple resources are provided:

  • A notebook going over the very basics
  • The grandmother example
  • A guide to picking metrics

As for a Machine Learning angle, you may enjoy:

  • A minimal implementation of classification based on the majority operator on a minimal problem
  • TODO replace image classification notebook
  • Graph classification notebook re-implementing GraphHD
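To give a flavour of majority-based classification before opening the notebooks, here is a rough, self-contained sketch on a toy problem. It is not the implementation from the linked example and does not use the bhv API: the data is synthetic, and all names (`noisy`, `majority`, `classify`) are hypothetical.

```python
import random

DIMENSION = 1024  # kept small so the pure-Python majority stays fast

def hamming(a, b):
    return bin(a ^ b).count("1")

def noisy(v, frac, rng):
    """Copy of v with exactly round(frac * DIMENSION) random bits flipped."""
    mask = 0
    for position in rng.sample(range(DIMENSION), round(frac * DIMENSION)):
        mask |= 1 << position
    return v ^ mask

def majority(vectors):
    """Bitwise majority of an odd number of vectors."""
    threshold = len(vectors) // 2
    out = 0
    for i in range(DIMENSION):
        if sum((v >> i) & 1 for v in vectors) > threshold:
            out |= 1 << i
    return out

rng = random.Random(1)
# Two hidden class vectors; training examples are noisy copies of them.
hidden = {"A": rng.getrandbits(DIMENSION), "B": rng.getrandbits(DIMENSION)}
train = {label: [noisy(v, 0.3, rng) for _ in range(9)] for label, v in hidden.items()}

# "Training" is a single majority per class.
prototypes = {label: majority(examples) for label, examples in train.items()}

def classify(x):
    """Label of the prototype nearest to x in Hamming distance."""
    return min(prototypes, key=lambda label: hamming(x, prototypes[label]))

print(classify(noisy(hidden["A"], 0.3, rng)))   # A
print(classify(noisy(hidden["B"], 0.3, rng)))   # B
```

The majority averages out the per-example noise, so each prototype ends up much closer to its hidden class vector than any single training example is.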

Evaluating the library

If you'd like to dive into the code directly, I suggest the following entry points:

  • Finite State Machine example
  • The base class AbstractBHV
  • The most idiomatic implementation NumPyBoolBHV

Example exploratory usages of the library:

Note

This repository is under (highly) active development and is a work in progress. Do expect changes to the naming, and even features to be swapped for more elegant alternatives.

The codebase also works with PyPy. Use the vanilla Python implementation. The numeric operations are slower than on CPython, but the symbolic ones are way faster.

If you have any feedback, raise an informal issue, or email me at [email protected]

If the library is not as fast as possible, that's a bug; please report it.