Andrew Kane
Added an `l2_normalize` function in the commit above (for both `vector` and `halfvec`), which will be included in 0.7.0 (#508). Thanks for the suggestion @ataudt.
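For readers unfamiliar with the operation, L2 normalization divides a vector by its Euclidean norm so the result has unit length. The following is a rough Python sketch of the math only, not pgvector's C implementation; the name `l2_normalize_sketch` is hypothetical, and returning a zero vector unchanged is an assumption made here to avoid dividing by zero.

```python
import math


def l2_normalize_sketch(vec):
    """Return vec scaled to unit Euclidean length (hypothetical helper)."""
    # Euclidean (L2) norm: square root of the sum of squared components.
    norm = math.sqrt(sum(x * x for x in vec))
    # Assumption: a zero vector has no direction, so return it unchanged.
    if norm == 0:
        return list(vec)
    return [x / norm for x in vec]


if __name__ == "__main__":
    unit = l2_normalize_sketch([3.0, 4.0])
    print(unit)
```

After normalization, the inner product of two vectors equals their cosine similarity, which is the usual reason to normalize before indexing.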
Hi @bohanliu5, thanks for the PR! I really appreciate all the work. It looks like this introduces a lot of complexity to the code. I think there's some that can...
Makes sense to me. I think the max memory needed would be `total memory needed for a HNSW_MAX_DIM element * (relation pages + TOAST pages)`. Edit: Never mind, needs to be...
Yeah, the initial idea sounds simpler. Do you want to take this, or should I?
It should be possible to check the column storage with `TupleDescAttr(tupdesc, attnum)->attstorage`. Existing vectors could have different storages, but it's probably not a common case. I tried using DSA in...
Thanks for working on getting that fixed. I'm a bit hesitant to make a big change like this. Creating a large segment doesn't seem to immediately show up in memory...
Hi @seancarroll, thanks for the suggestion. Added an initial version (for exact search) in the [hamming-distance branch](https://github.com/pgvector/pgvector/compare/hamming-distance).
It looks like it's currently possible to do exact search with bit strings (without pgvector), fwiw.

```sql
CREATE FUNCTION hamming_distance(a varbit, b varbit) RETURNS float8 LANGUAGE SQL IMMUTABLE STRICT PARALLEL...
```
It might be worth trying exact search depending on the size of your data.

```sql
CREATE TABLE items (id bigserial PRIMARY KEY, image_hash bit(256)); INSERT INTO items (image_hash) SELECT i::bit(256)...
```
The above query does similarity search for Hamming distance on 256 bit vectors (which is the hash size produced by PDQ). It gets the 10 nearest neighbors, but you could...
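Outside the database, the same exact-search idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: hashes are held as Python integers (one per 256-bit hash), and the function names `hamming_distance` and `nearest_by_hamming` are hypothetical helpers, not part of pgvector or Postgres.

```python
import heapq


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where a and b differ."""
    # XOR leaves a 1 wherever the two hashes disagree.
    return bin(a ^ b).count("1")


def nearest_by_hamming(query: int, hashes, k: int = 10):
    """Return up to k (distance, index) pairs, closest first (exact scan)."""
    scored = ((hamming_distance(query, h), i) for i, h in enumerate(hashes))
    # nsmallest keeps only the k best candidates while scanning every hash.
    return heapq.nsmallest(k, scored)


if __name__ == "__main__":
    stored = [0b0000, 0b0001, 0b0111, 0b1111]
    print(nearest_by_hamming(0b0000, stored, k=2))
```

Like a sequential scan in SQL, this compares the query against every stored hash, so the cost grows linearly with the number of rows.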