Heikki Linnakangas

148 comments by Heikki Linnakangas

I checked the production logs again for this stack trace (the one with `add_slru_segment`). The last instance was at 2023-12-21T11:57:53.358761Z, on us-east-2, and there has been nothing since. Did the deployment happen before that?...

Checked the logs again. There are two new instances of that error in the logs, on 2024-01-10 and 2024-01-11. Looking at the operations log for that project around the time...

Does scalar quantization only make sense with an IVF index, or can it be used with HNSW too?

If we know the number of rows in the table, we can calculate the memory needed by # of rows * sizeof(vector with N dimensions). The space needed for neighbor...
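To make the "# of rows * sizeof(vector)" estimate concrete, here is a minimal sketch. It assumes pgvector's in-memory vector layout of an 8-byte header (4-byte varlena length, 2-byte dimension count, 2 unused bytes) followed by one 4-byte float per dimension. The function names are hypothetical, and the estimate covers only the vectors themselves, not neighbor lists or other per-element overhead.

```python
# Rough estimate of the memory needed to hold the vectors alone.
# Assumption: each vector is an 8-byte header plus 4 bytes per dimension.

VECTOR_HEADER_BYTES = 8   # varlena length (4) + dim (2) + unused (2)
FLOAT_BYTES = 4           # single-precision float per dimension


def vector_size(dim: int) -> int:
    """Size in bytes of one vector with `dim` dimensions."""
    return VECTOR_HEADER_BYTES + FLOAT_BYTES * dim


def vectors_memory(num_rows: int, dim: int) -> int:
    """Bytes needed for `num_rows` vectors, excluding any graph overhead."""
    return num_rows * vector_size(dim)


if __name__ == "__main__":
    rows, dim = 1_000_000, 1536
    total = vectors_memory(rows, dim)
    print(f"{rows} rows x {dim} dims: {total / 1024**3:.2f} GiB")
```

With these assumptions, one million 1536-dimensional vectors need about 6.15 GB before any neighbor data is counted.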

Hmm, assuming MaxHeapTuplesPerPage gives a very high upper bound. Vectors are large, so in practice you can only fit a few of them on each page, but MaxHeapTuplesPerPage is 291....
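The gap between 291 and "a few" can be illustrated with a small calculation. This is a sketch under stated assumptions: the default 8 kB block size, a 24-byte page header, a 23-byte heap tuple header aligned to 8 bytes, 4-byte line pointers, and a vector stored inline as the only column (TOAST and null bitmaps are ignored). The helper names are hypothetical.

```python
# Why MaxHeapTuplesPerPage is a loose upper bound for rows holding vectors.
# Assumptions: default PostgreSQL build constants, vector stored inline.

BLCKSZ = 8192              # default page size
PAGE_HEADER_BYTES = 24     # SizeOfPageHeaderData
HEAP_TUPLE_HEADER = 23     # SizeofHeapTupleHeader, before alignment
ITEM_ID_BYTES = 4          # one line pointer per tuple


def maxalign(n: int) -> int:
    """Round up to the next multiple of 8, like MAXALIGN on 64-bit builds."""
    return (n + 7) & ~7


def max_heap_tuples_per_page() -> int:
    """Theoretical maximum: tuples consisting of a bare header and no data."""
    usable = BLCKSZ - PAGE_HEADER_BYTES
    return usable // (maxalign(HEAP_TUPLE_HEADER) + ITEM_ID_BYTES)


def vectors_per_page(dim: int) -> int:
    """Tuples per page when each tuple carries one inline vector."""
    vector_bytes = 8 + 4 * dim  # 8-byte vector header + 4 bytes per float
    tuple_bytes = maxalign(maxalign(HEAP_TUPLE_HEADER) + vector_bytes)
    usable = BLCKSZ - PAGE_HEADER_BYTES
    return usable // (tuple_bytes + ITEM_ID_BYTES)


if __name__ == "__main__":
    print("MaxHeapTuplesPerPage:", max_heap_tuples_per_page())
    for dim in (128, 1536):
        print(f"dim={dim}: {vectors_per_page(dim)} tuples per page")
```

Under these assumptions a page holds 14 tuples with 128-dimensional vectors and only 1 with 1536-dimensional vectors, so sizing by 291 tuples per page overestimates by more than an order of magnitude.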

> Yeah, the initial idea sounds simpler. Do you want to take this, or should I? I don't have any great ideas on this at the moment

BTW, did you consider using the `dsa.c` facility for the shared memory area? It's a pretty good fit for what we'd need, it can expand the shared memory area on...

> I tried using DSA in https://github.com/pgvector/pgvector/issues/409#issuecomment-1894993783 ([hnsw-fast-build-dsa branch](https://github.com/pgvector/pgvector/compare/hnsw-fast-build...hnsw-fast-build-dsa)), but DSA_ALLOC_NO_OOM still throws an error when running out of shared memory (at least in Docker).

Huh, yeah, that's a bug....

Does this produce different results than current 'master', if you stop the iteration so that you get the same number of results?

I want to bring up a topic that I also wrote about in a comment on my branch: currently on 'master', the 'ef' parameter controls how many candidates to return. In the...