Jonathan S. Katz

122 comments by Jonathan S. Katz

From some recent testing, I do think there may be some headroom on HNSW scans with higher `hnsw.ef_search` values (e.g. `hnsw.ef_search > 200` and esp. `hnsw.ef_search > 800`). I've observed...

@knizhnik Can you please provide a summary/description of what this patchset does? I didn't see anything in the commit messages, and I want to evaluate it holistically. Thanks!

Are there any tests/benchmarks comparing this updated method to the older method? How does this impact performance/recall?

Left some minor stylistic comments. I'm still thinking through this one and need to test. I primarily stared at HNSW for a bit. I agree with the initial clause that...

> The issue with low selectivity using a vector index is the query will likely return few or no results, which happens now (https://github.com/pgvector/pgvector/issues/263).

@ankane I understand that -- but...

> The odds should be the same with the same ef_search. The index will find ef_search results, and only 5% of them will match.

I'm not following. The selectivity in...

@GPF199541 I'm not following. Today you can do:

```sql
SET ivfflat.probes TO 10;
SELECT * FROM emb_table WHERE ...
```

What this proposes is setting a default for using the...

@xfalcox Yes, I'd be interested in pursuing it -- the RFC is out there to collect use cases before adding to the code :smile: I don't think a PR would...

🚀 I still have it on my TODO list to run some serious benchmarks on these patches; I'll work to get those up and running to see how it performs under different...

@hlinnaka Do you have any rough performance stats based on this change?