Support for Full Text Search Features

Open rupurt opened this issue 3 years ago • 2 comments

Howdy,

Absolutely love the idea in so many ways!!

Do you have any plans to support any of the full text search features from Postgres?

rupurt avatar Oct 26 '22 18:10 rupurt

Maybe, though this might be better suited as an upstream DataFusion question or feature suggestion. Note that because DataFusion is a column-oriented execution engine, it won't really be able to support PostgreSQL's indexing features, so you won't get any speed improvement over using WHERE some_text LIKE '%my_search_string%' (it'll have to inspect all values in a table partition anyway).
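
To illustrate the point about scanning, here is a minimal Python sketch of what a substring filter over one column partition amounts to when there is no index. This is not Seafowl or DataFusion code: `like_contains` and the sample `partition` are hypothetical names invented for the example, and the search string is assumed to contain no LIKE wildcards of its own.

```python
def like_contains(column, needle):
    """Emulate WHERE col LIKE '%needle%' over one column partition.

    Hypothetical helper: with no index, every value has to be inspected.
    """
    inspected = 0
    matches = []
    for row_id, value in enumerate(column):
        inspected += 1
        # NULLs never match a LIKE predicate, so skip them.
        if value is not None and needle in value:
            matches.append(row_id)
    return matches, inspected


# Made-up sample data standing in for one partition of a text column.
partition = ["the quick brown fox", None, "lazy dog", "quick start guide"]
matches, inspected = like_contains(partition, "quick")
print(matches, inspected)
```

However many rows match, the number of values inspected equals the partition size, which is the cost a full text index would normally avoid.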

But there is some neat functionality like the tsvector type and the @@ operator. We don't have plans to build that yet: in terms of custom types and new operators, we're currently starting with https://github.com/splitgraph/seafowl/issues/137 to see how easy it is to get something similar to PG's JSON support working in DataFusion (which might require being able to define custom Arrow types and table-valued functions). After that's built, it should then be possible to build tsvector on top of JSON support and write some UDFs in WASM to get FTS working.
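
For a rough idea of what the tsvector type and the @@ operator provide, here is a toy Python sketch. It is only an approximation under stated assumptions: the functions are hypothetical, there is no stemming or stop-word removal, and the query is a plain AND of terms rather than a real tsquery.

```python
import re


def to_tsvector(text):
    """Toy stand-in for a tsvector: map each lowercased token to its
    1-based positions in the text. No stemming, no stop-word removal."""
    vector = {}
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    for pos, token in enumerate(tokens, start=1):
        vector.setdefault(token, []).append(pos)
    return vector


def matches_all(vector, terms):
    """Toy stand-in for `vector @@ query` where the query ANDs all terms."""
    return all(term.lower() in vector for term in terms)


doc = to_tsvector("A fat cat sat on a mat and ate a fat rat")
print(doc["fat"])
print(matches_all(doc, ["cat", "rat"]))
print(matches_all(doc, ["cat", "dog"]))
```

A UDF-based implementation would need to do something along these lines per row, which is why it gives the matching semantics of FTS without the speed of an index.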

mildbyte avatar Oct 27 '22 09:10 mildbyte

tsvector + pg_trgm are exactly the kinds of things I would like to use for smallish search indexes at the edge. Many of these are slow-changing, so they would be a great fit for caching.

rupurt avatar Oct 27 '22 14:10 rupurt