Search algorithm

Open gillchristian opened this issue 5 years ago • 2 comments

The initial PoC version of the search algorithm was very naive, and it is currently broken as a side effect of the improvements to the extract step.

Being able to search is the whole idea of the project, so this should be the main focus.

There are some insights about the Hoogle search algorithm in this episode of The Haskell Cast.

A few things to consider:

  • Semantics of TS should be taken into consideration (e.g. any should prioritize matching any but should also include everything else).
  • We should optimize the extracted data for search, currently it is just one big array :smirk:
  • Let's start with a naive version first that doesn't consider all the semantics (I don't think the goal should ever be to have all the semantics into consideration for search) but can produce some meaningful results already. And then iterate on small improvements.
  • For this issue to be solved we don't need an implementation but just a design of the first version of the search algorithm.
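To make the first two points concrete, here is a rough sketch of what "optimized for search" and "any semantics" could look like. Everything in it is an assumption for illustration: the `FnEntry` shape, the `buildArityIndex` and `typeScore` names, grouping by parameter count, and the 2/1/0 scores are hypothetical and not part of the current extract output.

```typescript
// Hypothetical shape of one extracted function (not the real extract format).
interface FnEntry {
  name: string;
  params: string[];
  returns: string;
}

// Group entries by parameter count so a query only scans one bucket
// instead of walking the whole array.
function buildArityIndex(entries: FnEntry[]): Map<number, FnEntry[]> {
  const index = new Map<number, FnEntry[]>();
  for (const entry of entries) {
    const bucket = index.get(entry.params.length) ?? [];
    bucket.push(entry);
    index.set(entry.params.length, bucket);
  }
  return index;
}

// Score a single type of the query against a candidate type.
// 2 = exact match (so `any` in the query ranks a candidate `any` first),
// 1 = query is `any`, which still includes everything else,
// 0 = no match.
function typeScore(queryType: string, candidateType: string): number {
  if (queryType === candidateType) return 2;
  if (queryType === "any") return 1;
  return 0;
}

const index = buildArityIndex([
  { name: "length", params: ["string"], returns: "number" },
  { name: "identity", params: ["any"], returns: "any" },
  { name: "concat", params: ["string", "string"], returns: "string" },
]);

// Only single-parameter functions are considered for a one-parameter query.
const singleParam = index.get(1) ?? [];
```

Grouping by arity is only one possible key; the return type or the set of type names used would work as well.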

gillchristian avatar Sep 07 '19 16:09 gillchristian

There are some details about how Hoogle handles this here. What I get from that is that it uses an approach similar to the one used to compute the edit distance between two strings, but here the operations are things like "reorder the parameters" or "change some type".

I think doing something like that as a first approach should be enough. You can even use the same cost for every operation (later these costs can be hand-picked to get better result orderings).
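A minimal sketch of that idea, assuming a simplified `Signature` of type names as plain strings and a uniform cost of 1 per operation. The names `Signature`, `COST`, `paramEditDistance` and `signatureDistance` are made up for this example, and this is not how Hoogle actually implements it:

```typescript
// Hypothetical, simplified signature: types are compared as plain strings.
interface Signature {
  params: string[];
  returns: string;
}

// Same cost for every operation to start with; these can be tuned later.
const COST = { changeType: 1, addOrRemoveParam: 1, reorderParams: 1 };

// Levenshtein distance over parameter lists instead of characters.
function paramEditDistance(a: string[], b: string[]): number {
  // Row for an empty `a`: every parameter of `b` has to be added.
  let prev: number[] = [0, ...b.map((_, j) => (j + 1) * COST.addOrRemoveParam)];
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i * COST.addOrRemoveParam];
    for (let j = 1; j <= b.length; j++) {
      const change = prev[j - 1]! + (a[i - 1] === b[j - 1] ? 0 : COST.changeType);
      const remove = prev[j]! + COST.addOrRemoveParam;
      const add = curr[j - 1]! + COST.addOrRemoveParam;
      curr.push(Math.min(change, remove, add));
    }
    prev = curr;
  }
  return prev[b.length]!;
}

// Distance between a query and a candidate; lower is a better match.
function signatureDistance(query: Signature, candidate: Signature): number {
  const returnCost = query.returns === candidate.returns ? 0 : COST.changeType;
  const inOrder = paramEditDistance(query.params, candidate.params);
  // "Reorder the parameters": compare order-insensitively and pay one reorder cost.
  const reordered =
    paramEditDistance([...query.params].sort(), [...candidate.params].sort()) +
    COST.reorderParams;
  return returnCost + Math.min(inOrder, reordered);
}

const query: Signature = { params: ["string", "number"], returns: "boolean" };
const exact: Signature = { params: ["string", "number"], returns: "boolean" };
const swapped: Signature = { params: ["number", "string"], returns: "boolean" };
const shorter: Signature = { params: ["string"], returns: "string" };

// Rank candidates by distance to the query, closest first.
const ranked = [shorter, swapped, exact]
  .map((candidate) => ({ candidate, distance: signatureDistance(query, candidate) }))
  .sort((x, y) => x.distance - y.distance);
```

With uniform costs, the exact match gets distance 0, the swapped parameters get 1 (one reorder), and the shorter signature gets 2 (one removed parameter plus a changed return type).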

fvictorio avatar Sep 12 '19 23:09 fvictorio

Here are some notes/thoughts on the algorithm. I will keep adding stuff there until I feel it makes sense to start with a PoC.

(cc) @fvictorio

gillchristian avatar Oct 09 '19 19:10 gillchristian