implement hyde
https://twitter.com/darrenangle/status/1652014961806196745
https://wfhbrian.com/revolutionizing-search-how-hypothetical-document-embeddings-hyde-can-save-time-and-increase-productivity/
https://arxiv.org/pdf/2212.10496.pdf
Semantic search on embeddings is hard to get right. Embedding long documents is a challenge. User queries are a challenge, if a user provides an ambiguous query they’ll get ambiguous matches
LLMs just follow what was in the context and hallucinate answers as a result
I've had a lot of success using HyDE for the query problem.
Essentially, let an LLM generate the query, or even use the hallucinated answer as the query.
if chat, fold the response back into the chat step with a prompt along the lines of "thought: I can use this data to answer the user"
works really well
Completed