
Allow flagging bad results to capture data for refinement

Open xdg opened this issue 2 years ago • 1 comments

I shared the site with some friends and one reported that "space opera" came back with Gaston Leroux's The Phantom of the Opera, which is a pretty big miss. Perhaps you could add a thumbs down or other method to capture data about bad recommendations to improve the training.

xdg avatar Aug 03 '23 10:08 xdg

Thanks @xdg, this is good feedback and a great idea! It goes hand in hand with something I've been thinking about to improve model performance.

In general, semantic search and clustering are hard problems to solve, especially combined with query understanding. The problem is that, without anything guiding the search results, they will only match on similarity, which is what your friend saw here, and it's very likely to get weird stuff on the first pass, as is shown here:

image

There are a number of different approaches we can take to tune the model. Each might be more or less successful, but we'll need to do all of them in combination to get better results:

  • The way the model currently works, there is no training that happens - I use a pretrained sentence-transformers model. One approach might be to fine-tune the model with logged results for better query understanding
  • Another might be to tune the hyperparameters of the model itself, i.e. make the cosine similarity threshold higher or change the edges
  • A third might be to simply hand-filter some bad results
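To make the second option concrete, here is a minimal sketch of a similarity cutoff, not the code viberary actually runs. The function names, the toy two-dimensional vectors, and the default threshold of 0.5 are all assumptions for illustration; real embeddings would come from the sentence-transformers model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction, so treat it as "no match".
        return 0.0
    return dot / (norm_a * norm_b)


def filter_by_threshold(query_vec, candidates, threshold=0.5):
    """Keep only candidates whose similarity to the query meets the threshold.

    `candidates` is a list of (title, vector) pairs. The result is a list of
    (title, score) pairs sorted from most to least similar.
    """
    scored = [(title, cosine_similarity(query_vec, vec)) for title, vec in candidates]
    kept = [(title, score) for title, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Raising `threshold` trades recall for precision: fewer results come back, but the ones that do are closer to the query. Logged feedback would be one way to pick a value instead of guessing.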

In all of these cases, logging like you mentioned is important to see how many misses are actually happening in production. Stay tuned for the development of thumbs up/thumbs down-style feedback. I'll need to:

  • Create the UX elements
  • Wire up an inspected click to logging
  • Start systematically collecting logs and analyzing them
  • Make sure I have enough space to collect feedback logs
  • Implement a system to look at these logs quickly (Kibana would be nice but it's a lot of overhead for this project atm)
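As a rough idea of what the logging and analysis steps could look like without something as heavy as Kibana, here is a sketch that appends one JSON object per line and then counts thumbs-down votes per query. The record fields (`ts`, `query`, `result`, `thumbs_up`) and both function names are hypothetical, not an existing viberary schema.

```python
import json
import time
from collections import Counter


def log_feedback(stream, query, result_title, thumbs_up, timestamp=None):
    """Append one feedback event to `stream` as a single JSON line."""
    record = {
        "ts": time.time() if timestamp is None else timestamp,
        "query": query,
        "result": result_title,
        "thumbs_up": bool(thumbs_up),
    }
    stream.write(json.dumps(record) + "\n")
    return record


def thumbs_down_by_query(lines):
    """Count thumbs-down events per query, most-flagged queries first."""
    misses = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        record = json.loads(line)
        if not record["thumbs_up"]:
            misses[record["query"]] += 1
    return misses.most_common()
```

`stream` can be any writable text object, for example a file opened in append mode. Reading the same file back line by line and passing it to `thumbs_down_by_query` gives a quick list of the queries that miss most often.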

Keeping this open for now and letting you know I'm thinking about it, just might take a while to implement :)

veekaybee avatar Aug 06 '23 11:08 veekaybee