langsmith-sdk Ability to add to split based on score

Open davidgilbertson opened this issue 1 year ago • 0 comments

I would like to be able to select all the examples in a dataset that get a particular score in a particular experiment, then add them to a split.

Here's the workflow I want to follow when evaluating my RAG app's retrieval step:

1. Generate/upload a few hundred examples.
2. Run an experiment with the fastest possible setup (e.g. only retrieving one document/chunk per example), scoring the results.
3. Mark all the examples that score well as 'easy', and the rest as 'hard'.
4. In future experiments, focus only on the 'hard' split (to save tokens and time, and to get more signal in my scores).

On the experiment page, I can get as far as singling out all the "score = 1" examples, but I can't then select them and put them in a split.

My use case here is specifically about adding to a split, but other bulk operations on the filtered examples would be useful too (deleting, moving to another dataset, etc.).

My workaround is to capture/categorize the example IDs in code during evaluation and then use client.update_dataset_splits(...) to make the splits.
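
A minimal sketch of that workaround, assuming the evaluation code has already collected `(example_id, score)` pairs. The helper names `partition_by_score` and `apply_splits`, the threshold of 1.0, and the split names are all illustrative and not part of the SDK. The only SDK call is `client.update_dataset_splits(...)`, and the keyword arguments shown (`dataset_name`, `split_name`, `example_ids`) are what I believe the method accepts, so check them against the installed version.

```python
def partition_by_score(scored, threshold=1.0):
    """Group example IDs into 'easy' (score >= threshold) and 'hard'.

    `scored` is an iterable of (example_id, score) pairs captured
    during evaluation; this shape is an assumption of the sketch.
    """
    splits = {"easy": [], "hard": []}
    for example_id, score in scored:
        # A missing score counts as a failure, so the example stays 'hard'.
        is_easy = score is not None and score >= threshold
        splits["easy" if is_easy else "hard"].append(example_id)
    return splits


def apply_splits(client, dataset_name, splits):
    """Write each non-empty group to the dataset as a named split."""
    for split_name, example_ids in splits.items():
        if not example_ids:
            continue
        # Assumed keyword arguments; verify against your langsmith version.
        client.update_dataset_splits(
            dataset_name=dataset_name,
            split_name=split_name,
            example_ids=example_ids,
        )
```

With a real client this would be called as `apply_splits(Client(), "my-dataset", partition_by_score(scored))`, where `Client` comes from `langsmith` and `"my-dataset"` is a placeholder name. It still requires running the scoring in code, which is why a bulk action on the experiment page would be preferable.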

Oct 13 '24 20:10 davidgilbertson