langsmith-sdk
Ability to add to split based on score
Feature request
I would like to be able to select all the examples in a dataset that get a particular score in a particular experiment, then add them to a split.
Motivation
Here's the workflow I want to follow when evaluating my RAG app's retrieval step:
- Generate/upload a few hundred examples
- Run an experiment with the fastest possible setup (e.g. only retrieving one document/chunk per example), scoring the results
- Mark all the examples that score well as 'easy', and the rest as 'hard'
- In future experiments, just focus on the 'hard' split (to save tokens and time, and get more signal in my scores)
On the experiment page, I can get as far as singling out all the "score = 1" examples, but I can't then select them and put them in a split.
My use case here is specifically about adding to a split, but any bulk operation might be useful (deleting, moving to another dataset, etc.).
My workaround is to capture/categorize the example IDs in code during evaluation and then use client.update_dataset_splits(...) to make the splits.
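For anyone wanting the same thing in the meantime, here is a rough sketch of that workaround. It assumes you have already collected `(example_id, score)` pairs while evaluating; `partition_by_score`, `apply_splits`, and the threshold of 1.0 are hypothetical names and values for illustration, not part of the SDK. The only SDK call used is `client.update_dataset_splits(...)`.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def partition_by_score(
    scored: Iterable[Tuple[str, Optional[float]]],
    threshold: float = 1.0,  # assumption: "scores well" means score >= threshold
) -> Dict[str, List[str]]:
    """Group example IDs into 'easy' and 'hard' by their score."""
    splits: Dict[str, List[str]] = {"easy": [], "hard": []}
    for example_id, score in scored:
        # A missing score counts as 'hard' so the example gets re-evaluated later.
        name = "easy" if score is not None and score >= threshold else "hard"
        splits[name].append(example_id)
    return splits


def apply_splits(client, dataset_id, splits: Dict[str, List[str]]) -> None:
    """Write each non-empty group to the dataset as a split.

    `client` is expected to be a `langsmith.Client` (or anything exposing
    `update_dataset_splits` with the same keyword arguments).
    """
    for split_name, example_ids in splits.items():
        if example_ids:
            client.update_dataset_splits(
                dataset_id=dataset_id,
                split_name=split_name,
                example_ids=example_ids,
            )
```

With a real client this would be called as `apply_splits(Client(), dataset_id, partition_by_score(scored))`, where `scored` comes from however you capture results during the experiment. It works, but it means the split logic lives in code rather than in the UI, which is what this request is about.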