
[Feature Request]: Postprocessing to find noisy / incorrect autolabels


Is your feature request related to a problem? Please describe.

When autolabel gives incorrect labels, it would be great to have more automated tools to detect and correct them.

(This is a simplified version of proposal #449; more details are there.)

NOTE: I tried refuel's in-house compute_confidence once or twice but didn't get much from it; maybe I should try again.

NOTE: I'm using gpt4, but maybe I should try text-davinci-003 to get logprobs? It's going to be sunset in January 2024, but let me know if I should give it a shot.

Describe the solution you'd like

When training a model on the autolabels (e.g. using setfit, either on your own machines or as a refuel service), there are a handful of ways to find labels that are possibly wrong:

  1. Split the data into k-folds. (There can be some subtleties if you want to avoid data leakage, e.g. don't use the same product in validation and train, but this is a detail that could be postponed until later.)
  2. Train on k-2 folds, do early stopping or simply validation on another fold, do model selection of hyperparameters to minimize validation loss, and finally evaluate on the test fold.
  3. Use the same hyperparameters (or run another hyperparameter search) on the other fold selections, so that each fold is used for validation exactly once and for test exactly once. (A sketch of this rotation follows the list.)
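Here is a minimal sketch of that rotation, using scikit-learn's StratifiedKFold and a plain TF-IDF + logistic regression classifier as a stand-in for setfit. The function name, the tiny `C` grid, and the `texts` / `autolabels` inputs are placeholders of mine, not anything from autolabel:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
import numpy as np

def out_of_fold_predictions(texts, autolabels, k=5, seed=0):
    """Rotate folds so each fold is validation exactly once and test exactly once."""
    X = TfidfVectorizer().fit_transform(texts)
    y = np.asarray(autolabels)
    preds = np.empty_like(y)
    # NOTE: StratifiedKFold needs at least k examples per class.
    folds = [test for _, test in
             StratifiedKFold(n_splits=k, shuffle=True, random_state=seed).split(X, y)]
    for i in range(k):
        test_idx, val_idx = folds[i], folds[(i + 1) % k]
        train_idx = np.setdiff1d(np.arange(len(y)),
                                 np.concatenate([test_idx, val_idx]))
        # Model selection on the validation fold (one hyperparameter knob for brevity).
        best, best_acc = None, -1.0
        for C in (0.1, 1.0, 10.0):
            model = LogisticRegression(C=C, max_iter=1000).fit(X[train_idx], y[train_idx])
            acc = model.score(X[val_idx], y[val_idx])
            if acc > best_acc:
                best, best_acc = model, acc
        # Final evaluation on the held-out test fold.
        preds[test_idx] = best.predict(X[test_idx])
    return preds
```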

Find the test examples that are most widely misclassified. You can even find validation examples that are widely misclassified, or training examples that are near the margin or widely misclassified.

These are your noisy examples.
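To make "widely misclassified" concrete, one cheap proxy (my assumption, building on the sketch above) is to repeat the whole cross-validation with different seeds and count how often each example's out-of-fold prediction disagrees with its autolabel:

```python
from collections import Counter

def noisy_candidates(texts, autolabels, runs=5, k=5):
    """Rank examples by how often their out-of-fold prediction disagrees with the autolabel."""
    disagreements = Counter()
    for seed in range(runs):
        preds = out_of_fold_predictions(texts, autolabels, k=k, seed=seed)
        for i, (pred, label) in enumerate(zip(preds, autolabels)):
            if pred != label:
                disagreements[i] += 1
    # Examples misclassified in the most runs come first: the noisy-label candidates.
    return sorted(disagreements.items(), key=lambda kv: -kv[1])
```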

Have an automated process for going back and forth with GPT4, asking it to re-solve the noisy examples using the original prompt. Explain that you believe the answer is wrong, and ask it to discuss further. When it converges on a label, either digging in its heels or admitting it was wrong, you've now possibly cleaned up the label.
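Something like this loop is what I have in mind (a hedged sketch against the OpenAI chat API; `original_prompt`, the pushback wording, and the naive convergence check on raw replies are all placeholders of mine — in practice you'd parse the label out of each reply):

```python
from openai import OpenAI

client = OpenAI()

def relabel(original_prompt, text, suspect_label, max_rounds=3):
    """Re-ask GPT4 about a suspect example, push back, and keep the label it settles on."""
    messages = [{"role": "user", "content": f"{original_prompt}\n\nInput: {text}"}]
    last_answer = None
    for _ in range(max_rounds):
        reply = client.chat.completions.create(model="gpt-4", messages=messages)
        answer = reply.choices[0].message.content.strip()
        if answer == last_answer:  # converged: it dug in its heels or stuck with its correction
            break
        last_answer = answer
        messages.append({"role": "assistant", "content": answer})
        # Push back with the suspect autolabel and ask it to discuss further.
        messages.append({"role": "user", "content": (
            f"My dataset currently labels this as {suspect_label!r}, but I suspect "
            "that's wrong. Please discuss further and state your final label."
        )})
    return last_answer
```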

turian commented on Jul 12, 2023