stanford_alpaca

Alpaca dataset token probabilities

Open chiayewken opened this issue 1 year ago • 0 comments

Great work, this is a very exciting direction! In addition to the raw text data in alpaca_data.json, would you be able to release the token probabilities generated by GPT-3 for each sample? These could help in detecting noisy samples, or in selecting the training samples that the model generated with higher confidence.
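To illustrate the kind of filtering this would enable, here is a minimal sketch. It assumes each sample carried a hypothetical `token_logprobs` field (a list of per-token log-probabilities for the output); that field does not exist in alpaca_data.json today, and the helper names and the threshold are made up for illustration.

```python
from typing import Dict, List


def mean_logprob(token_logprobs: List[float]) -> float:
    """Length-normalized log-probability, so long outputs are not penalized."""
    if not token_logprobs:
        return float("-inf")  # treat samples without scores as lowest confidence
    return sum(token_logprobs) / len(token_logprobs)


def filter_confident(samples: List[Dict], threshold: float) -> List[Dict]:
    """Keep samples whose mean token log-probability is at or above the threshold."""
    return [s for s in samples if mean_logprob(s["token_logprobs"]) >= threshold]


# Toy records; "token_logprobs" is the hypothetical field requested above.
samples = [
    {"instruction": "a", "output": "x", "token_logprobs": [-0.1, -0.2, -0.3]},
    {"instruction": "b", "output": "y", "token_logprobs": [-2.0, -3.0, -4.0]},
]

kept = filter_confident(samples, threshold=-1.0)
```

The mean log-probability is only one possible confidence score; the minimum token log-probability or perplexity might work better for spotting noisy samples.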

chiayewken, Mar 21 '23 17:03