AlpacaDataCleaned

Alpaca dataset from Stanford, cleaned and curated

Results: 11 AlpacaDataCleaned issues

Does it support cleaning Chinese SFT data?

Where is the 9k cleaned Alpaca data from the AlpaGasus paper?

Thanks for the great work! I'm trying to reproduce the results you report. I downloaded the model weights from https://huggingface.co/yahma/alpaca-7b-lora and evaluated them with the lm-evaluation-harness framework. But...

How did PIQA get 78 acc? The README in the eval folder says the metric is not trustworthy.

Hi, thanks for the great work! Could I ask for the command used to run the evaluation with https://github.com/EleutherAI/lm-evaluation-harness/? A command that runs on any dataset is fine, I...

I want to translate the training data into another language with Google translate, but code snippets should not be translated, so I have to replace code snippets with placeholders before...
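The placeholder idea described above can be sketched roughly as follows. This is a minimal illustration, not code from the issue or the repository: the `__CODE_n__` placeholder format and the helper names `mask_code` and `unmask_code` are hypothetical, and it assumes code appears either in triple-backtick fences or in inline backtick spans. The translation call itself is left out.

```python
import re

# Assumption: code appears as fenced blocks (```...```) or inline `code` spans.
# The fenced alternative is listed first so a fence is never read as inline code.
CODE_PATTERN = re.compile(r"```.*?```|`[^`\n]+`", re.DOTALL)


def mask_code(text):
    """Replace each code snippet with a numbered placeholder.

    Returns the masked text and the list of original snippets.
    """
    snippets = []

    def _swap(match):
        snippets.append(match.group(0))
        # Hypothetical placeholder format; pick one the translator leaves intact.
        return f"__CODE_{len(snippets) - 1}__"

    return CODE_PATTERN.sub(_swap, text), snippets


def unmask_code(text, snippets):
    """Put the original snippets back in place of their placeholders."""
    for i, snippet in enumerate(snippets):
        text = text.replace(f"__CODE_{i}__", snippet)
    return text


sample = "Use `print(x)` to show x.\n```python\nprint(1)\n```\nDone."
masked, saved = mask_code(sample)
# Translate `masked` here, then restore the code afterwards.
restored = unmask_code(masked, saved)
```

A translation service may alter or drop placeholders, so it is worth checking that every placeholder survives before restoring.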

Planning on adding an evaluation metric that can be used to benchmark trained Alpaca models. Going to focus on these two datasets for evaluation: 1. [SQuAD Dataset](https://huggingface.co/datasets/squad) - F1 Score...
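For reference, a token-level F1 score in the style commonly used for SQuAD can be sketched as below. This is an illustrative sketch, not the evaluation code planned in the issue: the function names `normalize_answer` and `token_f1` are chosen here, and the normalization (lowercasing, stripping punctuation and English articles) is assumed to follow the usual SQuAD v1.1 convention.

```python
import re
import string
from collections import Counter


def normalize_answer(s):
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in string.punctuation)
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())


def token_f1(prediction, ground_truth):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(ground_truth).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


score = token_f1("the cat sat", "cat sat down")  # precision 1.0, recall 2/3
```

When a question has several reference answers, the usual practice is to take the maximum F1 over the references and then average across questions.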

enhancement

Would that be relevant within the scope of this project? Adding a few kinds of task examples could improve its general capabilities, for instance: longer responses, GPT-4-generated responses...

1. Collect money (~500 USD should be enough, I believe?)
2. Open an account with OpenAI and connect it to the bank account holding the money above
3. Get API key...

How are you going about cleaning this? Manually or with GPT-4?