
Add PAQ dataset for closed-book QA

Open sbmaruf opened this issue 2 years ago • 1 comments

Add the PAQ dataset as a single-turn dialogue for closed-book QA.

https://github.com/facebookresearch/PAQ

sbmaruf avatar Jan 24 '23 02:01 sbmaruf

Adding the PAQ dataset as single-turn dialogues for closed-book QA is a good suggestion. The PAQ dataset, developed by Facebook Research, is a useful resource for training and evaluating models on closed-book question answering tasks.

We can integrate the PAQ dataset into our pipeline by first downloading it and preprocessing it to match the format of our existing datasets. This may include splitting the data into training and test sets, tokenizing the text, and creating a vocabulary.
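
A minimal sketch of the conversion step, assuming the PAQ QA pairs come as JSON Lines with a `question` string and an `answer` list per record (check the PAQ repository for the exact layout). The function name `paq_to_single_turn` and the `prompt`/`response` output keys are hypothetical, chosen only for illustration, and are not the project's actual dataset format.

```python
import json
from typing import Iterable, Iterator


def paq_to_single_turn(lines: Iterable[str]) -> Iterator[dict]:
    """Turn JSONL QA records into single-turn dialogue dicts.

    Assumed input record: {"question": "...", "answer": ["...", ...]}
    Hypothetical output:  {"prompt": "...", "response": "..."}
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        question = (record.get("question") or "").strip()
        answers = record.get("answer") or []
        if isinstance(answers, str):
            answers = [answers]  # tolerate a single string answer
        if not question or not answers:
            continue  # drop records missing a question or an answer
        # Use the first listed answer as the assistant reply.
        yield {"prompt": question, "response": answers[0]}
```

For a local file, this could be used as `with open(path, encoding="utf-8") as f: dialogues = list(paq_to_single_turn(f))`, where `path` points at the downloaded JSONL file.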

Once the data is preprocessed, we can add it to our codebase and update the config file to include the new dataset. This will allow our models to train on and evaluate against the PAQ data.

We could also add an evaluation script specifically for closed-book QA tasks. This would make it easy to measure the performance of our models on the PAQ dataset and compare it with results on other datasets.
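
One possible shape for such an evaluation script is an exact-match metric with light answer normalization (lowercasing, stripping punctuation and English articles), a common choice for QA evaluation. This is only a sketch under that assumption; the function names are hypothetical and nothing here is taken from the existing codebase.

```python
import re
import string
from typing import List, Sequence


def normalize_answer(text: str) -> str:
    """Lowercase, drop ASCII punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: Sequence[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def exact_match_score(predictions: List[str], references: List[List[str]]) -> float:
    """Fraction of predictions that exactly match one of their references."""
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return hits / len(predictions)
```

Here `predictions` holds one model answer per question and `references` holds the list of accepted answers for the same question, in the same order.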

Overall, integrating the PAQ dataset into our pipeline will provide an additional valuable resource for training and evaluating our models on closed-book question answering tasks.

Let me know if you have any questions about the integration process or any other concerns.

hemangjoshi37a avatar Jan 26 '23 19:01 hemangjoshi37a

Closing old data issue.

andreaskoepf avatar Jun 14 '23 09:06 andreaskoepf