Open-Assistant
Open-Assistant copied to clipboard
Open Assistant as an AI Tutor
Overview
I researched promising datasets that we can incorporate into Open Assistant that would enable it to become an AI tutor for students. I have created a Google Document with relevant links, summaries, and dataset licensing information. Here is the link.
Instructions
- Create a Jupyter notebook in notebooks/data-augmentation which will download the data (you can upload it to Hugging Face or similar if it isn't already easily available).
- In the notebook, convert the data to a simple Q-A format which we need for training, e.g. JSONL where each line has prompt and response, and write it locally.
- Make a PR with the notebook (but don't include the downloaded data itself).
Improvement
For the Chain-of-Thought (CoT) datasets, Huu Nguyen proposed that we generate a question for each step instead of step by reasoning. The question will come from the assistant and the answer for the step will come from the human. The assistant helps the human to solve a problem step by step.
Would you like to try the COT->tutor conversion? @akhil-datla
I would like to collaborate with another contributor and learn from them. I am new to this process, but I am eager to learn!