Open-Assistant
Where will OpenAssistant data live and will it be public?
This is a question I have been wondering about, so I may as well ask it here.
For example, will we have some process for publishing the collected training data?
We collect multiple datasets for the project; see datasets.md for examples. For the human feedback data that we collect, we definitely plan to release it after cleaning and PII removal. It is currently too early to say exactly where it will be available for download, but it will be easy to find from the LAION website.