[DOC]: Guide for preparing your own dataset for ChatGPT fine-tuning
📚 The doc issue
I'm looking into trying out this ChatGPT training.
However, it seems to use only awesome-chatgpt-prompts as the example dataset.
Maybe we should provide an example of how to prepare a custom dataset?
Exactly. Adding to what @luvwinnie mentioned: how can we create a custom dataset from domain-specific unstructured data?
@novoforce @luvwinnie Thanks for your feedback. However, this question is outside the scope of general issues. Here is an example of how to build a dataset: https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat#supervised-datasets-collection
In this project, we focus on providing a set of acceleration solutions for ChatGPT-like models with the help of Colossal-AI. Users can adopt our training solutions to support various downstream tasks in their own cases. If you need customized in-depth cooperation or support, please send the details to [email protected]
This issue was closed due to inactivity. Thanks.
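For anyone landing here later, this is a minimal sketch of turning domain-specific question/answer pairs into an instruction-style JSON file. It is not the project's official method. It assumes an Alpaca-style record schema with `instruction`, `input`, and `output` fields; the helper names, the sample pairs, and the output file name `custom_sft_dataset.json` are all hypothetical. Please check the linked README for the exact format the training scripts expect.

```python
import json
from pathlib import Path


def build_records(qa_pairs):
    """Turn (question, answer) pairs into instruction-style records.

    Assumed schema (not confirmed by this thread): instruction / input / output.
    """
    records = []
    for question, answer in qa_pairs:
        question, answer = question.strip(), answer.strip()
        if not question or not answer:
            continue  # skip incomplete pairs
        records.append({"instruction": question, "input": "", "output": answer})
    return records


def save_dataset(records, path):
    """Write the records as one JSON list, keeping non-ASCII text readable."""
    text = json.dumps(records, ensure_ascii=False, indent=2)
    Path(path).write_text(text, encoding="utf-8")


# Hypothetical pairs extracted by hand (or by a script) from domain documents.
qa_pairs = [
    ("What is the warranty period for model X?", "Model X has a two-year warranty."),
    ("  How do I reset the device? ", "Hold the power button for ten seconds."),
    ("Question with no answer yet", ""),
]

records = build_records(qa_pairs)
save_dataset(records, "custom_sft_dataset.json")
```

The hard part with unstructured data is producing the question/answer pairs in the first place; once you have them, the conversion itself is just a small script like this.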