[DOC]: Guide for preparing your own dataset for ChatGPT fine-tuning

Open luvwinnie opened this issue 2 years ago • 1 comments

📚 The doc issue

I'm looking into trying out this ChatGPT training.

However, it seems to use awesome-chatgpt-prompts as its example dataset.

Could we provide an example of how to prepare a custom dataset?

luvwinnie avatar Feb 24 '23 04:02 luvwinnie

Exactly. Adding to what @luvwinnie mentioned: how can we create a custom dataset from domain-specific unstructured data?

novoforce avatar Mar 01 '23 07:03 novoforce

@novoforce @luvwinnie Thanks for your feedback. However, your issue is outside the scope of general issues. Here is an example of how to build your own dataset: https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat#supervised-datasets-collection

In this project, we focus on providing a set of acceleration solutions for ChatGPT-like models with the help of Colossal-AI. Users can adopt our training solutions to support various downstream tasks in their own use cases.

If you need customized in-depth cooperation or support, please send the details to [email protected].

This issue was closed due to inactivity. Thanks.

binmakeswell avatar Apr 20 '23 10:04 binmakeswell