TigerBot
How can data like tigerbot-wiki-plugin be converted into an instruction-tuning-like format?
Hi, thanks for the great work. While browsing the website I found the dataset
tigerbot-wiki-plugin. Its keys are ["content", "wiki_id", "url"], and I believe the only part worth learning from is "content".
I assume the instruction-tuning format is produced by the way you clean and collect the data. How do you convert this data into an "instruction-tuning-like format"? Thanks.
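
To make the question concrete, here is a minimal sketch of the kind of conversion I have in mind. It assumes an Alpaca-style target schema with "instruction", "input", and "output" keys. The prompt text, the `to_instruction_record` helper, and the split ratio are all hypothetical and are not taken from the TigerBot project.

```python
# Hypothetical prompt; an assumption for illustration only.
INSTRUCTION = "Continue the following wiki passage."


def to_instruction_record(record, split_ratio=0.3):
    """Turn a {"content", "wiki_id", "url"} record into an
    {"instruction", "input", "output"} record (assumed schema).

    The first part of `content` becomes the input and the remainder
    becomes the expected output. `wiki_id` and `url` are dropped.
    """
    content = record["content"].strip()
    # Keep at least one character in the input part.
    cut = max(1, int(len(content) * split_ratio))
    return {
        "instruction": INSTRUCTION,
        "input": content[:cut],
        "output": content[cut:],
    }
```

Is the actual pipeline something like this, or are the instructions generated in another way?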