DeepSeek-Coder-V2
How to build a fine-tuning dataset for code completion?
I want to fine-tune the model on my company's in-house component source code so that it can do code completion. How should I build the dataset? For instruction-style code generation, should each sample take this form? { "input": "# write a quick sort algorithm", "output": "your quick sort algorithm code" } And how should I build a dataset for code insertion (FIM, fill-in-the-middle)?
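To make the FIM part of the question concrete, here is a minimal sketch of how insertion samples could be cut from source files. It assumes a prefix-suffix-middle layout and reuses the "input"/"output" keys from the example above. The sentinel strings `<FIM_PREFIX>`, `<FIM_SUFFIX>` and `<FIM_MIDDLE>` are placeholders, not the model's real tokens, and the helper names `make_fim_sample` and `to_jsonl_line` are hypothetical. The actual FIM special tokens and their order should be taken from the tokenizer and documentation of the model being fine-tuned.

```python
import json
import random

# Placeholder sentinels (assumption): replace with the FIM special tokens
# defined by the tokenizer of the model you fine-tune.
FIM_PREFIX = "<FIM_PREFIX>"
FIM_SUFFIX = "<FIM_SUFFIX>"
FIM_MIDDLE = "<FIM_MIDDLE>"


def make_fim_sample(code, rng):
    """Cut `code` at two random character offsets into prefix/middle/suffix."""
    if not code:
        raise ValueError("code must not be empty")
    # Two distinct cut points, so the middle span is never empty.
    start, end = sorted(rng.sample(range(len(code) + 1), 2))
    prefix, middle, suffix = code[:start], code[start:end], code[end:]
    # The model sees prefix and suffix and learns to generate the middle.
    prompt = f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"
    return {"input": prompt, "output": middle}


def to_jsonl_line(sample):
    """Serialize one sample as a single JSON Lines record."""
    return json.dumps(sample, ensure_ascii=False)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the cuts are reproducible
    source = "def add(a, b):\n    return a + b\n"
    print(to_jsonl_line(make_fim_sample(source, rng)))
```

Cutting at random character offsets is only the simplest choice. Cutting at line or syntax-node boundaries may match real editor completions more closely, and mixing FIM samples with plain left-to-right samples is a common option.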