DeepSeek-Coder-V2

How to build a fine-tuning dataset for code completion?

Open FWLamb opened this issue 1 year ago • 0 comments

I want to fine-tune the model for code completion on my company's in-house component source code. How should I build the dataset? For instruction-style code generation, should each sample be built in this form? { "input": "#write a quick sort algorithm", "output": "your quick sort algorithm code" } And how do I build a dataset for code insertion (FIM, fill-in-the-middle)?
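To make the FIM part of the question concrete, here is a minimal sketch of one way such samples could be generated from raw source files. It reuses the input/output record shape from the example above. The sentinel strings `FIM_BEGIN`, `FIM_HOLE`, and `FIM_END` are hypothetical placeholders, not the model's real tokens; they would need to be replaced with the FIM special tokens defined by the target model's tokenizer. The function name `make_fim_example`, the character-level random split, and the prefix-hole-suffix ordering are likewise assumptions for illustration.

```python
import json
import random

# Hypothetical placeholder sentinels (assumption): replace them with the
# FIM special tokens that the target model's tokenizer actually defines.
FIM_BEGIN = "<FIM_BEGIN>"
FIM_HOLE = "<FIM_HOLE>"
FIM_END = "<FIM_END>"


def make_fim_example(source: str, rng: random.Random, min_middle: int = 1) -> dict:
    """Cut `source` at two random points and build one FIM training record.

    The text between the cut points becomes the target ("output"); the text
    before and after it becomes the prompt ("input").
    """
    if len(source) < min_middle:
        raise ValueError("source is shorter than the requested middle span")
    # Pick the start so that at least `min_middle` characters remain after it.
    start = rng.randint(0, len(source) - min_middle)
    end = rng.randint(start + min_middle, len(source))
    prefix, middle, suffix = source[:start], source[start:end], source[end:]
    prompt = f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"
    return {"input": prompt, "output": middle}


def to_jsonl(sources: list[str], seed: int = 0) -> str:
    """Build one JSON record per source string, one record per line."""
    rng = random.Random(seed)
    records = (make_fim_example(src, rng) for src in sources)
    return "\n".join(json.dumps(rec, ensure_ascii=False) for rec in records)


if __name__ == "__main__":
    snippet = "def add(a, b):\n    return a + b\n"
    print(to_jsonl([snippet]))
```

Splitting at character offsets is the simplest choice; cutting on line or syntax-node boundaries may match real editor completions more closely, at the cost of needing a parser for the component code.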

FWLamb, Jul 23 '24 14:07