Daya Guo
Measured by the len(code) function.
Hello, there are currently no plans to open-source the pre-training code.
Thank you for the reminder. Initially, when training the coder, in order to replicate Starcoder, we used their filtering method, which resulted in the loss of these important languages. We...
The deepseek-coder-v2 236B model was not intended for code completion, so FIM (Fill-in-the-Middle) was not used.
V2 switched from topological sorting concatenation to random concatenation mainly because random concatenation is more language-friendly.
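To make the two concatenation strategies concrete, here is a minimal sketch and not the actual DeepSeek-Coder data pipeline. It assumes a repository is given as a dict of file contents plus a hand-written dependency map; the names `concat_topological`, `concat_random`, `files`, and `deps` are hypothetical and chosen only for illustration.

```python
import random
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def concat_topological(files, deps):
    """Order files so each one appears after the files it depends on.

    files: {filename: source text}
    deps:  {filename: set of filenames it depends on} (assumed input format)
    Raises graphlib.CycleError on circular dependencies; this sketch does not
    try to break cycles.
    """
    # Keep only dependencies that are actually present in `files`.
    graph = {
        name: {d for d in deps.get(name, ()) if d in files}
        for name in files
    }
    # static_order() yields each node after all of its predecessors.
    order = list(TopologicalSorter(graph).static_order())
    return order, "\n".join(files[name] for name in order)


def concat_random(files, seed=None):
    """Order files by a random shuffle, ignoring dependencies."""
    rng = random.Random(seed)
    order = list(files)
    rng.shuffle(order)
    return order, "\n".join(files[name] for name in order)


# Toy repository (illustrative data only).
files = {
    "main.py": "from utils import helper\nhelper()",
    "utils.py": "def helper():\n    pass",
    "config.py": "DEBUG = True",
}
deps = {"main.py": {"utils.py"}}

topo_order, topo_text = concat_topological(files, deps)
rand_order, rand_text = concat_random(files, seed=0)
```

With topological ordering, `utils.py` always precedes `main.py`. With random ordering, that holds only in some shuffles, so a model sees a mix of dependency-first and usage-first samples.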
1. Yes, because we believe that randomly concatenating files is more reflective of real-world scenarios. Programmers may not always write dependent functions or classes first; they might complete the logic...
Dependencies between files are important. When files are concatenated randomly, some orderings happen to satisfy these dependencies and can help improve performance on very long code. Other cases may contribute to...
> So removing the file topological graph is not beneficial at all? Is this confirmed by experiments? The key point of my question is that shuffled files have more hallucination cases than dependent...
> > > So removing the file topological graph is not beneficial at all? Is this confirmed by experiments? The key point of my question is that shuffled files have more hallucination cases...