Daya Guo
> > Regarding repo-level concatenation, I have a related question. > > In a batch, one sample may contain multi docs from different files, such as repo_a/file_a and repo_a/file_b. When...
In fact, a special token is required. However, we incorporate comments such as `#utils.py` and `#model.py` before each file to indicate to the model that the code completion is at the...
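The file-name comment convention described above can be sketched as follows. This is only an illustration under stated assumptions: the helper name `build_repo_prompt`, its parameters, and the blank-line separator between files are hypothetical and are not taken from the DeepSeek-Coder code base.

```python
from typing import List, Tuple


def build_repo_prompt(
    files: List[Tuple[str, str]],
    target_name: str,
    target_prefix: str = "",
) -> str:
    """Concatenate repository files into a single completion prompt.

    Each file is preceded by a `#<filename>` comment line, mirroring the
    `#utils.py` / `#model.py` markers mentioned in the comment above.
    The function name and the separator choice are assumptions.
    """
    parts = []
    for name, source in files:
        # Marker comment first, then the file body with trailing whitespace trimmed.
        parts.append(f"#{name}\n{source.rstrip()}\n")
    # The file to be completed comes last; the model continues from its prefix.
    parts.append(f"#{target_name}\n{target_prefix}")
    return "\n".join(parts)


if __name__ == "__main__":
    prompt = build_repo_prompt(
        [("utils.py", "def add(a, b):\n    return a + b\n")],
        target_name="model.py",
        target_prefix="from utils import add\n",
    )
    print(prompt)
```

The resulting string would then be passed to the tokenizer as an ordinary prompt; whether a blank line between files matches the training data format is not confirmed by the comment.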
> > > > Regarding repo-level concatenation, I have a related question. > > > > In a batch, one sample may contain multi docs from different files, such as...
The new transformers release changed its interface, so please make sure your local deepseek-coder repo is up to date. https://github.com/deepseek-ai/DeepSeek-Coder/commit/6590983bf05aa05fe61a9360f5d50360ad84980f
You can try this example, and the instruct model should be used for code completion. https://github.com/deepseek-ai/DeepSeek-Coder#1-code-completion
The instruct model doesn't support FIM because we didn't use the FIM task during fine-tuning.
> > Instruct model can't support FIM because we don't use FIM task during fine-tuning. > > why not use fim format during finetuning? if not use, how to get...
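They are consistent, so you can continue pre-training with your own data.
The community has already open-sourced a quantized version of the model: https://huggingface.co/TheBloke/deepseek-coder-33B-instruct-GPTQ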