starcoder2

format for inference in code completion

Open ARIELDENG opened this issue 1 year ago • 6 comments

StarCoder's format for inference in code completion is PSM: <fim_prefix> + prefix + <fim_suffix> + suffix + <fim_middle>

What is the format for StarCoder2?

From the paper, we could only see this: [image]

ARIELDENG avatar Mar 04 '24 10:03 ARIELDENG

I have the same question. I want the model to complete code from precise requirements, but the generation does not stop. Can you give me the correct prompt format?

xcxhy avatar Mar 04 '24 16:03 xcxhy

It's the same as StarCoder. We apply FIM inside each file regardless of the repo structure (the filepath at the beginning of a file is optional), so you can do:

<fim_prefix>prefix<fim_suffix>suffix<fim_middle>
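
A minimal Python sketch of assembling that prompt. The helper name `build_fim_prompt` is hypothetical; the token strings are the ones quoted in this thread.

```python
# Special tokens as quoted in this thread (PSM order: prefix, suffix, middle).
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a fill-in-the-middle prompt (hypothetical helper).

    The model is expected to generate the missing middle after <fim_middle>.
    """
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


# Example: ask the model to fill in the body of the return statement.
prompt = build_fim_prompt(
    prefix="def add(a, b):\n    return ",
    suffix="\n\nprint(add(1, 2))\n",
)
```

The resulting string is passed to the model as-is, with no extra template around it.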

loubnabnl avatar Mar 04 '24 19:03 loubnabnl

Thanks for your attention, but the thing is that the output won't stop when I apply this formatting, just like you, @xcxhy: [image] However, it seems to be following a pattern, as shown in the picture above, so you can fix it by, @xcxhy: [image]

ARIELDENG avatar Mar 05 '24 03:03 ARIELDENG

Yes, <file_sep> is the token we use to separate files, so you can use it as a stop token. The <|endoftext|> token was used to separate repositories, since we now concatenate files from the same repo in one sample.
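
If the inference stack has no stop-sequence option, one possible workaround is to cut the decoded output at the first separator. This is only a sketch: `truncate_at_stop` is a hypothetical helper, and it assumes the decoded text still contains the special tokens as literal strings (that is, they were not stripped during decoding).

```python
# Tokens named in this thread: <file_sep> separates files,
# <|endoftext|> separates repositories.
STOP_TOKENS = ("<file_sep>", "<|endoftext|>")


def truncate_at_stop(text: str, stops=STOP_TOKENS) -> str:
    """Return text up to (not including) the earliest stop token.

    Hypothetical post-processing helper; if no stop token occurs,
    the text is returned unchanged.
    """
    cut = len(text)
    for stop in stops:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


completion = truncate_at_stop("a + b<file_sep>import os\n")
```

Stopping generation at the token itself is cheaper than truncating afterwards, so this is a fallback rather than a replacement for a real stop sequence.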

loubnabnl avatar Mar 05 '24 15:03 loubnabnl

Thank you so much, and the StarCoder series is really amazing! Recently I've been using the models for SFT to better adapt them to our users' habits, and I've seen great improvement.

ARIELDENG avatar Mar 05 '24 15:03 ARIELDENG

When using ollama, all you need to do is set <file_sep> as a stop sequence. When using https://github.com/huggingface/llm-vscode, it would be:

  "llm.requestBody": {
    "stream": true,
    "options": {
      "stop": [
        "<file_sep>"
      ],
      "temperature": 0
    }
  },
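
For calling ollama directly, the same options can be sent to its `/api/generate` endpoint. The sketch below is written under assumptions: the model tag `starcoder2:3b`, the default local address `http://localhost:11434`, and the helper names are not from this thread, and `"raw": True` is used so that the FIM prompt is passed through without ollama applying a prompt template.

```python
import json
import urllib.request

# Assumed default address of a local ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_ollama_request(prefix: str, suffix: str, model: str = "starcoder2:3b") -> dict:
    """Build a request body with <file_sep> as stop sequence (hypothetical helper)."""
    prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"
    return {
        "model": model,      # assumed model tag
        "prompt": prompt,
        "raw": True,         # send the prompt without a template
        "stream": False,     # return one JSON object instead of a stream
        "options": {"stop": ["<file_sep>"], "temperature": 0},
    }


def complete(prefix: str, suffix: str) -> str:
    """Send the request to a running ollama server and return the completion."""
    body = json.dumps(build_ollama_request(prefix, suffix)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling `complete("def add(a, b):\n    return ", "\n")` requires a running ollama server with the model already pulled.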

robertpiosik avatar Jul 08 '24 13:07 robertpiosik