starcoder2
starcoder2 copied to clipboard
support SPM mode for FIM prompts
trafficstars
from fim paper (https://arxiv.org/pdf/2207.14255.pdf) section 3.1: SPM mode can be used to reuse kv cache across completion requests.
SPM modes can enable further latency optimization (which is very important in case of code completion tools). is there any reason that startcoder models are using normal PSM mode?