llama: use sliding window for phi3

Open FanShupei opened this issue 7 months ago • 0 comments

Related issue report: #7709

This PR switches the Phi3 model to sliding window attention. With this change, the model no longer generates broken output past the 2,048th token. Tested on the "phi3-mini-4k-instruct" model.
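
For readers unfamiliar with the technique, here is a minimal sketch of a causal sliding-window attention mask. It is illustrative only and is not the code from this PR: the function name `sliding_window_mask`, the tiny window size, and the convention that the window counts the current token are all assumptions made for the example.

```python
def sliding_window_mask(n_tokens: int, window: int) -> list[list[bool]]:
    """Build a causal sliding-window mask (hypothetical helper).

    mask[i][j] is True when query position i may attend to key position j.
    Assumed convention: a query sees itself plus the (window - 1) keys
    immediately before it, and never any later position.
    """
    return [
        [j <= i and i - j < window for j in range(n_tokens)]
        for i in range(n_tokens)
    ]


if __name__ == "__main__":
    # Small sizes so the pattern is easy to read; real windows are far larger.
    for row in sliding_window_mask(n_tokens=5, window=3):
        print("".join("x" if allowed else "." for allowed in row))
```

With plain causal attention every earlier position stays visible. With a window, positions that fall more than `window - 1` steps behind the query are masked out, so a model trained this way can misbehave on long contexts if the inference code does not apply the same mask.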

TODO: (DONE) ~~convert_hf_to_gguf.py changes~~

FanShupei, Jul 22 '24 09:07