liuwei-git
Results: 2 issues of liuwei-git
Make phi3 as an explicit model to support in llama.
The only difference between the phi3 4k and 128k models is the rotary embedding. The 128k model adds long/short RoPE scaling factors (freq_factors) and an attn factor to each hidden dimension...
model
review complexity : high
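
The rotary-embedding difference described in the summary above can be illustrated with a minimal sketch. It is not the code from the pull request. The function names (`rope_angles`, `apply_rope`, `pick_factors`), the adjacent-pair rotation layout, the default base of 10000, and the rule of switching to the long factors once the sequence exceeds the original context length are all assumptions made for illustration. The sketch assumes each per-dimension frequency is divided by its `freq_factors` entry and that the attn factor scales the cos/sin terms.

```python
import math


def rope_angles(pos, head_dim, base=10000.0, freq_factors=None):
    """Return one rotation angle per dimension pair for a token at `pos`."""
    half = head_dim // 2
    if freq_factors is None:
        # Plain RoPE (the 4k-style case in this sketch): no per-dimension rescaling.
        freq_factors = [1.0] * half
    # Assumed convention: the base frequency of pair i is divided by its factor.
    return [
        pos * base ** (-2.0 * i / head_dim) / freq_factors[i]
        for i in range(half)
    ]


def apply_rope(vec, pos, base=10000.0, freq_factors=None, attn_factor=1.0):
    """Rotate adjacent pairs of `vec`; cos/sin are scaled by `attn_factor`."""
    out = []
    for i, theta in enumerate(rope_angles(pos, len(vec), base, freq_factors)):
        x0, x1 = vec[2 * i], vec[2 * i + 1]
        c = math.cos(theta) * attn_factor
        s = math.sin(theta) * attn_factor
        out += [x0 * c - x1 * s, x0 * s + x1 * c]
    return out


def pick_factors(seq_len, original_ctx, short_factors, long_factors):
    """Hypothetical selection rule: long factors beyond the original context."""
    return long_factors if seq_len > original_ctx else short_factors


if __name__ == "__main__":
    q = [1.0, 0.0, 0.5, -0.5]
    factors = pick_factors(8192, 4096, [1.0, 1.0], [2.0, 4.0])
    print(apply_rope(q, pos=3, freq_factors=factors, attn_factor=1.1))
```

With `freq_factors=None` and `attn_factor=1.0` the sketch reduces to ordinary RoPE, which mirrors the claim that the two model variants differ only in this step.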