ZipXuan
This chart in the [llama3 paper](https://arxiv.org/abs/2407.21783) contains an error: the number of key/value heads for the 405B model is 16 rather than 8. See this [link](https://www.reddit.com/r/LocalLLaMA/comments/1eoin62/meta_just_pushed_a_new_llama_31_405b_to_hf/?rdt=50225) for the details.
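
To show why the 8 vs. 16 difference matters in practice, here is a rough sketch of how the key/value head count scales the KV cache size per token. The layer count (126), head dimension (128), and 2-byte (bf16/fp16) values are my assumptions for the 405B model, not figures taken from this comment, and `kv_cache_bytes_per_token` is just an illustrative helper name.

```python
def kv_cache_bytes_per_token(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    bytes_per_value: int = 2,  # assumption: bf16/fp16 cache entries
) -> int:
    """Bytes of KV cache stored for one token across all layers."""
    # Each layer stores one key tensor and one value tensor per token,
    # each of shape (n_kv_heads, head_dim).
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value


# Assumed 405B settings (not stated in the comment above).
N_LAYERS = 126
HEAD_DIM = 128

for n_kv_heads in (8, 16):
    size = kv_cache_bytes_per_token(N_LAYERS, n_kv_heads, HEAD_DIM)
    print(f"{n_kv_heads} KV heads: {size / 1024:.0f} KiB per token")
# 8 KV heads: 504 KiB per token
# 16 KV heads: 1008 KiB per token
```

Under these assumptions the cache size is simply proportional to the number of KV heads, so a config with 16 heads needs twice the cache memory of one with 8.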