
Support for LLaMA-2 70B with Grouped-Query Attention

Open kaiwang13 opened this issue 11 months ago • 18 comments

Due to the Grouped-Query Attention introduced in LLaMA-2 70B (see the llama issue), the model cannot be loaded by the CoLLiE implementation of LLaMA. I hope LLaMA-2 70B can be supported in CoLLiE. Thanks.
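
For reference, a minimal sketch of how GQA changes the projection shapes, assuming the published LLaMA-2 70B configuration (hidden_size 8192, 64 query heads, 8 key/value heads); the layer names here are illustrative only, not CoLLiE's actual module names:

import torch.nn as nn

hidden_size = 8192
num_heads = 64                          # query heads
num_kv_heads = 8                        # key/value heads (GQA)
head_dim = hidden_size // num_heads     # 128

# Multi-head attention: q, k and v projections are all square [8192, 8192].
# Grouped-query attention: only q stays square; k and v project down to
# num_kv_heads * head_dim = 1024 features.
q_proj = nn.Linear(hidden_size, num_heads * head_dim, bias=False)
k_proj = nn.Linear(hidden_size, num_kv_heads * head_dim, bias=False)
v_proj = nn.Linear(hidden_size, num_kv_heads * head_dim, bias=False)

print(q_proj.weight.shape)  # torch.Size([8192, 8192])
print(k_proj.weight.shape)  # torch.Size([1024, 8192])
print(v_proj.weight.shape)  # torch.Size([1024, 8192])

The actual error when loading the checkpoint: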

Traceback (most recent call last):
  File "/nvme1/gptdata/share1/projects/collie/examples/download.py", line 49, in <module>
    model = LlamaForCausalLM.from_pretrained(model_name, config=config)
  File "/nvme1/gptdata/share1/app/mambaforge/envs/collie/lib/python3.9/site-packages/collie/models/base.py", line 306, in from_pretrained
    state_dict = cls.load_parallel_state_dict(
  File "/nvme1/gptdata/share1/app/mambaforge/envs/collie/lib/python3.9/site-packages/collie/models/llama/model.py", line 414, in load_parallel_state_dict
    part_state_dict[key] = rearrange(
RuntimeError: shape '[8192, 8192]' is invalid for input of size 8388608
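
The numbers in the error are consistent with a GQA checkpoint: the k/v weight has 8192 * (8 * 128) = 8,388,608 elements, which cannot be reshaped into the square [8192, 8192] layout the loader assumes. A quick check, using the same assumed config values as above:

hidden_size, num_kv_heads, head_dim = 8192, 8, 128
print(hidden_size * num_kv_heads * head_dim)  # 8388608  -> the "input of size 8388608"
print(hidden_size * hidden_size)              # 67108864 -> what a [8192, 8192] reshape expects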

kaiwang13 · Jul 21 '23 16:07