LlamaGen
Issues about the 3B model
Thanks for your fascinating work!
I'm now trying the 3B model and have encountered two issues:
- The JSON config for the 3B model is missing. I tried modifying the XXL version's JSON to match the checkpoint and the statistics in the paper, but ran into another issue:
ValueError: Head size 100 is not supported by PagedAttention. Supported head sizes are: [64, 80, 96, 112, 128, 256]. (raised from xformers)
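For context on where the unsupported head size 100 comes from: the per-head size is the model's hidden width divided by its number of attention heads, and PagedAttention only supports a fixed set of sizes. The sketch below is illustrative only; the 3B width/head values (3200 and 32) are assumptions chosen to reproduce the error, not confirmed config values.

```python
# Supported head sizes, taken from the PagedAttention error message above.
SUPPORTED_HEAD_SIZES = [64, 80, 96, 112, 128, 256]

def head_size(hidden_size: int, num_heads: int) -> int:
    """Per-head dimension: hidden width split evenly across attention heads."""
    assert hidden_size % num_heads == 0, "hidden size must divide evenly by heads"
    return hidden_size // num_heads

# Hypothetical 3B-like config: width 3200 with 32 heads gives head size 100,
# which is not in the supported list and would trigger the ValueError.
hs = head_size(3200, 32)
print(hs, hs in SUPPORTED_HEAD_SIZES)  # 100 False
```

So any hand-edited JSON must keep `hidden_size / num_heads` inside the supported list for the PagedAttention backend to accept it.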
Hello, @Con6924
This issue has been fixed in https://github.com/FoundationVision/LlamaGen/pull/23. Please give the main branch a try.