codellama
Potentially Incorrect Configs
The new CodeLlama 70b Instruct model seems to have incorrect settings for `rope_theta` and `max_position_embeddings`.
CodeLlama 34b values:

```
"rope_theta": 1000000
"max_position_embeddings": 16384
```

CodeLlama 70b values:

```
"rope_theta": 10000
"max_position_embeddings": 2048
```
This seems to be the case for the HF safetensors version as well as the version downloaded via `download.sh`.
https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf/blob/main/config.json
Are these config values correct or should we be able to use the 34b values?
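For context on why these two values tend to travel together: RoPE derives per-dimension rotation frequencies from `rope_theta`, and a larger base stretches the lowest frequencies so distant positions remain distinguishable over longer contexts. A minimal sketch of that relationship (illustrative only, not the official Code Llama implementation; the head dimension of 128 is an assumption):

```python
# Illustrative sketch of RoPE inverse frequencies. The head dimension (128)
# is an assumed value, not taken from the configs above.
def rope_inv_freq(theta: float, dim: int = 128):
    """Per-pair inverse frequencies used by rotary position embeddings."""
    return [theta ** (-2 * i / dim) for i in range(dim // 2)]

# A larger rope_theta lowers the smallest frequency, which is one reason
# long-context configs raise it alongside max_position_embeddings.
base = rope_inv_freq(10_000)       # "rope_theta": 10000
long_ctx = rope_inv_freq(1_000_000)  # "rope_theta": 1000000
```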
Hi @andrewgross, indeed `rope_theta` for 70b-Instruct and 70b-Python is 10,000. `max_position_embeddings` on Hugging Face will be fixed; the correct value is 4096 -- thanks for letting us know!
Great, thanks for the info.
Sorry for the question @jgehring, but if the correct value is 4096, can this still scale up to 16k without major issues, or will it degrade noticeably past 8k?
Thanks for the models in any case!
Is `max_position_embeddings` still 16384 for the regular CodeLlama-70b? I'm also curious why it is lower for the Python and Instruct variants @jgehring; would love some clarity.
It's 16k only for the base (pretrained) Code Llama 70B @michaelroyzen
But why does validation of `params.json` in CodeLlama-70b-Python (with `"rope_theta": 10000`) fail?

```
Checking checksums
consolidated.00.pth: OK
consolidated.01.pth: OK
consolidated.02.pth: OK
consolidated.03.pth: OK
consolidated.04.pth: OK
consolidated.05.pth: OK
consolidated.06.pth: OK
consolidated.07.pth: OK
params.json: FAILED
tokenizer.model: OK
md5sum: WARNING: 1 computed checksum did NOT match
```
And CodeLlama-70b (with `"rope_theta": 1000000`) passes:

```
Checking checksums
consolidated.00.pth: OK
consolidated.01.pth: OK
consolidated.02.pth: OK
consolidated.03.pth: OK
consolidated.04.pth: OK
consolidated.05.pth: OK
consolidated.06.pth: OK
consolidated.07.pth: OK
params.json: OK
tokenizer.model: OK
```
@jgehring could you share the `params.json` expected by `checklist.chk` (with md5 equal to 2e6de9333f10527d5976d32d8bcddd05)?

```
$ grep params.json CodeLlama-70b*/checklist.chk
CodeLlama-70b-Instruct/checklist.chk:2e6de9333f10527d5976d32d8bcddd05 params.json
CodeLlama-70b-Python/checklist.chk:2e6de9333f10527d5976d32d8bcddd05 params.json
CodeLlama-70b/checklist.chk:a4d42626f8b801baf33c4e9a27fc52e7 params.json
```
Good catch @nazarov-yuriy, this is an error in the `checklist.chk` file which we initially provided. This has now been fixed, and new downloads will retrieve the corrected checksum file. If you want to verify the `params.json` you have on disk for CodeLlama-70b-Instruct and CodeLlama-70b-Python, its MD5 checksum is `184c6afa048cf53e3f8755904556b2cb`.
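For anyone who wants to check the checksum above without re-running `download.sh`, here is a small sketch that streams a file through MD5 (the file path in the comment is an example, not a guaranteed layout):

```python
import hashlib

# MD5 checksum quoted in the reply above for the corrected params.json.
EXPECTED_MD5 = "184c6afa048cf53e3f8755904556b2cb"

def md5_of(path: str) -> str:
    """Stream a file through MD5 in 1 MiB chunks and return the hex digest."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Example usage (path is illustrative):
# assert md5_of("CodeLlama-70b-Instruct/params.json") == EXPECTED_MD5
```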