
Potentially Incorrect Configs

Open andrewgross opened this issue 4 months ago • 8 comments

The new CodeLlama 70b Instruct model seems to have incorrect settings for rope_theta and max_position_embeddings.

CodeLlama 34b Values:

"rope_theta": 1000000
"max_position_embeddings": 16384

CodeLlama 70b Values:

"rope_theta": 10000
"max_position_embeddings": 2048

This seems to be the case for the HF safetensors version as well as the version downloaded via download.sh.

https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf/blob/main/config.json

Are these config values correct or should we be able to use the 34b values?
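
For anyone who wants to reproduce the comparison locally, here is a rough sketch (assuming the transformers library is installed; the 34b repo name below is an assumption based on the public codellama organization on Hugging Face) that prints the two values from each published config:

from transformers import AutoConfig

# Print the RoPE/context settings shipped in each config.json on the Hub.
for repo in ("codellama/CodeLlama-34b-Instruct-hf",   # assumed repo name
             "codellama/CodeLlama-70b-Instruct-hf"):
    cfg = AutoConfig.from_pretrained(repo)
    print(repo, cfg.rope_theta, cfg.max_position_embeddings)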

andrewgross avatar Jan 29 '24 19:01 andrewgross

Hi @andrewgross, indeed rope_theta for 70b-Instruct and 70b-Python is 10,000. max_position_embeddings on HuggingFace will be fixed, the correct value is 4096 -- thanks for letting us know!
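
Until the config on the Hub is updated, one option is to patch the value at load time; a minimal sketch, assuming the transformers library (the 4096 figure is the corrected value quoted above, not something published in the config yet):

from transformers import AutoConfig, AutoModelForCausalLM

repo = "codellama/CodeLlama-70b-Instruct-hf"

# Load the published config and override max_position_embeddings with the
# corrected value from this thread; rope_theta (10000) is left as shipped.
cfg = AutoConfig.from_pretrained(repo)
cfg.max_position_embeddings = 4096

model = AutoModelForCausalLM.from_pretrained(repo, config=cfg)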

jgehring avatar Jan 29 '24 21:01 jgehring

Great, thanks for the info.

andrewgross avatar Jan 29 '24 21:01 andrewgross

Sorry for the question @jgehring, but if the correct value is 4096, is the model still capable of going up to 16k context without many issues, or will it show a lot of degradation past 8k?

Thanks for the models in any case!

LavaPlanetLLM avatar Jan 29 '24 21:01 LavaPlanetLLM

Is the max_position_embeddings still 16384 for the regular codellama-70b? And I'm curious why it is lower for the Python and Instruct variants @jgehring, would love some clarity.

michaelroyzen avatar Jan 30 '24 02:01 michaelroyzen

It's 16k only for the base (pretrained) Code Llama 70B @michaelroyzen

syhw avatar Jan 30 '24 10:01 syhw

But why does validation of params.json in CodeLlama-70b-Python with "rope_theta": 10000 fail:

Checking checksums
consolidated.00.pth: OK
consolidated.01.pth: OK
consolidated.02.pth: OK
consolidated.03.pth: OK
consolidated.04.pth: OK
consolidated.05.pth: OK
consolidated.06.pth: OK
consolidated.07.pth: OK
params.json: FAILED
tokenizer.model: OK
md5sum: WARNING: 1 computed checksum did NOT match

while CodeLlama-70b with "rope_theta": 1000000 passes:

Checking checksums
consolidated.00.pth: OK
consolidated.01.pth: OK
consolidated.02.pth: OK
consolidated.03.pth: OK
consolidated.04.pth: OK
consolidated.05.pth: OK
consolidated.06.pth: OK
consolidated.07.pth: OK
params.json: OK
tokenizer.model: OK


nazarov-yuriy avatar Jan 30 '24 14:01 nazarov-yuriy

@jgehring, could you share the file expected by checklist.chk (with md5 equal to 2e6de9333f10527d5976d32d8bcddd05)?

$ grep params.json CodeLlama-70b*/checklist.chk
CodeLlama-70b-Instruct/checklist.chk:2e6de9333f10527d5976d32d8bcddd05  params.json
CodeLlama-70b-Python/checklist.chk:2e6de9333f10527d5976d32d8bcddd05  params.json
CodeLlama-70b/checklist.chk:a4d42626f8b801baf33c4e9a27fc52e7  params.json


nazarov-yuriy avatar Jan 30 '24 17:01 nazarov-yuriy

Good catch @nazarov-yuriy, this is an error in the checklist.chk file which we initially provided. This has now been fixed and new downloads will retrieve the corrected checksum file. If you want to verify the params.json you have on disk for CodeLlama-70b-Instruct and CodeLlama-70b-Python, its MD5 checksum is 184c6afa048cf53e3f8755904556b2cb.
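
For anyone who wants to double-check their local copy, a small sketch using Python's hashlib (the path is an assumption; point it at wherever download.sh placed the weights):

import hashlib

# Compare the local params.json against the corrected checksum quoted above.
EXPECTED = "184c6afa048cf53e3f8755904556b2cb"
with open("CodeLlama-70b-Instruct/params.json", "rb") as f:
    digest = hashlib.md5(f.read()).hexdigest()
print(digest, "OK" if digest == EXPECTED else "FAILED")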

jgehring avatar Jan 30 '24 21:01 jgehring