llama2.rs
Some llama2 finetunes don't seem to work
I got https://huggingface.co/TheBloke/Llama-2-13B-GPTQ to work, but using exactly the same strategy for https://huggingface.co/TheBloke/OpenOrca-Platypus2-13B-GPTQ, I get the following error:
```
$ RUST_BACKTRACE=1 target/release/llama2_rs -c l13b.act64.bin -t 0.0 -s 25 -p "Hello to all the cool people out there who " --debug
Configuration: Config { dim: 5120, hidden_dim: 13824, n_layers: 40, n_heads: 40, n_kv_heads: 40, vocab_size: 32000, seq_len: 2048, shared_weight: false }
thread 'main' panicked at src/main.rs:106:9:
assertion `left == right` failed
  left: 8556630020
 right: 8556548100
```
The difference between the two sizes is always 81920, which equals 40 * 2048; both of those numbers appear in constants.rs for the 13B models, so maybe that's relevant.
(intuition tells me it's because this is a LoRA finetune)
Oh weird, for some reason they added 2 additional word tokens: 2 tokens * 5120 (dim) * 2 matrices (the token embedding and the output projection, since shared_weight is false) * 4 bytes per f32 = 81920 bytes, exactly the discrepancy.
I'll take them out for now and think about a better way to handle it.