llama2.rs
Some llama2 finetunes don't seem to work
I got https://huggingface.co/TheBloke/Llama-2-13B-GPTQ to work, but using exactly the same strategy for https://huggingface.co/TheBloke/OpenOrca-Platypus2-13B-GPTQ, I get the following error:
```
$ RUST_BACKTRACE=1 target/release/llama2_rs -c l13b.act64.bin -t 0.0 -s 25 -p "Hello to all the cool people out there who " --debug
Configuration: Config { dim: 5120, hidden_dim: 13824, n_layers: 40, n_heads: 40, n_kv_heads: 40, vocab_size: 32000, seq_len: 2048, shared_weight: false }
thread 'main' panicked at src/main.rs:106:9:
assertion `left == right` failed
  left: 8556630020
 right: 8556548100
```
The difference between the two sizes is always 81920, which equals 40 * 2048; both of those numbers appear in constants.rs for the 13B models, so maybe that's relevant.
(intuition tells me it's because this is a LoRA finetune)
Oh weird, for some reason they added 2 additional word tokens: 2 tokens * 5120 (dim) * 2 matrices (the token embedding and the output projection, since shared_weight is false) * 4 bytes per f32 = 81920 bytes, exactly the discrepancy.
I'll take them out for now and think about a better way to handle it.