Forkoz


It built, but I have no way to select which GPU to use.
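Until there is a proper option, one workaround sketch (not a built-in feature of the loader; the env vars are standard CUDA/ROCm behavior) is to hide the other GPUs before torch initializes:

```python
import os

# Must be set before torch/CUDA initializes. "1" picks the second GPU.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"   # NVIDIA
os.environ["HIP_VISIBLE_DEVICES"] = "1"    # ROCm equivalent

import torch
print(torch.cuda.device_count())  # should now report only the selected GPU
```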

Bug in HIP or ROCm; on NVIDIA the split works. The other bug is the OOM: if the model isn't dispatched properly, it runs out of memory during inference.
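A workaround for the OOM half of this is to skip auto-dispatch and pass an explicit split so the first card keeps headroom for the cache and activations. A minimal sketch, assuming exllama v1's `ExLlamaConfig.set_auto_map()`, with the per-GPU GB figures as pure guesses:

```python
from model import ExLlamaConfig  # exllama v1's model.py on sys.path

config = ExLlamaConfig("/path/to/model/config.json")
config.model_path = "/path/to/model.safetensors"
# Leave several GB free on GPU 0 so generation doesn't OOM mid-inference:
config.set_auto_map("16,22")  # ~16 GB of layers on GPU 0, ~22 GB on GPU 1
```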

What will it do with how it is in textgen now?

```python
if shared.args.alpha_value > 1 or shared.args.rope_freq_base > 0:
    config.alpha_value = RoPE.get_alpha_value(shared.args.alpha_value, shared.args.rope_freq_base)
    config.calculate_rotary_embedding_base()
```

Will theta be overwritten...
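For reference, what `calculate_rotary_embedding_base()` does in exllama, paraphrased from memory rather than quoted (so treat the exact exponent as an assumption): it multiplies whatever theta was loaded from config.json by the NTK alpha factor, so an existing theta gets compounded rather than replaced:

```python
head_dim = 128
alpha_value = 2.0
rotary_embedding_base = 1_000_000.0  # e.g. codellama's theta from config.json

# Sketch of config.calculate_rotary_embedding_base():
rotary_embedding_base *= alpha_value ** (head_dim / (head_dim - 2))
print(rotary_embedding_base)  # ~2.02e6: theta is scaled, not overwritten
```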

So with alpha, it will need to be changed to set the base to 10000.0 and then apply the alpha scaling on top of it. And if theta is specified, just apply it directly.
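Something like this, mirroring the shape of textgen's `RoPE` helpers (the 64/63 exponent assumes head_dim = 128; a sketch, not the exact upstream code):

```python
def get_rope_freq_base(alpha_value: float, rope_freq_base: float) -> float:
    # If theta (rope_freq_base) is specified, apply it directly.
    if rope_freq_base > 0:
        return rope_freq_base
    # Otherwise start from the spec base of 10000.0 and apply NTK alpha
    # scaling on top: base * alpha ** (dim / (dim - 2)) with dim = 128.
    return 10000.0 * alpha_value ** (64 / 63)
```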

Right, but I ran perplexity tests on codellama 34b + the airoboros 2.1 lora; the numbers are lower by more than a whole point when you use an alpha of ~2.7 than...

I'm not either. That's a lot of gymnastics, but most people *usually* want to run at spec. It's also probably why we aren't seeing so many fine-tunes for the 34b,...

I think V2 is in the works. Not sure if it will have support for the P40, but then again, you have llama.cpp, which is all FP32, and I can run...

It's exactly the same as alpha. BTW, the "base" theta for the codellama base model corresponds to about alpha 100.
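Quick arithmetic check with the same NTK formula as above (codellama's configured theta of 1,000,000 is the known reference point; the formula itself is the assumption):

```python
alpha = 100
base = 10000.0 * alpha ** (64 / 63)
print(f"{base:,.0f}")  # ~1,076,000, close to codellama's theta of 1,000,000
```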

I don't think it runs on Metal. Only AMD/NVIDIA so far.

Getting `Unsupported tensor Dtype` when loading the guanaco lora above with llama-70b using exllama_hf. I will try without fused attention.