juanps90

Results: 12 comments by juanps90

> FYI: I just submitted this pull request to integrate llama.cpp into langchain: #2242

Thank you very much!! Do you think it would be possible to run LLaMA on GPU...

I'm interested in finetuning as well. Does anyone have any recommendations for this?

Are you able to train 7B using dual RTX 3090s? Do you think you could set up a notebook on Colab? Thank you!!

This appears to be related to CodeLlama 34B specifically, as the 13B variant works with LoRA and about 13K context (haven't tried more).

I am using Neko-Institute-of-Science_LLaMA-30B-4bit-128g with no context-scaling training at all. As I understand it, NTK RoPE scaling does not require any finetuning, unlike SuperHOT. Am I setting the...

> I think you need to call `config.calculate_rotary_embedding_base()` with the current way RoPE NTK scaling is implemented for the settings to properly take effect. Make sure `config.alpha_value` is already set...
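
That ordering matters: set `config.alpha_value` first, then call `config.calculate_rotary_embedding_base()`, and only then build the model. A minimal sketch of what that might look like with exllama, assuming the repo's `ExLlama`/`ExLlamaCache` classes and module layout (paths and values are placeholders):

```
# Sketch only: names besides alpha_value / calculate_rotary_embedding_base()
# are assumed from the exllama repo layout.
from model import ExLlama, ExLlamaCache, ExLlamaConfig

model_config_path = "/path/to/config.json"      # placeholder
model_path = "/path/to/model.safetensors"       # placeholder

config = ExLlamaConfig(model_config_path)
config.model_path = model_path
config.max_seq_len = 8192                  # extended context target
config.alpha_value = 2.0                   # NTK RoPE scaling factor (illustrative)
config.calculate_rotary_embedding_base()   # recompute RoPE base AFTER setting alpha_value

model = ExLlama(config)                    # model is built with the scaled base
cache = ExLlamaCache(model)
```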

I'm having a weird issue where the model skips or adds digits in numbers. For example, if there's a phone number in the prompt, the generated text may add another...

> I've seen that effect while running a linear-scaled LoRA (SuperHOT or Airoboros 8k or 16k) with the wrong compress_pos_emb value. If it's set to anything other than what it...
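
So for a linear-scaled LoRA, `compress_pos_emb` has to equal the scaling factor the LoRA was trained with (for an 8k SuperHOT-style LoRA on a 2048-token base, that's 8192 / 2048 = 4). A hedged sketch of the matching config, reusing the same assumed `ExLlamaConfig` attributes with illustrative values:

```
# Sketch: compress_pos_emb must match the LoRA's training-time scale factor.
from model import ExLlamaConfig

config = ExLlamaConfig("/path/to/config.json")    # placeholder path
config.model_path = "/path/to/model.safetensors"  # placeholder path
config.max_seq_len = 8192        # context the 8k LoRA was trained for
config.compress_pos_emb = 4.0    # 8192 / 2048; a mismatched value garbles output
config.alpha_value = 1.0         # leave NTK scaling off when linear scaling is used
```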

Well, LLaMA v2 13B GPTQ from The-Bloke goes NUTS after I do:

```
config = ExLlamaConfig(model_config_path)  # create config from config.json
config.model_path = model_path             # supply path to model weights...
```
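
For reference, the rest of the load-and-generate path typically continues from that config. A rough sketch, assuming exllama's `ExLlamaTokenizer`, `ExLlamaCache` and `ExLlamaGenerator` classes and the `generate_simple()` helper (all paths and settings are placeholders):

```
# Rough end-to-end sketch; class and module names assumed from the exllama repo.
from model import ExLlama, ExLlamaCache, ExLlamaConfig
from tokenizer import ExLlamaTokenizer
from generator import ExLlamaGenerator

model_config_path = "/path/to/config.json"       # placeholder paths
model_path = "/path/to/model.safetensors"
tokenizer_path = "/path/to/tokenizer.model"

config = ExLlamaConfig(model_config_path)   # create config from config.json
config.model_path = model_path              # supply path to model weights

model = ExLlama(config)                     # load the quantized weights
tokenizer = ExLlamaTokenizer(tokenizer_path)
cache = ExLlamaCache(model)                 # KV cache sized from config.max_seq_len
generator = ExLlamaGenerator(model, tokenizer, cache)

print(generator.generate_simple("Once upon a time,", max_new_tokens=64))
```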