Alexey Mametyev

Results: 14 comments by Alexey Mametyev

The same problem :(

```
%pip install "unsloth[cu121amperetorch220] @ git+https://github.com/unslothai/unsloth.git"
Defaulting to user installation because normal site-packages is not writeable
WARNING: Ignoring invalid distribution -orch (/home/jupyter/.local/lib/python3.10/site-packages)
Collecting unsloth@ git+https://github.com/unslothai/unsloth.git (from unsloth[cu121amperetorch220]@...
```

I'll use a Tesla A100 with 80 GB VRAM + 512 GB RAM.

> Seems like you'll be a little bit short on VRAM. Full fp16 model requires ~87GB. The table is taken from our [tech report](https://arxiv.org/pdf/2312.17238.pdf).
>
> ![image](https://github.com/dvmazur/mixtral-offloading/assets/43727641/735e6ec2-ee72-4e18-b9e5-939c54501671)

I'll unload some...

> If you decide to go down that path, I can help you out a bit in this issue :)

Thanks, I'd appreciate your help with this. Also I'll...

I've tried to rewrite your code to add fp16 support using your tips, but I ran into some difficulties: I don't understand where exactly in `replace_layer_storage` we use quantization. As...

An LLM is used to predict the next token (word) based on the previous history. To train such a model, we feed it a lot of text, which is divided into tokens (words)....
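For intuition, here is a minimal sketch of next-token prediction with the `transformers` library; the choice of `gpt2` and the prompt are just assumptions for the demo.

```python
# Minimal next-token prediction sketch; "gpt2" is an arbitrary small
# causal LM chosen for the demo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# The prompt is split into tokens; the model scores every candidate next token.
inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, seq_len, vocab_size)

# Distribution over the token that would follow the prompt.
next_token_probs = logits[0, -1].softmax(dim=-1)
print(tokenizer.decode(int(next_token_probs.argmax())))
```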

You have a problem with your dataset. It should contain at least 2000 items because the test size is 2000, but it has only one item.
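To illustrate, here is a minimal sketch with the `datasets` library (the one-row dataset and the `text` column are made up): asking for a 2000-item test split from a single-item dataset fails immediately.

```python
# Hypothetical one-item dataset to reproduce the error.
from datasets import Dataset

ds = Dataset.from_dict({"text": ["only one example"]})

# test_size=2000 requests 2000 held-out rows, but the dataset has 1 row,
# so train_test_split raises a ValueError.
ds.train_test_split(test_size=2000)
```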

To fine-tune a 7B model with QLoRA you need at least 16 GB of GPU RAM, so you need something like a 3090 or better. To fine-tune 3B models you can use Google Colab with no...
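As a rough sketch of what fits in 16 GB: load the base model in 4-bit with `bitsandbytes` before attaching LoRA adapters. The model name and LoRA hyperparameters below are illustrative assumptions, not a recommendation.

```python
# QLoRA-style setup sketch; model name and hyperparameters are assumptions.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # any 7B causal LM; hypothetical choice
    quantization_config=bnb_config,
    device_map={"": 0},
)
model = prepare_model_for_kbit_training(model)
model = get_peft_model(
    model,
    LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM"),
)
model.print_trainable_parameters()
```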

Same problem, the LoRA adapters have no grad despite `is_trainable=True`:

```python
from peft import prepare_model_for_kbit_training, LoraConfig, get_peft_model
from transformers import LlamaForCausalLM, LlamaTokenizer
import torch

load_in_8bit = True
model = ...
```

```python
config = LoraConfig.from_pretrained(
    'path',
    is_trainable=True,
    torch_dtype=torch.float16,
    device_map={'': 0},
)
config.inference_mode = False
```

This helped me.
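One way to confirm the fix took effect (assuming `model` is the PEFT model loaded with this config):

```python
# Sanity check: with inference_mode=False the LoRA parameters should
# require gradients; `model` is the loaded PeftModel (assumed here).
model.print_trainable_parameters()
assert any(
    p.requires_grad for n, p in model.named_parameters() if "lora_" in n
)
```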