
Repo for fine-tuning Causal LLMs

5 Finetune_LLMs issues, sorted by recently updated

Using --model_name_or_path hivemind/gpt-j-6B-8bit: RuntimeError: The expanded size of the tensor (50257) must match the existing size (0) at non-singleton dimension 0. Target sizes: [50257]. Tensor sizes: [0]

I can't figure out how to fix this error. I am trying to run the example_run.txt from https://github.com/mallorbc/Finetune_GPTNEO_GPTJ6B/blob/main/finetuning_repo/example_run.txt. When I run it I get this error; it has an...
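
A minimal sketch of the call where this mismatch usually surfaces, assuming the trace points at `resize_token_embeddings` inside the training script (the snippet does not confirm this). The hivemind/gpt-j-6B-8bit checkpoint stores quantized weights, so the vanilla loader may leave the embedding matrix empty; loading the standard checkpoint first is one way to check that the rest of the command works.

```python
# Hedged reproduction sketch; the failure point is assumed, not confirmed.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

# The fine-tuning script resizes the embedding matrix to the tokenizer's vocab size.
# If the checkpoint's embedding weights were never materialized (existing size 0),
# this is where "expanded size ... must match the existing size (0)" is raised.
model.resize_token_embeddings(len(tokenizer))
print(model.get_input_embeddings().weight.shape)  # torch.Size([50257, 4096])
```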

In your example_run.txt command-line example for DeepSpeed, should "--block_size 2048" perhaps be set? Without it, it looks like the script picks up the GPT-2 default of 1024, but GPT-J rather...
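
For illustration, a sketch of the block-size chunking that run_clm-style scripts apply after tokenization (the function name and values are illustrative, not the repo's exact code). With the GPT-2 default of 1024, each training example fills only half of GPT-J's 2048-token context window.

```python
# Illustrative chunking: one long stream of token ids is split into
# fixed-length blocks; anything past the last full block is dropped.
def group_texts(token_ids, block_size=2048):
    total = (len(token_ids) // block_size) * block_size
    return [token_ids[i : i + block_size] for i in range(0, total, block_size)]

stream = list(range(5000))
print(len(group_texts(stream, block_size=1024)))  # 4 blocks of 1024 tokens
print(len(group_texts(stream, block_size=2048)))  # 2 blocks of 2048 tokens
```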

Hello, I am trying to finetune GPT-J-6B. I followed the instructions provided in the documentation, but I get this error. I tried changing batch_size=1 and gradient_accumulation_steps=4. Any idea...
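
For reference, a hedged sketch of the memory-related Trainer settings being tried; the argument names are standard `transformers.TrainingArguments` fields, but the values come from the report above and are not a verified fix.

```python
from transformers import TrainingArguments

# Illustrative values only; output_dir is a placeholder.
args = TrainingArguments(
    output_dir="gptj-finetune",
    per_device_train_batch_size=1,   # smallest micro-batch per GPU
    gradient_accumulation_steps=4,   # effective batch of 4 per device
    gradient_checkpointing=True,     # trade compute for activation memory
    fp16=True,                       # half precision to reduce memory further
)
```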

![image](https://user-images.githubusercontent.com/47894192/236276560-049a0013-0937-4891-a433-1bd61f5863a1.png) Getting gradient overflow and a skipped step every 2 or so steps. Training the 13B LLaMA model on 7 A100s with a context window of 512. Below is the command line...
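
A common mitigation on A100s (a suggestion, not the repo's documented fix) is to run DeepSpeed in bf16 instead of fp16, since bf16 needs no dynamic loss scaling and so avoids the overflow/skipped-step messages. A minimal sketch of such a config, written from Python:

```python
import json

# Hedged sketch of a DeepSpeed config using bf16; the ZeRO stage and "auto"
# values are illustrative and should match whatever the existing config uses.
ds_config = {
    "bf16": {"enabled": True},            # replaces the "fp16" section
    "zero_optimization": {"stage": 2},
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

with open("ds_config_bf16.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```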