Ryusaeba
@haotian-liu This is a similar issue to the one FastChat hit. The root cause is that Hugging Face introduced some bugs when dealing with added tokens. Please refer to the fix [here](https://github.com/lm-sys/FastChat/pull/2498).
@kperi Did you happen to have the 1,000 LIMA training samples?
Are you using a weight of 0.1 for the data with unknown correctness and 1.0 for the correct data? If not, could you please share more details?
Understood. Please share any updates with me. Also looking forward to your expansion to Mistral and multi-modal models.
@SunMarc Will there be a patch release for the v4.45 series?
Thank you @SunMarc. I tried with v4.45.2 and the issue still persists. Will give it a try with the latest transformers.
The issue happened with Gemma-2. I will see whether we can prepare a script to reproduce it.
@SunMarc The issue still persists. Please see the following code and help with this issue.

**CODE**
```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
)
import pdb

MODEL_PATH = '/llm_data2/huggingface/models/google/git_version/gemma-2-2b-it'
texts = ...
```
@SunMarc Understood. Thanks for the workaround. Will give it a try. We are using [run_clm.py](https://github.com/huggingface/transformers/blob/main/examples/pytorch/language-modeling/run_clm.py) for fine-tuning experiments, so it would be great if the transformers library could integrate the fix.