pawkanarek
I used the `trainer.save_pretrained` function mentioned in PR https://github.com/huggingface/transformers/pull/29388 but it didn't change anything - the trained model after saving is still exactly the same as before training.
I think that I fixed it, but I wouldn't recommend this fix to anyone, so I'm not even thinking about making a PR. It's a patch rather than a fix, but I...
@shub-kris thanks,

> @PawKanarek just to isolate the error, what happens if you run the same code on a GPU instead of TPU?

I don't have a GPU capable of training...
@moficodes I think you misunderstood my intentions. I want to save a standalone model, not just the LoRA adapter. You saved only the LoRA adapter (with `trainer.save_model()`), but I...
Thank you @shub-kris ! I will run this script on my local machine and then I will share the results. I have one question regarding your code: why do...
I think that my original method for comparing weights was broken. When I accessed the parameters with `params1 = model1.parameters()`, the method returned an iterator, and it will...
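The pitfall above can be reproduced in a few lines of plain PyTorch: `parameters()` returns a generator, so iterating it twice (e.g. once per comparison) silently yields nothing the second time, making two different models appear "equal". The toy model and the snapshot-based fix below are my own illustration, not the thread's actual script.

```python
import torch

# Toy model just to demonstrate the iterator behaviour.
model = torch.nn.Linear(4, 2)

params = model.parameters()       # this is a generator, not a list
first_pass = list(params)         # consumes the generator: weight + bias
second_pass = list(params)        # generator is exhausted -> empty!
print(len(first_pass), len(second_pass))  # 2 0

# A reliable before/after comparison snapshots the tensors instead:
before = {name: p.detach().clone() for name, p in model.named_parameters()}
with torch.no_grad():
    model.weight.add_(1.0)        # simulate a training update
changed = any(
    not torch.equal(before[name], p)
    for name, p in model.named_parameters()
)
print(changed)  # True: the snapshot correctly detects the update
```

Calling `model.named_parameters()` fresh for each pass (or materialising the iterator into a dict first) avoids the exhausted-generator trap entirely.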
> Is this happening when you're loading a saved model?

@amyeroberts No, I copied that warning message from @zorrofox's comment https://github.com/huggingface/transformers/issues/29659#issuecomment-2007343622, but I remember that I also experienced this...
@shub-kris with FSDP commented out and the batch size reduced to `batch_size=1`, I could finally see a properly fine-tuned model without any warnings. Output:

```
(v_xla) raix@t1v-n-3a1a9ef8-w-0:~/minefinetune$ cd /home/raix/minefinetune ; /usr/bin/env...
```
Hi @michaelmoynihan, I also get the `Failed to get global TPU topology` error on a TPU v4-8, so I followed your advice:

> What I would recommend first is trying `us-central1-docker.pkg.dev/tpu-pytorch-releases/docker/xla:nightly_3.8_tpuvm_20240226` and...
I tried to run this script on a TPU v3-8 and, with slight modifications to the script (I switched to the smaller Gemma-2b model because of the resource_exhausted bug), I could start my...