Jeremy Cochoy

Results: 8 comments by Jeremy Cochoy

> @kossnick Can you please try converting the model using the change in PR #524? I had exactly this problem with my "home made" model. Forcing the rank...

That may be the source of the problem, but I don't know if the previous implementation (`rank = len(graph.shape_dict[node.inputs[0]])`) would work with the current code base. Unfortunately I don't have the...

@breizhn Unfortunately I got a little overwhelmed by work and didn't make progress on the fft/ifft operators PR. I should definitely resume this. But this is only for the specification part....

That's indeed what I have done, but it seems to be insufficient to run the original LoRA configuration. I was able to reproduce the original LoRA training from the original...

Thanks. I will have a look this evening and keep you updated 👍

I tried the latest HEAD. The code does seem to run (i.e. what I got when I copy-pasted the missing functions into the file); however, I immediately get an...

> If there is only one GPU, maybe you can directly run `train_lora.py` without FSDP (in case it's FS-Data-Parallel). Besides, as mentioned [here](https://github.com/lm-sys/FastChat/blob/main/fastchat/train/train_lora.py#L97-L101), gradient checkpointing with LoRA needs a monkey patch...

I just tried to compile it on an Ubuntu VM and didn't get any error. But I noticed that the README.md was unclear. Did you create a `build`...