Frank Qi

11 comments by Frank Qi

> You have to downgrade your version of the peft package to 0.2.0 for now. The new version is broken. See #1253 I'm facing the same issue. Just notice that...

> Yes. I have encountered this problem TWICE. Each time I didn't make any other changes at all; I just installed peft==0.2.0 in place of 0.3.0.dev. And for that reason, I have...

Yes. For now, if I use a lower version of peft, it reports the same error you saw. But if I upgrade peft to the latest version (0.3.0.dev0 at the moment), I can't save...

> New idea: the training finally works now. Setting fp16=False makes training super slow and not memory-friendly. > > To avoid "ValueError: Attempting to unscale FP16 gradients",...
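For context on the error in that quote: "Attempting to unscale FP16 gradients" typically appears when the model parameters themselves are float16 while a grad scaler is in use. A minimal sketch of the usual workaround, assuming plain PyTorch autocast (on CPU with bfloat16 as a stand-in for the fp16 GPU case, not the thread's exact Trainer setup): keep the trainable weights in float32 and let autocast handle the low-precision casting of activations.

```python
import torch
import torch.nn as nn

# Sketch (assumption: plain PyTorch; CPU bfloat16 autocast stands in for the
# fp16 GPU case). The trainable parameters stay float32; autocast only casts
# the activations, which is the setup that avoids the
# "Attempting to unscale FP16 gradients" failure mode.
model = nn.Linear(8, 2)                     # parameters default to float32
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(4, 8)
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    loss = model(x).sum()                   # forward runs in low precision
loss.backward()                             # grads land in the params' dtype
opt.step()

# Master weights were never converted to half precision.
assert all(p.dtype == torch.float32 for p in model.parameters())
```

The design point is that mixed precision wants float32 master weights; converting the whole model to fp16 before training is what triggers the unscale error.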

My config:
- model size: 7B
- dataset: 10k rows of instructions
- device: 8x A100-80G, with 50 CPU cores and 500 GB of RAM

Also KILLED. Damn... so sad.

> > > my config: model size: 7B, dataset: 10k rows of instructions, device: 8x A100-80G with 50 CPU cores and 500 GB of RAM. > > Also KILLED. Damn... so...

I found that it is slow on the first load, but for me it finishes within minutes. You can also check whether your CUDA environment is working normally. BTW,...

> > I found that it will be slow if first load. But for me, it will finish within minutes. Or you can check if your cuda environment works as...

> Sorry to bother. transformers will load the model in float32 by default. Users have to set the dtype when loading, or call half(), to obtain a float16 model (in this...
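The default-dtype point in that quote can be demonstrated on any torch module. A sketch, using a plain `nn.Linear` rather than a full transformers checkpoint: parameters are created in float32, and `half()` converts them in place (with transformers you would instead pass `torch_dtype=torch.float16` to `from_pretrained`).

```python
import torch
import torch.nn as nn

# Sketch of the dtype point above, using a plain module instead of a real
# checkpoint. PyTorch creates parameters in float32 by default; calling
# .half() converts parameters and buffers to float16 in place. With
# transformers, the equivalent at load time is
# from_pretrained(..., torch_dtype=torch.float16).
model = nn.Linear(16, 4)
print(model.weight.dtype)   # torch.float32 (the default)

model.half()                # in-place conversion to float16
print(model.weight.dtype)   # torch.float16
```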

Same problem. My generation config: do_sample=True, beams>1 (using beam-sample). It goes back to normal when I set beams=1 (using sample). I didn't try setting do_sample to False, but it seems to work in others'...
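For reference, the two settings being compared map onto the transformers generation API, where "beams" is the `num_beams` parameter. A minimal sketch of the two configs (assuming the standard `GenerationConfig` class; no model is loaded here):

```python
from transformers import GenerationConfig

# Sketch of the two settings discussed above (assumption: standard
# transformers generation API; "beams" in the comment maps to num_beams).
# do_sample=True with num_beams > 1 selects beam-sample decoding, which the
# comment reports as problematic; num_beams=1 falls back to plain sampling.
beam_sample_cfg = GenerationConfig(do_sample=True, num_beams=4)
sample_cfg = GenerationConfig(do_sample=True, num_beams=1)

print(beam_sample_cfg.num_beams)  # 4
print(sample_cfg.num_beams)       # 1
```

Either config can then be passed to `model.generate(..., generation_config=cfg)`.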