Kingsley
Can you check that your cutoff_len is big enough?
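A quick way to check this is to count how many samples would be truncated. This is only a rough sketch: `count_truncated` is a hypothetical helper, and the whitespace split is a stand-in for the model's real tokenizer, so pass your actual tokenize function to get meaningful numbers.

```python
def count_truncated(samples, cutoff_len, tokenize=str.split):
    """Return (number of samples longer than cutoff_len, longest sample length).

    `tokenize` defaults to whitespace splitting, which is only a placeholder
    (assumption); swap in the real tokenizer for accurate token counts.
    """
    lengths = [len(tokenize(text)) for text in samples]
    too_long = sum(1 for n in lengths if n > cutoff_len)
    return too_long, max(lengths, default=0)


if __name__ == "__main__":
    demo = ["a short sample", "a much longer sample that has more tokens"]
    truncated, longest = count_truncated(demo, cutoff_len=4)
    print(f"{truncated} sample(s) exceed cutoff_len; longest has {longest} tokens")
```

If the first number is not zero, the labels at the end of those samples may be cut off during training.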
Can we retry with DeepSpeed disabled? From what I observed, peak GPU memory consumption is about 25 GB with bs=4. Dataset: your demo data repeated twice. Configs: ```yaml ### model model_name_or_path:...
Indeed, I have reproduced your case with only DeepSpeed ZeRO Stage 3. It works fine in full fine-tuning with ZeRO-3.
Can you provide the training script? @humble-gambler And what do the predictions on the training set look like?
Emmm, the lr is fine. Q1: Can you show the predictions on a part of the training set? It is an easy classification task. As we can see, the loss...
You can save several LoRA adapters, then run prediction after training. If we do not add an extra prompt like "You should output the emotion label by using the following...
Thanks for reporting this. I think something went wrong. @Luffy-ZY-Wang Hi, have you encountered this issue in your case?
> > Thanks for reporting this. I think something went wrong. [@Luffy-ZY-Wang](https://github.com/Luffy-ZY-Wang) Hi, have you encountered this issue in your case?
>
> TBH, I didn't encounter this issue in...
I don't think that's the cause. USE_AUDIO_IN_VIDEO is only triggered by `use_audio_in_video and len(audios) and len(videos)`.
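To make the condition concrete, here is a minimal sketch. `audio_in_video_active` is a hypothetical helper written only for illustration, not the function used in the codebase; it just evaluates the expression quoted above.

```python
def audio_in_video_active(use_audio_in_video, audios, videos):
    """Hypothetical helper: mirrors the quoted condition and nothing more.

    True only when the flag is set AND both audio and video inputs exist.
    """
    return bool(use_audio_in_video and len(audios) and len(videos))


if __name__ == "__main__":
    # Flag set, both modalities present: the path is taken.
    print(audio_in_video_active(True, ["a.wav"], ["v.mp4"]))
    # Flag set but no audio inputs: the path is skipped.
    print(audio_in_video_active(True, [], ["v.mp4"]))
```

So setting the flag alone changes nothing unless the sample contains both audios and videos.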