Jay Desai
Actually, it comes from using BERTScore:

```python
def evaluate(data_loader, model, tokenizer, print_samples=False, metric='bertscore_simple', device=None, **kwargs):
    """Compute scores given the predictions and gold labels"""
    if device is not None:
        model = model.to(device)
    inputs, outputs, targets...
```
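The snippet cuts off before the metric call itself; a minimal sketch of how BERTScore is typically computed with the Hugging Face `evaluate` library (the actual `bertscore_simple` helper isn't shown above, so treat this as an assumption):

```python
import evaluate

# evaluate.load("bertscore") pulls in the bert-score package, which loads its
# own scoring model (RoBERTa by default) and moves it to a GPU if one is visible.
bertscore = evaluate.load("bertscore")

results = bertscore.compute(
    predictions=["a generated sentence"],
    references=["a reference sentence"],
    lang="en",
)
print(results["f1"])
```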
The error you mentioned at https://github.com/huggingface/peft/issues/168 is about both training and inference. Do you still have the errors?

YES

Your latest error is in computing metrics. Do you have no...
Will do, thanks
Rename to temp.ipynb.
To reproduce, option 2 is to comment out the os.environ['CUDA_VISIBLE_DEVICES']='0' line in this example https://github.com/huggingface/peft/blob/main/examples/int8_training/Finetune_flan_t5_large_bnb_peft.ipynb and run it on a multi-GPU instance.
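Concretely, the change is just this (a sketch of the notebook's setup cell; the rest of the notebook stays untouched):

```python
import os

# Leaving this line commented out exposes all GPUs to the process,
# which is what reproduces the multi-GPU behavior described above.
# os.environ['CUDA_VISIBLE_DEVICES'] = '0'
```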
@dukesun99 @pacman100 is there something like inference_mode that has to be turned off, or anything missing here, that is causing this behavior?
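For context, `inference_mode` is a flag on the PEFT config; a minimal sketch of where it lives (the task type and LoRA hyperparameters here are placeholders, not the values from the notebook):

```python
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,  # placeholder task type
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    # inference_mode=False keeps the adapter weights trainable;
    # True freezes them for inference-only use.
    inference_mode=False,
)
```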
will try it out, thanks
Did you guys start training Cerebras-GPT? Any checkpoints that you can share?
```
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
=========================================================
./run.py FAILED
---------------------------------------------------------
Failures:
---------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time       : 2023-09-22_18:11:04
  host       : xxx
  rank       : 0 (local_rank: 0)
  exitcode   : -11 (pid:...
```