winca
> Hi! I cannot reproduce the error; on our H100 machine, I can run ` torchrun --rdzv-endpoint=localhost:0 --rdzv-id=111223 --nnodes 1 --nproc_per_node 8 --rdzv-backend=c10d recipes/finetuning/finetuning.py --enable_fsdp --dataset alpaca_dataset --model_name...
Only the original "meta-llama/Meta-Llama-3-8B-Instruct" or "meta-llama/Meta-Llama-3-8B" cannot be used; they fail with the message "Meta-Llama-3-8B does not appear to have a file named config.json". "Llama3/Meta-Llama-3-8B-Instruct-hg" is the checkpoint converted to Hugging Face format...
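For context, llama-recipes loads `--model_name` through Hugging Face `from_pretrained`, so the path has to contain `config.json` (i.e. an already-converted checkpoint); a raw Meta download with only `consolidated.*.pth` and `params.json` produces exactly this error. A minimal sketch, assuming the converted local folder mentioned above (the path is the one from this comment and may differ on your machine):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical local path: a checkpoint already converted to Hugging Face format,
# i.e. a folder containing config.json, tokenizer files, and the weight shards.
model_name = "Llama3/Meta-Llama-3-8B-Instruct-hg"

model = AutoModelForCausalLM.from_pretrained(model_name)   # loads fine
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Pointing model_name at a raw Meta download (consolidated.00.pth + params.json,
# no config.json) raises: "... does not appear to have a file named config.json".
```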
> I encountered the "8B does not appear to have a file named config.json" error before, and I think I solved it with a complete reinstall from the latest main: `pip...
OpenGVLab/InternVL-Chat-V1-5-Int8 hits the same problem with this version; please fix it.
> > Hello, is 1.5 supported now?
> >
> > Yes, it is supported. If you want to use it now you can build it yourself, or wait for our release on May 8.

How can I run it on multiple GPUs? A single GPU does not have enough memory and I get an OOM error.
> I noticed that the error comes from the evaluation function, and I can reproduce your error now. The problem is that the [eval for loop is never entered](https://github.com/meta-llama/llama-recipes/blob/44b66374bec23ad77c00af4348197e6641a8d2e3/src/llama_recipes/utils/train_utils.py#L337) because len(eval_dataloader)...
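To make the failure mode concrete, here is a hedged sketch (not the repo's actual `evaluation()` code) of why an empty `eval_dataloader` breaks things: the per-batch body never runs, so the accumulated loss is never populated and the averaging step after the loop divides by `len(eval_dataloader) == 0`.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def evaluation_sketch(model, eval_dataloader):
    """Hedged sketch of the eval-loop pattern, not the repo's exact code."""
    eval_loss = 0.0
    for step, batch in enumerate(eval_dataloader):
        # With an empty dataloader this body is never entered,
        # so eval_loss never accumulates anything.
        with torch.no_grad():
            eval_loss += model(batch[0]).mean().item()
    # len(eval_dataloader) == 0 makes this a ZeroDivisionError here;
    # with tensor arithmetic it would instead surface as NaN/inf metrics.
    return eval_loss / len(eval_dataloader)

# An eval split smaller than (batch_size * world_size) can leave a rank's sharded
# dataloader empty; an empty dataset reproduces the same condition locally.
empty_loader = DataLoader(TensorDataset(torch.empty(0, 4)), batch_size=8)
print(len(empty_loader))                                 # 0
evaluation_sketch(torch.nn.Linear(4, 4), empty_loader)   # ZeroDivisionError
```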