陈越 (Chen Yue)
facing the same issue. my code works fine with flan-t5 but raises this error with t5-base ``` Traceback (most recent call last): File "run_question_answering.py", line 898, in main() File "/home/ychen/anaconda3/envs/py38/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py",...
> facing the same issue. > > my code works fine with flan-t5 but raises this error with t5-base > > ``` > Traceback (most recent call last): > File...
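I'm running into the same problem. It looks like this parameter is treated as unused by the find_unused_parameters mechanism, even though it actually receives a gradient.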
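One way to check the find_unused_parameters hypothesis above is to run a single forward/backward pass without DDP and list the trainable parameters whose `.grad` is still `None`. The sketch below is only an illustration: the `Toy` module and the helper `params_without_grad` are hypothetical names, not part of run_question_answering.py, and it does not reproduce the t5-base error itself.

```python
import torch
from torch import nn


def params_without_grad(named_params):
    """Return names of trainable parameters whose .grad is still None."""
    return [name for name, p in named_params if p.requires_grad and p.grad is None]


class Toy(nn.Module):
    """Hypothetical stand-in model: one layer is never used in forward()."""

    def __init__(self):
        super().__init__()
        self.used = nn.Linear(4, 2)
        self.unused = nn.Linear(4, 2)  # deliberately skipped in forward()

    def forward(self, x):
        return self.used(x)


model = Toy()
loss = model(torch.randn(3, 4)).sum()
loss.backward()

# Parameters listed here took no part in this backward pass.
print(params_without_grad(model.named_parameters()))
```

Running the same helper on the real model after one training step (in a single process, outside DDP) would show which parameters never receive a gradient. If the parameter named in the error is not in that list, that would be consistent with the guess that it is flagged as unused while still getting a gradient.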
Yes, I set os.environ['HF_EVALUATE_OFFLINE'] = '1', but loading meteor is still quite slow. It seems to take a long time to verify that the nltk data is up to date.
> I also can't seem to import Llama 3.1 models that I downloaded from Hugging Face, not sure if it's related? > > ``` > [[email protected] demo-llm]$ docker exec -it...
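> Thanks for the reply, but I tried running inference on only the last 1k tokens, and the accuracy was close to what I get with the full input. That doesn't seem reasonable. I'm curious which tasks show this behavior. Could you share?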