Zhangyue Yin
I can't reproduce Performer's result on the pathfinder32_hard task either. My best eval result is only 50.47%. My training shell script is as follows: `PYTHONPATH="$(pwd)":"$PYTHON_PATH" python lra_benchmarks/image/train.py \ --config=lra_benchmarks/image/configs/pathfinder32/performer_base.py \ --model_dir=./tmp/pathfinder_F...
Thank you for your solution, it works. If you also want the training speed and GPU memory advantages, replace `eager` with "kernels-community/vllm-flash-attn3". Implementation details: https://huggingface.co/kernels-community/vllm-flash-attn3
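For anyone unsure where that swap goes, here is a minimal sketch. It assumes the setting in question is the `attn_implementation` argument of Hugging Face transformers' `from_pretrained`, which the post above does not state explicitly. The helper name `attention_kwargs` and the `model_id` placeholder are made up for illustration.

```python
# Sketch only: assumes the value being swapped is the `attn_implementation`
# argument of Hugging Face transformers' `from_pretrained` (an assumption,
# the post above only says to replace "eager").
EAGER = "eager"
FLASH_ATTN3_KERNEL = "kernels-community/vllm-flash-attn3"  # name taken from the post


def attention_kwargs(use_kernel: bool) -> dict:
    """Return the keyword argument that selects the attention backend.

    Hypothetical helper, not part of any library.
    """
    return {"attn_implementation": FLASH_ATTN3_KERNEL if use_kernel else EAGER}


if __name__ == "__main__":
    print(attention_kwargs(use_kernel=True))
    # Hypothetical usage (needs `transformers`, the `kernels` package, and a
    # supported GPU; `model_id` is a placeholder you must supply):
    # from transformers import AutoModelForCausalLM
    # model = AutoModelForCausalLM.from_pretrained(model_id, **attention_kwargs(True))
```

Keeping the backend name in one place makes it easy to fall back to `eager` if the kernel can't be loaded on your machine.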