Weikuan Wang
Results: 2 issues of Weikuan Wang
---train=True should be --train=True?
Hi, I found that the original script cannot handle large models with long contexts effectively, since it uses multiprocessing to load an entire model onto a single GPU. I also...