Weikuan Wang

Results 2 issues of Weikuan Wang

Hi, I found that the original script cannot handle large models with long contexts effectively, since it uses multiprocessing to load an entire model onto a single GPU. I also...
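To illustrate why loading a whole model on one GPU breaks down at long context, here is a rough back-of-the-envelope sketch. It is not taken from the script in question: the helper names (`weights_gib`, `kv_cache_gib`, `fits`) and the model dimensions are hypothetical, chosen only to show the arithmetic. It counts fp16 weights plus the key/value cache and ignores activations, framework overhead, and fragmentation.

```python
# Rough memory estimate for a decoder-only transformer at inference time.
# All model dimensions below are illustrative assumptions, not measurements.

GIB = 2**30


def weights_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory for the model weights (default: 2 bytes per parameter, fp16/bf16)."""
    return n_params * bytes_per_param / GIB


def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int, seq_len: int,
                 batch_size: int = 1, bytes_per_value: int = 2) -> float:
    """Memory for the key/value cache: one K and one V tensor per layer."""
    n_values = 2 * n_layers * n_kv_heads * head_dim * seq_len * batch_size
    return n_values * bytes_per_value / GIB


def fits(total_gib: float, gpu_gib: float, n_gpus: int = 1) -> bool:
    """Optimistic check that assumes memory can be split evenly across GPUs."""
    return total_gib <= gpu_gib * n_gpus


# Hypothetical 70B-parameter model with a 128K-token context.
total = weights_gib(70e9) + kv_cache_gib(
    n_layers=80, n_kv_heads=8, head_dim=128, seq_len=131072
)

print(f"total: {total:.1f} GiB")
print("fits on one 80 GiB GPU:  ", fits(total, gpu_gib=80, n_gpus=1))
print("fits across four of them:", fits(total, gpu_gib=80, n_gpus=4))
```

Under these assumptions the weights alone exceed a single 80 GiB device, and the cache grows linearly with sequence length, so one process per GPU with a full model copy cannot work; the model has to be sharded across devices instead.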