Results 3 issues of XinliYu

I found that the special treatment of the `citeseer` dataset in the `load_data` function is unnecessary and confusing. The whole function can be simplified as follows, which could also possibly better...

We fine-tuned Alpaca on a single node with torchrun, and on multiple nodes with DeepSpeed. I am following the "demo" parameters: `temperature=0.7 top_p=0.9 do_sample=True num_beams=1 max_new_tokens=600`. We evaluate the...
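For readers unfamiliar with the sampling settings quoted in this issue, here is a minimal pure-Python sketch of what `temperature` and `top_p` do when `do_sample=True`. It is an illustration only, not the code path used by any particular library: the function name `sample_top_p` and the toy logits are hypothetical, and `num_beams` and `max_new_tokens` are not modeled.

```python
import math
import random


def sample_top_p(logits, temperature=0.7, top_p=0.9, rng=None):
    """Sample one token index using temperature scaling and nucleus (top-p) filtering.

    Hypothetical helper for illustration; defaults mirror the "demo" values
    quoted in the issue.
    """
    rng = rng or random.Random()

    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the smallest set of most-probable tokens whose cumulative
    # probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Sample among the kept tokens; random.choices normalizes the weights.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a 4-token vocabulary
    print(sample_top_p(toy_logits, rng=random.Random(0)))
```

With `num_beams=1` and sampling enabled, generation would repeat a step like this once per new token, up to the `max_new_tokens` limit.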

Dolly DeepSpeed fine-tuning works on a single GPU node, but it hangs on two GPU nodes. Here are the command lines I am using. When using a single node, simply set...