Arindam Mitra
Hi @VictorSanh , I was wondering if there is any plan to publish the PyTorch training code.
It seems that the missing datasets are from the mix t0_eval_score_eval and t0_train_score_eval.
Tried MAX_JOBS=4. It failed as well. MAX_JOBS=1 timed out after 1h30m.
@tridao ? Any pointers? Thanks in advance!
I think there are a wide variety of factors in play here. For me, I could not build the Docker image with flash-attn on an A100, but (someone) was able to...
Did you find any solution?
I tried to build the Docker image on a different machine, and it worked.
+1 This data is really valuable. If you could host a data dump, that would be really helpful.
FYI, there is a difference between the GPT4All and Alpaca datasets and ShareGPT. ShareGPT is a "multi-turn" dialogue dataset generated from diverse users, while the others consist of a single interaction between a human and GPT.
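To illustrate the structural difference described above, here is a minimal sketch of the two record shapes. The field names (`conversations`, `instruction`, etc.) are illustrative assumptions, not the exact schemas of the ShareGPT or Alpaca releases.

```python
# ShareGPT-style record: one entry holds a multi-turn dialogue
# between a human and the model (field names are hypothetical).
sharegpt_record = {
    "id": "abc123",
    "conversations": [
        {"from": "human", "value": "What is flash attention?"},
        {"from": "gpt", "value": "A memory-efficient exact attention algorithm."},
        {"from": "human", "value": "Does it change the model output?"},
        {"from": "gpt", "value": "No, it computes the same result exactly."},
    ],
}

# Alpaca-style record: one entry is a single instruction/response pair.
alpaca_record = {
    "instruction": "Explain flash attention in one sentence.",
    "input": "",
    "output": "A memory-efficient exact attention algorithm.",
}

def num_turns(record):
    """Count dialogue turns; a single-interaction record has two."""
    if "conversations" in record:
        return len(record["conversations"])
    return 2  # one instruction + one response

print(num_turns(sharegpt_record))  # multi-turn dialogue: 4
print(num_turns(alpaca_record))    # single interaction: 2
```

Training code that assumes one shape will silently mishandle the other, which is why the multi-turn structure matters when mixing these datasets.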
Sharing the data dump is actually better.