Arindam Mitra

Results 10 comments of Arindam Mitra

Hi @VictorSanh , I was wondering if there is any plan to publish the pytorch training code.

It seems that the missing datasets are from the mix t0_eval_score_eval and t0_train_score_eval.

Tried MAX_JOBS=4. It failed as well. MAX_JOBS=1 timed out after 1h30m.

@tridao ? Any pointers? Thanks in advance!

I think there are a wide variety of factors in play here. For me, I could not build the docker with flash attn in A100, but (someone) was able to...

I tried to build the docker in a different machine and It worked.

+1 This data is really valuable. If you could host a data dump that will be really helpful.

YI, There is a difference between GPT4all, Alpacca datasets and ShareGPT. ShareGPT is a "multi-turn" dialogue dataset, generated from diverse users. While others are one "single-interaction" between Human and GPT.