Arindam Mitra
Hi @VictorSanh , I was wondering if there is any plan to publish the PyTorch training code.
It seems that the missing datasets are from the mix t0_eval_score_eval and t0_train_score_eval.
Tried MAX_JOBS=4. It failed as well. MAX_JOBS=1 timed out after 1h30m.
@tridao ? Any pointers? Thanks in advance!
I think there are a wide variety of factors in play here. For me, I could not build the Docker image with flash-attn on an A100, but (someone) was able to...
Did you find any solution?
I tried to build the Docker image on a different machine, and it worked.
+1 This data is really valuable. If you could host a data dump, that would be really helpful.
FYI, there is a difference between the GPT4All and Alpaca datasets and ShareGPT. ShareGPT is a "multi-turn" dialogue dataset generated from diverse users, while the others consist of a single interaction between a human and GPT.
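To illustrate the structural difference described above, here is a minimal sketch of the two record shapes. The field names (`conversations`, `instruction`, etc.) are illustrative assumptions, not the exact schemas of the ShareGPT or Alpaca releases.

```python
# ShareGPT-style record: one entry holds a multi-turn dialogue
# between a human and the model (field names are hypothetical).
sharegpt_record = {
    "id": "abc123",
    "conversations": [
        {"from": "human", "value": "What is flash attention?"},
        {"from": "gpt", "value": "A memory-efficient exact attention algorithm."},
        {"from": "human", "value": "Does it change the model output?"},
        {"from": "gpt", "value": "No, it computes the same result exactly."},
    ],
}

# Alpaca-style record: one entry is a single instruction/response pair.
alpaca_record = {
    "instruction": "Explain flash attention in one sentence.",
    "input": "",
    "output": "A memory-efficient exact attention algorithm.",
}

def num_turns(record):
    """Count dialogue turns; a single-interaction record has two."""
    if "conversations" in record:
        return len(record["conversations"])
    return 2  # one instruction + one response

print(num_turns(sharegpt_record))  # multi-turn dialogue: 4
print(num_turns(alpaca_record))    # single interaction: 2
```

Training code that assumes one shape will silently mishandle the other, which is why the multi-turn structure matters when mixing these datasets.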
Sharing the data dump is actually better.