Results 60 comments of orangetin

> @bonswouar FYI the specs listed in that issue are for the GPT-NeoXT-20B model. OCK also supports the Pythia-7B model which can run on as little as 9 GB VRAM....

This isn't an integration issue, as pacman100 said. See this: https://github.com/microsoft/DeepSpeed/issues/1846 — it looks like an issue with the DeepSpeed pip package; I recommend installing it via conda.
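A minimal sketch of that workaround, assuming the conda-forge channel carries a DeepSpeed build (channel and package name are assumptions, not confirmed in the thread):

```shell
# Remove the pip wheel suspected to be broken
pip uninstall -y deepspeed

# Reinstall from conda-forge instead (assumed channel/package name)
conda install -c conda-forge deepspeed

# Sanity-check the environment afterwards
python -c "import deepspeed; print(deepspeed.__version__)"
```

Run `ds_report` afterwards if you want a fuller compatibility check of the installed ops.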

I can repro this, so let me know if you need more logs. I'm trying to debug this myself too.

Thanks @ekkkkki for the context. I must have missed this. @muellerzr is this enough to go on or would you like more details?

File output:

```shell
Traceback (most recent call last):
  File "/mnt/camelot/Team4/fall23/llama2/Book/OpenChatKit-main/training/dist_clm_train.py", line 478, in <module>
    main()
  File "/mnt/camelot/Team4/fall23/llama2/Book/OpenChatKit-main/training/dist_clm_train.py", line 397, in main
    init_communicators(args)
  File "/mnt/camelot/Team4/fall23/llama2/Book/OpenChatKit-main/training/comm/comm_utils.py", line 103, in init_communicators
    _PIPELINE_PARALLEL_COMM = NCCLCommunicator(_PIPELINE_PARALLEL_RANK,...
```

> Encryption with ECC is not supported right now. Is it planned?

See [this comment](https://github.com/togethercomputer/OpenChatKit/issues/45#issuecomment-1560404201) to set up the env on Mac.

> yeah! It's 40 GB, but I have 8 of them. Can I use them together to avoid this issue? > > The problem occurs after loading both model and...

Shall we close this issue and open a new one for the GPU-memory accumulation? I believe this one is solved.

> So I also think something might have gone wrong before the gpu was called

@nickvazz @sherlockzym You're right. The model never got to your GPU, as @lokikl points out,...