Charles Srisuwananukorn
As I mentioned in the PR, both the pretrained model and datasets can be quite large.

```
$ du -sh data/* pretrained/GPT-NeoX-20B/
172G    data/OIG
238M    data/OIG-moderation
38G     data/wikipedia-3sentence-level-retrieval-index
39G     pretrained/GPT-NeoX-20B/...
```
If you're trying to reproduce the `GPT-NeoXT-Chat-Base-20B` model, you can download the dataset by running `python data/OIG/prepare.py` from the root of the repository. We plan to add more documentation about...
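In the meantime, a quick sketch of what that looks like (hypothetical session; the ~172G figure is from the `du` output above, so your exact size may differ):

```
# From the root of the OpenChatKit repository:
$ python data/OIG/prepare.py

# Afterwards, the dataset directory should be roughly the size reported above:
$ du -sh data/OIG
172G    data/OIG
```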
Reproducing `GPT-NeoXT-Chat-Base-20B` requires quite a lot of resources.

1. You'll need around 1TB of disk space (a quick way to check free space is shown below). The datasets take about 200GB.

```
$ du -hs data/*
172G    data/OIG
238M...
```
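To confirm you have enough room before starting, a simple check (standard coreutils, nothing OpenChatKit-specific):

```
# Show free space on the filesystem holding the repo; look for ~1TB available.
$ df -h .
```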
> Would be nice if the README can add the prerequisites for setting up the environment.

I'll update the README. I believe these packages are only available on Linux. Windows...
Or it could also be your git configuration. Could you let me know if this command works (as @orangetin suggested)?

```
git clone https://huggingface.co/datasets/laion/OIG /www/wwwroot/OpenChatKit/data/OIG/files
```
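If the clone stalls or only pulls down small pointer files, Git LFS may not be set up; Hugging Face repositories store large files through LFS. A sketch of what I'd try first (package name may vary by distro):

```
# Install and initialize Git LFS before cloning (apt shown; use your package manager).
sudo apt-get install git-lfs
git lfs install
git clone https://huggingface.co/datasets/laion/OIG /www/wwwroot/OpenChatKit/data/OIG/files
```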
Can you tell me more about what this patch does?
Thanks for the PR! I'll be taking a look at this soon.
Thanks! Will look.
Thanks, @LorrinWWW. Let's add this to the training README?
> It'd be really cool if the minimum requirements of the model (size on disk for data set, vram requirements) on the readme, that would save a lot of people...