Charles Srisuwananukorn

56 comments by Charles Srisuwananukorn

@DanFu09 can you take a look?

Actually, @mauriceweber, can you take a look?

We train this model on 8x A100 80GB GPUs. I'll update the README.

> I... submit a request for a mini model to do sanity checks on local systems and...

About an hour per 100 steps. Usually, we fine-tune for a couple days.
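For a rough sense of scale, the arithmetic can be sketched as follows. This is only a back-of-the-envelope estimate: it assumes "a couple days" means about 48 hours, which the comment does not state, and the function name `estimated_steps` is hypothetical.

```python
# Back-of-the-envelope estimate only.
# Assumption: "about an hour per 100 steps" is taken as exactly 1.0 hour.
HOURS_PER_100_STEPS = 1.0


def estimated_steps(total_hours: float,
                    hours_per_100_steps: float = HOURS_PER_100_STEPS) -> int:
    """Return the approximate number of training steps completed in total_hours."""
    return round(total_hours / hours_per_100_steps * 100)


if __name__ == "__main__":
    # Assumption: "a couple days" is read as roughly 48 hours.
    print(estimated_steps(48))
```

Under those assumptions, a two-day run works out to a few thousand steps; actual throughput will vary with hardware and configuration.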

Thank you for the PR, @shirayu! This looks great. I'd like to review a couple things tomorrow before merging. Please stay tuned.

After some research, many projects seem to be recommending `mamba` for faster installation (see [this article](https://pythonspeed.com/articles/faster-conda-install/) for more details). I just tested it, and it does seem much faster. Installing...

Training ran fine. I'll update the README to suggest using mamba.

I believe it also does not work on macOS. These packages require NVIDIA GPUs, which most Macs do not have.

@LorrinWWW, this is a version of your script for sharding the base model. Could you please take a look?

I've seen this issue when running out of GPU RAM. Unfortunately, the model requires an A100 80GB right now. Are you using an A100 40GB?
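One way to check which card is installed is to ask `nvidia-smi` for each GPU's name and total memory. The sketch below is illustrative, not part of the project: the helper names and the `MIN_MEMORY_MIB` threshold are assumptions chosen to separate 40GB-class from 80GB-class cards, and the exact MiB value a card reports depends on the driver.

```python
import subprocess

# Assumed cutoff (hypothetical): anything reporting less than this many MiB
# is treated as smaller than an 80GB-class card.
MIN_MEMORY_MIB = 79_000


def parse_gpu_memory(csv_text: str) -> list[tuple[str, int]]:
    """Parse 'name, memory_in_MiB' lines into (name, MiB) pairs."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split on the last comma so a comma inside the name would not break parsing.
        name, mem = line.rsplit(",", 1)
        gpus.append((name.strip(), int(mem.strip())))
    return gpus


def undersized_gpus(gpus: list[tuple[str, int]],
                    min_mib: int = MIN_MEMORY_MIB) -> list[tuple[str, int]]:
    """Return the GPUs whose total memory is below min_mib."""
    return [(name, mem) for name, mem in gpus if mem < min_mib]


def query_gpus() -> str:
    """Run nvidia-smi and return one 'name, MiB' line per GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    try:
        too_small = undersized_gpus(parse_gpu_memory(query_gpus()))
    except (FileNotFoundError, subprocess.CalledProcessError):
        print("nvidia-smi is not available on this machine")
    else:
        for name, mem in too_small:
            print(f"{name}: {mem} MiB is below the assumed {MIN_MEMORY_MIB} MiB cutoff")
```

If every GPU clears the cutoff and the error persists, the cause is likely something other than card size.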