joydchh

Results 5 comments of joydchh

> We train this model on 8x A100 80GB GPUs. I'll update the README. > > > I... submit a request for a mini model to do sanity checks on...

> Anyone know how to issue this exception? ![image](https://user-images.githubusercontent.com/62139204/226175117-4c0bb2e4-5316-4f92-84db-8e2d0c2f3be5.png)
>
> I have tried use_new_zipfile_serialization=False, but it doesn't work: ![image](https://user-images.githubusercontent.com/62139204/226175299-439d8c26-28b3-4aa1-a131-0ee960268aad.png)

Did you find a way to fix this?

I found the problem was caused by a corrupt tmp weights file. You can check whether you have something similar. Delete the 3 related files, and run prepare.py again to...
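In case it helps anyone checking for the same thing, here is a rough sketch of how one might spot a corrupt weights file before rerunning prepare.py. It assumes the weights were saved in PyTorch's zip-based format (the default that `use_new_zipfile_serialization` controls), so a truncated download shows up as an unreadable zip archive. The function name `find_corrupt_checkpoints` and the `*.pt` pattern are my own placeholders, not something from the repo, so adjust them to wherever your tmp weights actually live.

```python
import zipfile
from pathlib import Path


def find_corrupt_checkpoints(directory, pattern="*.pt"):
    """Return files matching `pattern` that are not readable zip archives.

    Hypothetical helper: the directory and file pattern are assumptions,
    not paths taken from the project.
    """
    corrupt = []
    for path in sorted(Path(directory).glob(pattern)):
        # A truncated or partially written file has no valid zip directory.
        if not zipfile.is_zipfile(path):
            corrupt.append(path)
            continue
        try:
            with zipfile.ZipFile(path) as archive:
                # testzip() returns the first member with a bad CRC, or None.
                if archive.testzip() is not None:
                    corrupt.append(path)
        except zipfile.BadZipFile:
            corrupt.append(path)
    return corrupt
```

Anything this returns is a candidate for deletion before running the preparation step again. Note that it reads every file in full, so it can be slow on large checkpoints.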

> There should be log messages during training. I feel the rank 0 was down so the other three were waiting for it. Can you post the full log message?...

> @joydchh The rank 0 crashed when it tried to read the dataset. Can you check if all data files are prepared in "/data/OpenChatKit/training/../data/OIG/files/"?
>
> And I noticed you...