icetk

Tokenizer can't be hashed when using datasets.map function

Open dumpmemory opened this issue 1 year ago • 2 comments

The tokenizer can't be hashed when using the datasets.map function with num_proc > 1.

https://github.com/THUDM/ChatGLM-6B/issues/286

dumpmemory, Mar 31 '23 08:03

Same problem. Can anyone help solve this?

danyang-rainbow, Apr 09 '23 15:04

https://stackoverflow.com/questions/55344376/how-to-import-protobuf-module It seems that protobuf is not picklable. I will look into it in the next few days.

Sleepychord, Apr 10 '23 02:04
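
Until a fix lands, one possible workaround is sketched below. It assumes the failure comes from an unpicklable member held by the tokenizer (as suggested above), and that datasets.map serializes the mapped function and everything it references with a pickle-compatible serializer. `PicklableTokenizer`, `_load_backend`, and the `vocab.model` path are hypothetical names used only for illustration; they are not part of icetk. A `threading.Lock` stands in for the protobuf-backed object simply because it also cannot be pickled.

```python
import pickle
import threading


class PicklableTokenizer:
    """Hypothetical wrapper: excludes an unpicklable member from the
    pickled state and rebuilds it lazily on first use."""

    def __init__(self, vocab_path):
        self.vocab_path = vocab_path  # plain string, picklable
        self._backend = None          # created on first access

    def _load_backend(self):
        # Stand-in for loading the real, unpicklable backend.
        # A lock is used here only because it cannot be pickled either.
        return threading.Lock()

    @property
    def backend(self):
        if self._backend is None:
            self._backend = self._load_backend()
        return self._backend

    def __getstate__(self):
        # Copy the instance state but never ship the unpicklable member.
        state = self.__dict__.copy()
        state["_backend"] = None
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)


tok = PicklableTokenizer("vocab.model")  # hypothetical path, never opened
_ = tok.backend                          # force the backend to exist
clone = pickle.loads(pickle.dumps(tok))  # works: backend is excluded
```

With this pattern each worker process would rebuild its own backend the first time it tokenizes. A simpler alternative with the same effect is to construct the tokenizer inside the function passed to datasets.map rather than capturing it from the enclosing scope.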