Results: 10 issues by matrix

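When creating the pretraining data, the code calls article['input_ids'].append(0), so every article ends with 0, but in the demo the EOS token is 102. If 102 never appears in the pretraining data, the model is unlikely to ever produce 102 ([sep]) as the end token, yet [sep] does show up when running the demo. Should article['input_ids'].append(0) actually be article['input_ids'].append(102)?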
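
The mismatch described above can be illustrated with a minimal sketch. It is not the project's actual preprocessing code: `build_article`, `contains_token`, and the sample token ids are hypothetical, and only the values 0 and 102 come from the issue itself.

```python
# Hypothetical sketch of the EOS mismatch raised in the issue.
# Only the ids 0 and 102 come from the issue; everything else is illustrative.

APPENDED_ID = 0   # value currently appended in the preprocessing step
DEMO_EOS_ID = 102  # EOS id used by the demo ([sep], per the issue)


def build_article(token_ids, end_id):
    """Return an article dict whose input_ids end with end_id."""
    article = {"input_ids": list(token_ids)}
    article["input_ids"].append(end_id)
    return article


def contains_token(articles, token_id):
    """True if any article contains token_id anywhere in its input_ids."""
    return any(token_id in a["input_ids"] for a in articles)


# Arbitrary placeholder token ids standing in for tokenized articles.
sample_docs = [[11, 12, 13], [21, 22]]

with_zero = [build_article(doc, APPENDED_ID) for doc in sample_docs]
with_sep = [build_article(doc, DEMO_EOS_ID) for doc in sample_docs]

# With append(0), the demo's EOS id never occurs in the training data.
print("append(0)   contains 102:", contains_token(with_zero, DEMO_EOS_ID))
# With append(102), every article ends with the id the demo stops on.
print("append(102) contains 102:", contains_token(with_sep, DEMO_EOS_ID))
```

A check like `contains_token` on the real pretraining set would show whether 102 reaches the data some other way, for example if the tokenizer already adds it.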

Can we train with parallel computing, multithreading, or multiprocessing to speed up training? Thank you.

feature request

### System Info When I use the following code on a TPU VM and call model.generate() for inference, the speed is very slow. It seems that the TPU is not being used. What...

### System Info from cyg_conversation import default_conversation ModuleNotFoundError: No module named 'cyg_conversation' ### Information - [ ] The official example scripts - [ ] My own modified scripts ### Tasks...

bug

For the 30B LLaMA model, can serving be supported on a TPU v3-8 (128 GB) by configuring mesh_dims? I tried 8,1 and 4,1, but neither seems to work.
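
As rough context for this question, the back-of-the-envelope sketch below estimates weight memory per device. It assumes a nominal 30e9 parameters, an even split of the 128 GB across 8 devices, and weights only (no activations, KV cache, or compiler overhead). How a given mesh_dims value maps to a shard count is not stated in the issue, so the shard counts here are hypothetical.

```python
# Rough weight-only memory estimate; all figures are stated assumptions.

N_PARAMS = 30e9        # nominal "30B" parameter count (assumption)
TOTAL_HBM_GB = 128     # total device memory quoted in the issue
N_DEVICES = 8          # devices on a v3-8
HBM_PER_DEVICE_GB = TOTAL_HBM_GB / N_DEVICES


def weight_gb_per_device(n_params, bytes_per_param, n_shards):
    """Weight memory per device (GB) if weights are split evenly over n_shards."""
    return n_params * bytes_per_param / n_shards / 1e9


for dtype_name, nbytes in [("float32", 4), ("bfloat16", 2)]:
    for shards in (1, 4, 8):  # hypothetical shard counts
        need = weight_gb_per_device(N_PARAMS, nbytes, shards)
        fits = need < HBM_PER_DEVICE_GB
        print(f"{dtype_name:9s} shards={shards}: {need:6.1f} GB/device, "
              f"weights fit in {HBM_PER_DEVICE_GB:.0f} GB: {fits}")
```

Even where the weights alone fit, the remaining headroom per device may be too small for inference buffers, so the estimate is a lower bound, not a guarantee.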

https://huggingface.co/decapoda-research/llama-13b-hf How can these weights on HF be converted into the EasyLM format?

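I'd like to ask: at roughly what loss value has training more or less converged? Could you share your own training results for reference?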

I would like to ask whether the 100k sharegpt4v samples were generated with the OpenAI GPT-4V interface or the Azure GPT-4V interface, and what is the version of the...