
Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo

125 UER-py issues, sorted by recently updated

While training BERT on BookCorpus + English Wikipedia, with batch_size=5120, warmup=0.1, learning_rate=4e-4, deep_init enabled, no mixed precision, and steps=240k (computed for 40 epochs), the loss suddenly increased at around step 127k, performance degraded, and the model stopped learning afterwards. What might be causing this? (Log shown below.) ![training log](https://user-images.githubusercontent.com/26522478/183007327-7579d3a3-81ff-474a-8c56-22ae5a953a45.png) After that point the model never learned again. ![loss curve](https://user-images.githubusercontent.com/26522478/183007415-31c88792-fc19-4702-ac51-253e259ce835.png)
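For reference, warmup=0.1 with 240k total steps means the learning rate ramps up to its 4e-4 peak at step 24k and decays afterwards; below is a minimal PyTorch sketch of such a linear warmup-then-decay schedule (an illustration of the described settings, not UER-py's actual training loop):

```python
import torch

TOTAL_STEPS = 240_000
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # warmup=0.1 -> peak LR at step 24k

model = torch.nn.Linear(768, 768)  # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=4e-4)

def lr_lambda(step: int) -> float:
    # Linear warmup to the peak LR, then linear decay to zero.
    if step < WARMUP_STEPS:
        return step / max(1, WARMUP_STEPS)
    return max(0.0, (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
```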

![image](https://user-images.githubusercontent.com/70731194/177521512-91a75186-415e-4f7f-80c4-af01ff5bd255.png) Pre-trained model: https://huggingface.co/uer/chinese_roberta_L-12_H-768 Training command: `python finetune/run_classifier_siamese.py --pretrained_model_path sbert_base/pytorch_model.bin --vocab_path models/google_zh_vocab.txt --config_path models/sbert/base_config.json --train_path datasets/ChineseTextualInference/train.tsv --dev_path datasets/ChineseTextualInference/dev.tsv --learning_rate 5e-5 --epochs_num 5 --batch_size 64` On Hugging Face, someone used this code to train a text-similarity model with good results, but following the same procedure I cannot reproduce the model. Hugging Face page: https://huggingface.co/uer/sbert-base-chinese-nli. Any advice would be greatly appreciated!
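For context, a siamese sentence-similarity model of this kind encodes each sentence separately and compares the resulting embeddings; below is a minimal sketch of the general idea (mean pooling plus cosine similarity, assumed names, not the script's exact code):

```python
import torch
import torch.nn.functional as F

def mean_pool(hidden: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    # Average token embeddings, ignoring padding positions.
    mask = mask.unsqueeze(-1).float()
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

def similarity(hidden_a, mask_a, hidden_b, mask_b):
    # hidden_*: encoder outputs [batch, seq_len, hidden]; mask_*: [batch, seq_len].
    emb_a = mean_pool(hidden_a, mask_a)
    emb_b = mean_pool(hidden_b, mask_b)
    return F.cosine_similarity(emb_a, emb_b, dim=-1)
```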

During training, the tgt of the loss function is torch.arange(batch_size), which is suitable for unsupervised training, but this overwrites the labels of the supervised training set.
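That torch.arange target is the standard in-batch contrastive setup, where each example's positive is the same-index example in the other tower; a minimal sketch (assuming a similarity matrix `sim` of shape [batch, batch]):

```python
import torch
import torch.nn.functional as F

batch_size = 8
# sim[i, j]: similarity between query i and candidate j (placeholder values).
sim = torch.randn(batch_size, batch_size)

# Each query's correct match is the candidate at the same index,
# so the target is simply 0..batch_size-1 -- no dataset labels are used.
tgt = torch.arange(batch_size)
loss = F.cross_entropy(sim, tgt)
```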

I used convert_bert_from_uer_to_huggingface to convert a pre-trained UER model bin file into a Hugging Face model bin file, but it does not generate a directory that can be used with transformers. When loading it,...
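As a workaround, transformers can typically load a model from any directory containing the converted weights alongside a matching config and vocab; a hedged sketch, assuming those files already exist and follow the usual Hugging Face layout:

```python
import os
import shutil
from transformers import BertModel

# Assumption: the converted weights and a matching config/vocab are on disk.
os.makedirs("converted_model", exist_ok=True)
shutil.copy("pytorch_model.bin", "converted_model/pytorch_model.bin")
shutil.copy("config.json", "converted_model/config.json")
shutil.copy("vocab.txt", "converted_model/vocab.txt")

# transformers loads the model from the assembled directory.
model = BertModel.from_pretrained("converted_model")
```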

Using the script-converted mengzi-t5-base model from Hugging Face raises an error:
```
RuntimeError: Error(s) in loading state_dict for Model: size mismatch for embedding.word_embedding.weight: copying a param with shape torch.Size([32128, 768]) from checkpoint, the shape in current model is torch.Size([32028, 768])...
```
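The mismatch (32128 vs. 32028) suggests the checkpoint's vocabulary size differs from the vocabulary the model was configured with; a minimal diagnostic sketch (the file name is an assumption):

```python
import torch

# Assumption: "pytorch_model.bin" is the converted checkpoint.
state = torch.load("pytorch_model.bin", map_location="cpu")

# The first dimension is the vocabulary size baked into the checkpoint
# (32128 here); the config/vocab being loaded apparently expects 32028.
print(state["embedding.word_embedding.weight"].shape)
```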

Hello, is conversion of ELECTRA pre-trained models to the UER format supported? Thanks. ELECTRA pre-trained models: https://github.com/ymcui/Chinese-ELECTRA

Fine-tuning the siamese network with the command below raises an error. `python finetune/run_classifier_siamese.py --pretrained_model_path chinese_roberta/pytorch_model.bin --vocab_path chinese_roberta/vocab.txt --config_path chinese_roberta/config.json --train_path datasets/ChineseTextualInference/train.tsv --dev_path datasets/ChineseTextualInference/dev.tsv --learning_rate 5e-5 --epochs_num 2 --batch_size 64` The error:
```
Traceback (most recent call last):
  File "finetune/run_classifier_siamese.py", line...
```

The code from half a month ago worked fine, but the recently updated code hangs as soon as the model starts; the log is below. All four machines freeze after reaching this step.
2022-06-06 17:06:30.084 ts-3b9674b191ea46a69359406136b6418c-launcher:1076:1076 [0] NCCL INFO Launch mode Parallel
2022-06-06 17:06:30.148 [2022-06-06 17:06:30,147 INFO] Worker 0 is training ...
2022-06-06 17:06:30.149 [2022-06-06 17:06:30,149 INFO] Worker 3 is training...
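When a multi-node run stalls right after NCCL initialization, a common first diagnostic step is enabling NCCL's verbose logging; a minimal sketch (these environment variables must be set before torch.distributed/NCCL initializes, e.g. at the top of the launcher script):

```python
import os

# Assumption: set before any torch.distributed / NCCL initialization.
os.environ["NCCL_DEBUG"] = "INFO"          # verbose NCCL logs
os.environ["NCCL_DEBUG_SUBSYS"] = "INIT,NET"  # focus on init and networking
```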

Training from scratch on English Wikipedia data (3.78 million sentence pairs sampled from enwiki, with dup_factor set to 5) for 200k steps, with a batch size of 32 and a linear scheduler with warmup, acc_mlm is around 0.5. Is this normal? Is there a reference log file available? Many thanks!
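For reference, acc_mlm measures prediction accuracy on the masked positions only; a minimal sketch of how such a metric is typically computed (an illustration, not UER-py's exact code, assuming non-masked positions are marked with -100 as is a common convention):

```python
import torch

def mlm_accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    # logits: [batch, seq_len, vocab]; labels: [batch, seq_len],
    # where positions that were not masked carry the ignore value -100.
    mask = labels != -100
    preds = logits.argmax(dim=-1)
    correct = (preds[mask] == labels[mask]).sum().item()
    return correct / max(1, mask.sum().item())
```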