
Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo

125 UER-py issues, sorted by recently updated

While training BERT on BookCorpus + English Wikipedia, with batch_size=5120, warmup=0.1, learning_rate=4e-4, deep_init enabled, no mixed precision, and steps=240k (computed for 40 epochs), the loss suddenly increased at around step 127k, performance degraded, and the model stopped learning afterwards. What might be causing this? (Log shown below.) ![training log](https://user-images.githubusercontent.com/26522478/183007327-7579d3a3-81ff-474a-8c56-22ae5a953a45.png) After that point the model never learned again. ![loss curve](https://user-images.githubusercontent.com/26522478/183007415-31c88792-fc19-4702-ac51-253e259ce835.png)
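For reference, warmup=0.1 with 240k total steps means the learning rate ramps up to its 4e-4 peak at step 24k and decays afterwards; below is a minimal PyTorch sketch of such a linear warmup-then-decay schedule (an illustration of the described settings, not UER-py's actual training loop):

```python
import torch

TOTAL_STEPS = 240_000
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # warmup=0.1 -> peak LR at step 24k

model = torch.nn.Linear(768, 768)  # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=4e-4)

def lr_lambda(step: int) -> float:
    # Linear warmup to the peak LR, then linear decay to zero.
    if step < WARMUP_STEPS:
        return step / max(1, WARMUP_STEPS)
    return max(0.0, (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
```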

![image](https://user-images.githubusercontent.com/70731194/177521512-91a75186-415e-4f7f-80c4-af01ff5bd255.png) Pre-trained model: https://huggingface.co/uer/chinese_roberta_L-12_H-768 Training command: `python finetune/run_classifier_siamese.py --pretrained_model_path sbert_base/pytorch_model.bin --vocab_path models/google_zh_vocab.txt --config_path models/sbert/base_config.json --train_path datasets/ChineseTextualInference/train.tsv --dev_path datasets/ChineseTextualInference/dev.tsv --learning_rate 5e-5 --epochs_num 5 --batch_size 64` On Hugging Face, someone used this code to train a text-similarity model with good results, but following the same procedure I cannot reproduce the model. Hugging Face page: https://huggingface.co/uer/sbert-base-chinese-nli. Any advice would be greatly appreciated!
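For context, a siamese sentence-similarity model of this kind encodes each sentence separately and compares the resulting embeddings; below is a minimal sketch of the general idea (mean pooling plus cosine similarity, assumed names, not the script's exact code):

```python
import torch
import torch.nn.functional as F

def mean_pool(hidden: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    # Average token embeddings, ignoring padding positions.
    mask = mask.unsqueeze(-1).float()
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

def similarity(hidden_a, mask_a, hidden_b, mask_b):
    # hidden_*: encoder outputs [batch, seq_len, hidden]; mask_*: [batch, seq_len].
    emb_a = mean_pool(hidden_a, mask_a)
    emb_b = mean_pool(hidden_b, mask_b)
    return F.cosine_similarity(emb_a, emb_b, dim=-1)
```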

During training, the tgt of the loss function is torch.arange(batch_size), which is suitable for unsupervised training, but this overwrites the labels of the supervised training set.
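That torch.arange target is the standard in-batch contrastive setup, where each example's positive is the same-index example in the other tower; a minimal sketch (assuming a similarity matrix `sim` of shape [batch, batch]):

```python
import torch
import torch.nn.functional as F

batch_size = 8
# sim[i, j]: similarity between query i and candidate j (placeholder values).
sim = torch.randn(batch_size, batch_size)

# Each query's correct match is the candidate at the same index,
# so the target is simply 0..batch_size-1 -- no dataset labels are used.
tgt = torch.arange(batch_size)
loss = F.cross_entropy(sim, tgt)
```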

I used convert_bert_from_uer_to_huggingface to convert a pre-trained UER model bin file into a Hugging Face model bin file, but it does not generate a directory that can be used with transformers. When loading it,...
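As a workaround, transformers can typically load a model from any directory containing the converted weights alongside a matching config and vocab; a hedged sketch, assuming those files already exist and follow the usual Hugging Face layout:

```python
import os
import shutil
from transformers import BertModel

# Assumption: the converted weights and a matching config/vocab are on disk.
os.makedirs("converted_model", exist_ok=True)
shutil.copy("pytorch_model.bin", "converted_model/pytorch_model.bin")
shutil.copy("config.json", "converted_model/config.json")
shutil.copy("vocab.txt", "converted_model/vocab.txt")

# transformers loads the model from the assembled directory.
model = BertModel.from_pretrained("converted_model")
```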

Using the script-converted mengzi-t5-base model from Hugging Face raises an error:
```
RuntimeError: Error(s) in loading state_dict for Model: size mismatch for embedding.word_embedding.weight: copying a param with shape torch.Size([32128, 768]) from checkpoint, the shape in current model is torch.Size([32028, 768])...
```
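The mismatch (32128 vs. 32028) suggests the checkpoint's vocabulary size differs from the vocabulary the model was configured with; a minimal diagnostic sketch (the file name is an assumption):

```python
import torch

# Assumption: "pytorch_model.bin" is the converted checkpoint.
state = torch.load("pytorch_model.bin", map_location="cpu")

# The first dimension is the vocabulary size baked into the checkpoint
# (32128 here); the config/vocab being loaded apparently expects 32028.
print(state["embedding.word_embedding.weight"].shape)
```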

Hello, is conversion of ELECTRA pre-trained models to the UER format supported? Thanks. ELECTRA pre-trained models: https://github.com/ymcui/Chinese-ELECTRA

Fine-tuning the siamese network with the command below raises an error. `python finetune/run_classifier_siamese.py --pretrained_model_path chinese_roberta/pytorch_model.bin --vocab_path chinese_roberta/vocab.txt --config_path chinese_roberta/config.json --train_path datasets/ChineseTextualInference/train.tsv --dev_path datasets/ChineseTextualInference/dev.tsv --learning_rate 5e-5 --epochs_num 2 --batch_size 64` The error:
```
Traceback (most recent call last):
  File "finetune/run_classifier_siamese.py", line...
```

The code from half a month ago worked fine, but the recently updated code hangs as soon as the model starts; the log is below. All four machines freeze after reaching this step.
2022-06-06 17:06:30.084 ts-3b9674b191ea46a69359406136b6418c-launcher:1076:1076 [0] NCCL INFO Launch mode Parallel
2022-06-06 17:06:30.148 [2022-06-06 17:06:30,147 INFO] Worker 0 is training ...
2022-06-06 17:06:30.149 [2022-06-06 17:06:30,149 INFO] Worker 3 is training...
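When a multi-node run stalls right after NCCL initialization, a common first diagnostic step is enabling NCCL's verbose logging; a minimal sketch (these environment variables must be set before torch.distributed/NCCL initializes, e.g. at the top of the launcher script):

```python
import os

# Assumption: set before any torch.distributed / NCCL initialization.
os.environ["NCCL_DEBUG"] = "INFO"          # verbose NCCL logs
os.environ["NCCL_DEBUG_SUBSYS"] = "INIT,NET"  # focus on init and networking
```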

Training from scratch on English Wikipedia data (3.78 million sentence pairs sampled from enwiki, with dup_factor set to 5) for 200k steps, with a batch size of 32 and a linear scheduler with warmup, acc_mlm is around 0.5. Is this normal? Is there a reference log file available? Many thanks!
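For reference, acc_mlm measures prediction accuracy on the masked positions only; a minimal sketch of how such a metric is typically computed (an illustration, not UER-py's exact code, assuming non-masked positions are marked with -100 as is a common convention):

```python
import torch

def mlm_accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    # logits: [batch, seq_len, vocab]; labels: [batch, seq_len],
    # where positions that were not masked carry the ignore value -100.
    mask = labels != -100
    preds = logits.argmax(dim=-1)
    correct = (preds[mask] == labels[mask]).sum().item()
    return correct / max(1, mask.sum().item())
```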