PaddleNLP
About domain adaptation of ernie-3.0
Hello, if I have a lot of unlabeled in-domain text, can I continue training ernie-3.0 for domain adaptation through the AutoModelForPretraining interface? Thanks a lot.
Hello, sure you can! We support pre-training of ERNIE in the following folder.
https://github.com/PaddlePaddle/PaddleNLP/tree/develop/model_zoo/ernie-1.0
You can try it!
Thanks for your quick reply. I don't want to train the model from scratch; I want to continue training the NSP and MLM tasks from the ernie-3.0-base-zh checkpoint. I tried loading the model with AutoModelForPretraining.from_pretrained and printing it, but I can only find the seq_relationship task in the network. Where's the MLM task? Will this path work?
...
(cls): ErniePretrainingHeads(
  (predictions): ErnieLMPredictionHead(
    (transform): Linear(in_features=768, out_features=768, dtype=float32)
    (layer_norm): LayerNorm(normalized_shape=[768], epsilon=1e-12)
  )
  (seq_relationship): Linear(in_features=768, out_features=2, dtype=float32)
)
The network should also contain the ernie model.
https://github.com/PaddlePaddle/PaddleNLP/blob/ec893dd05399ae4dd666a9628c58045c85d74d0a/paddlenlp/transformers/ernie/modeling.py#L1478-L1499
Where's the MLM task? Will this path work?
The MLM task is integrated into the ernie-1.0 pre-training task. AutoModelForPretraining does work, but the MLM task lives in ernie-1.0, so we suggest you use model_zoo/ernie-1.0:
https://github.com/PaddlePaddle/PaddleNLP/blob/ec893dd05399ae4dd666a9628c58045c85d74d0a/model_zoo/ernie-1.0/run_pretrain.py#L100-L128
I don't want to train the model from scratch, but continue training nsp and mlm tasks from ernie-3.0-base-zh checkpoint.
Yes, we also support the continue_training argument, which starts training from the built-in pre-trained weights.
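For anyone landing here later, a minimal sketch of how the launch command for model_zoo/ernie-1.0/run_pretrain.py might be assembled so that it continues from the ernie-3.0-base-zh checkpoint. The flag names (--model_type, --model_name_or_path, --tokenizer_name_or_path, --continue_training, --input_dir, --output_dir) and the "1"/"0" value form are assumptions based on my reading of that folder and should be checked against its README and argument parser. The helper build_pretrain_command and the data and output paths are hypothetical.

```python
import shlex


def build_pretrain_command(model_name, input_dir, output_dir, continue_training=True):
    """Assemble an argv list for run_pretrain.py (flag names are assumptions)."""
    return [
        "python", "run_pretrain.py",
        "--model_type", "ernie",
        # Start from a released checkpoint instead of random weights.
        "--model_name_or_path", model_name,
        "--tokenizer_name_or_path", model_name,
        # Assumed value form: "1" to continue from the pre-trained weights.
        "--continue_training", "1" if continue_training else "0",
        # Hypothetical locations of the preprocessed corpus and the outputs.
        "--input_dir", input_dir,
        "--output_dir", output_dir,
    ]


command = build_pretrain_command(
    "ernie-3.0-base-zh", "./data", "./output/ernie-3.0-domain"
)
# Print a copy-pasteable shell line; run it from inside model_zoo/ernie-1.0.
print(shlex.join(command))
```

The corpus still has to be preprocessed into the format the script expects before this command is useful; that step is not shown here.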
@ZHUI May I ask whether there is documentation for domain pre-training with ERNIE-3.0? After all, 3.0 performs better. Best wishes!
Hello, the current code already supports pre-training with ERNIE-3.0.
https://github.com/PaddlePaddle/PaddleNLP/tree/develop/model_zoo/ernie-1.0
In this tutorial, change ernie-1.0-base-zh to ernie-3.0-base-zh to run domain pre-training for ernie-3.0.
Thanks a lot.