skye95git

Results: 41 comments by skye95git

> You can download the model from huggingface and convert it to UER format by the script https://github.com/dbiir/UER-py/blob/master/scripts/convert_bert_from_huggingface_to_uer.py

So, RoBERTa and BERT share the same script, right? No matter converting...

> You can use `--target` to select BERT or RoBERTa

If I want to use RoBERTa, can I just set `--target` to MLM?
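
For reference, a minimal sketch of how such a conversion might be invoked. Only `--layers_num` and `--target` come from this thread; the input/output path flag names are my own assumptions and should be checked against the script's `--help` output:

```python
# Hypothetical invocation sketch -- not taken from the UER-py docs.
# --layers_num and --target come from this thread; --input_model_path and
# --output_model_path are assumed names, so verify them with:
#   python scripts/convert_bert_from_huggingface_to_uer.py --help
import subprocess

subprocess.run(
    [
        "python", "scripts/convert_bert_from_huggingface_to_uer.py",
        "--input_model_path", "roberta-base/pytorch_model.bin",  # checkpoint downloaded from huggingface
        "--output_model_path", "models/roberta_base_uer.bin",
        "--layers_num", "12",   # roberta-base has 12 transformer layers
        "--target", "mlm",      # MLM-only target for RoBERTa, as discussed above
    ],
    check=True,
)
```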

> Yes

How do I set `--layers_num` when I convert huggingface RoBERTa to UER? There are different choices in the examples. Do you set the number of Transformer layers according...
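
One way to choose `--layers_num` without guessing from the examples is to read it directly from the model's `config.json`; for `roberta-base` the `num_hidden_layers` field is 12. A small sketch, assuming the huggingface files were saved to a local `roberta-base/` directory:

```python
import json

# Read the transformer depth from the huggingface config; this value is what
# the thread refers to as --layers_num (12 for roberta-base).
with open("roberta-base/config.json") as f:
    config = json.load(f)

print(config["num_hidden_layers"])
```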

> You can download the model from huggingface and convert it to UER format by the script https://github.com/dbiir/UER-py/blob/master/scripts/convert_bert_from_huggingface_to_uer.py

Hi, I have downloaded RoBERTa from huggingface: https://huggingface.co/roberta-base/tree/main. Then I run...
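
For completeness, one way to fetch the `roberta-base` files locally before running the conversion script; this is just a sketch, and the target directory name is arbitrary:

```python
from transformers import RobertaModel, RobertaTokenizer

# Saves config.json, vocab.json, merges.txt and the model weights into
# ./roberta-base/ (newer transformers versions may write model.safetensors
# instead of pytorch_model.bin).
RobertaModel.from_pretrained("roberta-base").save_pretrained("roberta-base")
RobertaTokenizer.from_pretrained("roberta-base").save_pretrained("roberta-base")
```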

Hi, if I want to use an English corpus to pre-train RoBERTa from scratch, which vocabulary file should I use? I use the `vocab.json` downloaded from huggingface, but the instances show None:

```
...
```
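
One thing to keep in mind is that huggingface's `vocab.json` is only half of RoBERTa's byte-level BPE tokenizer (the merge rules live in `merges.txt`), so treating it as a plain one-token-per-line vocabulary will not work. A quick sanity check with `transformers`, just to see what the vocabulary actually contains:

```python
from transformers import RobertaTokenizer

# roberta-base uses byte-level BPE: vocab.json maps subword pieces to ids,
# and merges.txt holds the merge rules; both files are needed together.
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

print(len(tokenizer))                      # 50265 subword pieces
print(tokenizer.tokenize("pre-training"))  # BPE pieces, not whole words
```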

> Hello, this is because the model parameter names do not match, since the version of Transformers used by the model is too old. We have modified it in the...

> Hi, you can use this script https://github.com/dbiir/UER-py/blob/master/scripts/convert_xlmroberta_from_huggingface_to_uer.py

The script converts the model successfully, but the converted model seems a little different from RoBERTa. When I do incremental pre-training,...
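
To pin down how the converted model actually differs, it can help to diff the parameter names and shapes of the two checkpoints; a rough sketch, where the file paths are placeholders and both files are assumed to be plain PyTorch state dicts:

```python
import torch

# Placeholder paths: a model converted with the xlmroberta script vs. one
# converted with the bert/roberta script.
xlmr = torch.load("models/xlmroberta_uer.bin", map_location="cpu")
roberta = torch.load("models/roberta_uer.bin", map_location="cpu")

xlmr_keys, roberta_keys = set(xlmr), set(roberta)
print("only in xlm-roberta:", sorted(xlmr_keys - roberta_keys))
print("only in roberta:    ", sorted(roberta_keys - xlmr_keys))

# Shared parameters with different shapes (e.g. the word embedding matrix:
# XLM-RoBERTa's vocabulary is ~250k entries vs. RoBERTa's ~50k).
for name in sorted(xlmr_keys & roberta_keys):
    if xlmr[name].shape != roberta[name].shape:
        print(name, tuple(xlmr[name].shape), tuple(roberta[name].shape))
```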

@hhou435 Hi, every time I train MLM with my own data, the log freezes after printing the following message:

```
Start slurm job at Thu 23 Dec 2021 02:41:12 PM...
```

The pre-training loss is shown below:

![image](https://user-images.githubusercontent.com/41561936/147209959-3d0917f3-9c32-49e2-ab5d-61832a4ec14c.png)

Can this loss curve be regarded as having converged?
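
One rough way to answer that is to smooth the logged loss and check how much it still drops near the end of training; a sketch with made-up numbers standing in for the values parsed from the real log:

```python
import numpy as np

# Placeholder loss values -- replace with the loss column from the training log.
loss = np.array([7.1, 5.4, 4.2, 3.1, 2.6, 2.3, 2.1, 2.0, 1.96, 1.93, 1.92, 1.91])

# Smooth out step-to-step noise with a short moving average.
window = 3
smoothed = np.convolve(loss, np.ones(window) / window, mode="valid")

# If the smoothed loss drops by only a fraction of a percent over the last
# quarter of training, treating it as converged is usually reasonable.
tail = smoothed[-max(2, len(smoothed) // 4):]
print(f"relative drop over the tail: {(tail[0] - tail[-1]) / tail[0]:.2%}")
```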

> > > Hi, you can use this script https://github.com/dbiir/UER-py/blob/master/scripts/convert_xlmroberta_from_huggingface_to_uer.py
> > >
> > > The script can be used to convert successfully. But the transformed model seems a little...