BERT-Disaster-Classification-Capsule-Routing

Follow-up question

wanghuijia326 opened this issue 5 years ago · 7 comments

Thank you for your help; I successfully ran the program with my own data. I would like to ask: is your BERT the original, or was it further pre-trained on tweet data from the target domain? If it was pre-trained on tweet data at a later stage, could you tell me roughly how to train it?

wanghuijia326 avatar Jan 02 '20 09:01 wanghuijia326

It's the original pre-trained BERT (the multilingual one), pre-trained by Google on Wikipedia and similar corpora. I didn't pre-train it further on tweet data, though someday I might.

I have demo code here: https://github.com/JRC1995/BERT-Disaster-Classification-Capsule-Routing/tree/master/MLM for pre-training with the main Masked LM objective. It's just toy code that implements the essence of MLM training with HuggingFace's library, so you'd have to extend it for full-blown pre-training; see the sketch below. The demo code is mostly based on HuggingFace's documentation. There may be better resources and examples for MLM pre-training (there are also other multi-task objectives that it can be trained on).
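Roughly, the core MLM loop with HuggingFace looks something like this (a minimal sketch, not the repo's actual demo; `tweets.txt` is a hypothetical one-sentence-per-line in-domain corpus):

```python
# Minimal sketch of MLM pre-training with HuggingFace transformers.
# "tweets.txt" is a hypothetical in-domain corpus, one text per line.
import torch
from torch.utils.data import DataLoader
from transformers import (BertTokenizerFast, BertForMaskedLM,
                          DataCollatorForLanguageModeling)

name = "bert-base-multilingual-cased"
tokenizer = BertTokenizerFast.from_pretrained(name)
model = BertForMaskedLM.from_pretrained(name)

with open("tweets.txt") as f:
    lines = [l.strip() for l in f if l.strip()]
encodings = tokenizer(lines, truncation=True, max_length=128)

# The collator masks ~15% of tokens and builds the MLM labels;
# loss is cross entropy over the masked positions only.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15)
examples = [{"input_ids": ids} for ids in encodings["input_ids"]]
loader = DataLoader(examples, batch_size=8, shuffle=True,
                    collate_fn=collator)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for batch in loader:
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```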

JRC1995 avatar Jan 02 '20 09:01 JRC1995

For the same corpus, training and testing with the BERT_capsule model work normally, but after switching to the BERT_capsule_BiLSTM_attn model, the F1, precision, and recall for training/testing are all zero. I don't know what the reason is; can you suggest a solution?

wanghuijia326 avatar Jan 06 '20 02:01 wanghuijia326

That's strange; I don't know why. Is the model the only thing you changed? What are the binary/multi accuracies? And what about the cross-entropy loss during training? Are they normal, or is something strange going on? You can try to debug by printing values inside the respective functions to see what is happening. It may be that all your samples are being ignored for some reason. Are both binary and multi F1 zero? I suspect it's the multi F1. If you are doing binary classification only, you should ignore multi-precision, multi-recall, and multi-F1.
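One common way this shows up: if the model degenerates into predicting a single class, per-class precision/recall/F1 go to zero while the loss and accuracy can still look plausible. A quick sanity check (illustrative, not from the repo):

```python
# A degenerate model that always predicts class 0 gets zero
# precision/recall/F1 for class 1, even with nonzero accuracy.
from sklearn.metrics import precision_recall_fscore_support

y_true = [0, 1, 1, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0]  # model collapsed to one class

p, r, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=[1], zero_division=0)
print(p, r, f1)  # [0.] [0.] [0.]
```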

JRC1995 avatar Jan 06 '20 02:01 JRC1995

The run was unsuccessful: the model could not be loaded from the Pre_trained_BERT directory. There is no JSON file in the Pre_trained_BERT directory. Has the file been modified?

zhangshuai19971210 avatar Jun 24 '22 09:06 zhangshuai19971210

Did you save BERT locally beforehand? See:
https://github.com/JRC1995/BERT-Disaster-Classification-Capsule-Routing#saving-multilingual-bert
https://github.com/JRC1995/BERT-Disaster-Classification-Capsule-Routing/blob/master/Classification/Save_pre_trained_locally.py

If so, it could be due to a version mismatch. You can also try running it in an environment with an older version of the HuggingFace library (the one in the README, for example).
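For reference, the local-saving step amounts to something like this (a sketch of what Save_pre_trained_locally.py is meant to do; the path and the cased multilingual checkpoint name are assumptions here):

```python
# Sketch: save multilingual BERT locally so it can be loaded offline.
from transformers import BertTokenizer, BertModel

name = "bert-base-multilingual-cased"
BertTokenizer.from_pretrained(name).save_pretrained("../Pre_trained_BERT/")
BertModel.from_pretrained(name).save_pretrained("../Pre_trained_BERT/")
# Afterwards ../Pre_trained_BERT/ should contain config.json, vocab.txt,
# and the model weights, which from_pretrained() can then load locally.
```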

JRC1995 avatar Jun 24 '22 14:06 JRC1995

I ran on my own data and this error occurred (the model has already been downloaded locally):

We couldn't connect to 'https://huggingface.co' to load this model, couldn't find it in the cached files and it looks like ../Pre_trained_BERT/ is not the path to a directory containing a {configuration_file} file. Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.

zhangshuai19971210 avatar Jun 26 '22 11:06 zhangshuai19971210

After locally downloading BERT, do the files show up in the Pre_trained_BERT folder, or is it empty? The local download script is designed to download all the relevant files into the Pre_trained_BERT folder. There could be some issue with the directory, though, or some version mismatch. Another thing you can do is search the repo for any model- and tokenizer-loading code and replace the loading directory with the relevant model name (multilingual BERT) from HuggingFace. That should download the relevant files to the cache and make them available automatically.
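Concretely, that workaround looks something like this (a hedged sketch; the local path and hub name are what I'd expect, but check the repo's actual loading code):

```python
from transformers import BertTokenizer, BertModel

# Before (fails if ../Pre_trained_BERT/ is empty or incomplete):
# tokenizer = BertTokenizer.from_pretrained("../Pre_trained_BERT/")
# model = BertModel.from_pretrained("../Pre_trained_BERT/")

# After: load by hub name so the files are downloaded to the
# HuggingFace cache automatically on first use.
tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BertModel.from_pretrained("bert-base-multilingual-cased")
```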


JRC1995 avatar Jun 27 '22 07:06 JRC1995