Junwei Huang
Here is the doc_ner_best.yaml content.
```
Controller:
  model_structure: null
MFVI:
  hexa_rank: 150
  hexa_std: 1
  iterations: 3
  normalize_weight: true
  quad_rank: 150
  quad_std: 1
  tag_dim: 150
  use_hexalinear: false
  use_quadrilinear: false
  use_second_order: false
...
```
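In case it helps, here is a minimal sketch I use to check that the YAML parses with this nesting (the `config/doc_ner_best.yaml` path is just an assumption about where the file is saved):
```python
import yaml  # PyYAML

# Sanity-check that the config parses with the nesting shown above.
# The path below is an assumption -- use wherever doc_ner_best.yaml lives.
with open("config/doc_ner_best.yaml") as f:
    config = yaml.safe_load(f)

print(config["Controller"]["model_structure"])  # None
print(config["MFVI"]["iterations"])             # 3
```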
I have this error when using the default embedding names, which is why I shortened them. Any suggestions? Thank you.
```
[2022-07-29 10:35:23,903 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-vocab.txt from cache at C:\Users\ebb/.cache\torch\transformers\96435fa287fbf7e469185f1062386e05a075cadbf6838b74da22bf64b080bc32.99bcd55fc66f4f3360bc49ba472b940b8dcf223ea6a345deb969d607ca900729
...
```
Thanks for your comments. I will uncomment `if '/' in name: name = name.split('/')[-1]` to train my model. The current problem is that, with this line, I can run `python .\train.py`...
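For context, that line just keeps the last path component of each embedding name, e.g. (illustrative path, not one of my real ones):
```python
# Keeps only the last path component of an embedding name, so a long
# cache path collapses to the bare model folder name.
name = "some/long/cache/path/xlm-roberta-large_v2doc"  # illustrative
if '/' in name:
    name = name.split('/')[-1]
print(name)  # xlm-roberta-large_v2doc
```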
Yes, the result is identical to the original run. I printed it to the screen; the following is the longest output I can see now. Let me know if you need...
Following your suggestion, I have this list:
```
['/home/yongjiang.jy/.flair/embeddings/en-xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner5/roberta-large_v2doc',
 '/home/yongjiang.jy/.flair/embeddings/xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner3/xlm-roberta-large_v2doc',
 '/home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased_v2doc',
 'Word: en',
 'bert-base-cased',
 'bert-base-multilingual-cased',
 'bert-large-cased',
 'elmo-original',
 'lm-jw300-backward-v0.1.pt',
 'lm-jw300-forward-v0.1.pt',
 'news-backward-0.4.1.pt',
 'news-forward-0.4.1.pt']
```
I got the same F1 score result. Is...
Thank you. Is it now in the right order?
```
2022-08-04 13:21:03,082 Setting embedding mask to the best action: tensor([1., 1., 0., 1., 0., 0., 1., 0., 0., 1., 1., ...
```
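To double-check the order myself, I paired the mask entries with the sorted names; this sketch assumes (my guess from this thread, not something I have verified in the code) that the mask follows the alphabetically sorted embedding-name order:
```python
# Pair each mask entry with an embedding name, assuming the mask
# follows the alphabetically sorted embedding-name order (my guess).
# Both lists are shortened stand-ins for the full name list I posted
# earlier and the truncated tensor in the log above.
names = sorted([
    "bert-base-cased",
    "bert-base-multilingual-cased",
    "elmo-original",
])
mask = [1., 1., 0.]  # one entry per embedding; the real tensor is longer
for m, name in zip(mask, names):
    print(int(m), name)
```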
Yes, the three Transformer embeddings are from OneDrive and saved under `resources`.
```
TransformerWordEmbeddings-0:
  layers: '-1'
  model: resources/en-xlnet-large-cased/xlnet-large-cased  # the path to the fine-tuned model
  embedding_name: /home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased
  pooling_operation: first
  v2_doc: ...
```
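My understanding is that these YAML keys are passed straight through as keyword arguments when the embedding is built, so the rough code equivalent would be something like the sketch below. This assumes ACE's fork of flair, where `embedding_name` and `v2_doc` are extra constructor arguments not found in upstream flair; the `v2_doc=True` value is an assumption since it is truncated in my config:
```python
from flair.embeddings import TransformerWordEmbeddings

# Rough equivalent of the TransformerWordEmbeddings-0 entry above,
# assuming ACE's fork of flair maps config keys 1:1 onto kwargs
# (embedding_name and v2_doc are ACE additions, not upstream flair).
emb = TransformerWordEmbeddings(
    model="resources/en-xlnet-large-cased/xlnet-large-cased",  # local fine-tuned model
    layers="-1",
    pooling_operation="first",
    embedding_name="/home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased",
    v2_doc=True,  # truncated in my config; True is an assumption
)
```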
Thank you very much. I have replicated your result:
```
2022-08-18 17:45:37,725 0.9415	0.9495	0.9455
2022-08-18 17:45:37,725
MICRO_AVG: acc 0.8967 - f1-score 0.9455
MACRO_AVG: acc 0.8776 - f1-score 0.932575
LOC ...
```
I still get an error with the updated train.py. Here is the error message:
```
[2022-07-29 21:50:39,054 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/xlnet-large-cased-spiece.model from cache at C:\Users\ebb/.cache\torch\transformers\5b125ba222ff82664771f63cd8fac9696c24b403fc1ab720d537fe2ceaaf0576.8b10bd978b5d01c21303cc761fc9ecd464419b3bf921864a355ba807cfbfafa8
Traceback (most recent call last):
  File ".\train.py", ...
```
Another error:
```
2022-08-01 17:33:40,842 Reading data from datasets\mytest
2022-08-01 17:33:40,843 Train: datasets\mytest\doc_train.txt
2022-08-01 17:33:40,844 Dev: None
2022-08-01 17:33:40,844 Test: None
Traceback (most recent call last):
  File ".\train.py", line 368, ...
```