flair
[Question]: Low and different results when reloading final-model.pt
Question
I used this code to train an NER model:
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# embeddings, tag_dictionary, tag_type, corpus and folder are defined earlier
tagger: SequenceTagger = SequenceTagger(
    hidden_size=128,
    embeddings=embeddings,
    tag_dictionary=tag_dictionary,
    tag_type=tag_type,
    tag_format="BIO",
    use_rnn=True,
    use_crf=True,
)
trainer: ModelTrainer = ModelTrainer(tagger, corpus)
trainer.train(
    f'train/{folder}/model3',
    learning_rate=0.01,
    min_learning_rate=0.0001,
    mini_batch_size=64,
    embeddings_storage_mode='none',
    max_epochs=80,
    patience=3,
    train_with_dev=True,
)
I got this result after training:
2024-01-19 03:28:19,510 Testing using last state of model ...
2024-01-19 03:28:34,248
Results:
- F-score (micro) 0.9024
- F-score (macro) 0.9026
- Accuracy 0.8443
By class:
               precision    recall  f1-score   support
I                 0.8624    0.9402    0.8996      1087
B                 0.8879    0.9240    0.9056       960
micro avg         0.8741    0.9326    0.9024      2047
macro avg         0.8752    0.9321    0.9026      2047
weighted avg      0.8744    0.9326    0.9024      2047
However, when I load the model from the file, I get very low results. Is there any explanation for this?
from flair.data import Corpus
from flair.datasets import ColumnCorpus
from flair.models import SequenceTagger
tagger = SequenceTagger.load("train/NCBI-disease/model3/final-model.pt")
columns = {0: 'text', 1: 'ner', 2: 'pos'}
corpus: Corpus = ColumnCorpus(data_folder, columns, test_file='test.tsv')
print(tagger.evaluate(corpus.test, 'ner').detailed_results)
Results:
- F-score (micro) 0.0125
- F-score (macro) 0.0069
- Accuracy 0.0063
By class:
               precision    recall  f1-score   support
I                 0.0077    0.0727    0.0139      1087
B                 0.0000    0.0000    0.0000       960
micro avg         0.0075    0.0386    0.0125      2047
macro avg         0.0038    0.0363    0.0069      2047
weighted avg      0.0041    0.0386    0.0074      2047
# Check dev
print(tagger.evaluate(corpus.dev, 'ner').detailed_results)
Results:
- F-score (micro) 0.0182
- F-score (macro) 0.01
- Accuracy 0.0092
By class:
               precision    recall  f1-score   support
I                 0.0110    0.1037    0.0199      1090
B                 0.0000    0.0000    0.0000       787
micro avg         0.0107    0.0602    0.0182      1877
macro avg         0.0055    0.0518    0.0100      1877
weighted avg      0.0064    0.0602    0.0116      1877
Could you please help me as soon as possible?
Also, I see a similar issue when I set train_with_dev=False and reload 'best-model.pt'.
Hi @Tinarights
I did not manage to reproduce this with the information provided.
However, I noticed that the classes are called `B` and `I`.
Am I right to assume that those are not your intended class names? (E.g. you didn't set the token labels to be [`B-B`, `B-I`, `I-B`, `I-I`, `O`].)
If you want to detect entities with a single label, you should still provide a label name for it, e.g. using [`B-Entity`, `I-Entity`, `O`] as the possible token labels.
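If the files do contain bare `B` and `I` tags, they could be rewritten before training. The following is only a rough sketch under some assumptions: columns are whitespace-separated, the tag sits in column 1, and the output is written tab-separated. `add_label_name` and `relabel_lines` are hypothetical helpers, not flair functions, and the label name `Entity` is just the placeholder from the suggestion above.

```python
def add_label_name(tag, label="Entity"):
    """Turn a bare 'B' or 'I' into 'B-<label>' or 'I-<label>'; leave other tags as-is."""
    if tag in ("B", "I"):
        return f"{tag}-{label}"
    return tag


def relabel_lines(lines, tag_column=1, label="Entity"):
    """Rewrite the tag column of column-format lines (hypothetical helper).

    Expects lines without trailing newlines. Blank or short lines, such as
    sentence separators, are passed through unchanged.
    """
    relabeled = []
    for line in lines:
        fields = line.split()
        if len(fields) <= tag_column:
            relabeled.append(line)
            continue
        fields[tag_column] = add_label_name(fields[tag_column], label)
        relabeled.append("\t".join(fields))
    return relabeled


# Made-up sample rows in the text / ner / pos layout from the question.
sample = ["Ataxia\tB\tNN", "telangiectasia\tI\tNN", "is\tO\tVBZ", ""]
relabeled = relabel_lines(sample)
print("\n".join(relabeled))
```

The train, dev and test files would all need the same treatment, and the model would have to be retrained on the relabeled data so that its tag dictionary is built from the new labels.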