bert_score RuntimeError

Running the following bert_score call on a large Russian corpus fails partway through computing embeddings:
from bert_score import score

# Load the corpus, deduplicate it, and score one half against the other.
with open("/data/translator/parallel/kaggle/open_ru.txt", "r") as fo:
    txts = [t.strip() for t in fo.readlines()]
txts = list(set(txts))

size = 1_000_000
P, R, F1 = score(
    txts[0:size], txts[size:size*2], lang="ru", verbose=True,
    model_type="DeepPavlov/rubert-base-cased",
    num_layers=12,
    # rescale_with_baseline=True
)
Some weights of the model checkpoint at DeepPavlov/rubert-base-cased were not used when initializing BertModel: ['cls.predictions.bias', 'cls.predictions.decoder.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
calculating scores...
computing bert embedding.
2%|▏ | 476/31250 [00:33<36:29, 14.06it/s]
....
RuntimeError: The expanded size of the tensor (1174) must match the existing size (512) at non-singleton dimension 1. Target sizes: [64, 1174]. Tensor sizes: [1, 512]
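The error points at sequence length rather than at bert_score itself: rubert-base-cased has max_position_embeddings=512, and the longest line in the failing batch tokenizes to 1174 tokens, so the position-id buffer of shape [1, 512] cannot be expanded to [64, 1174]. If the tokenizer config for DeepPavlov/rubert-base-cased does not set model_max_length, bert_score's internal truncation may also never trigger. One workaround is to drop (or truncate) over-long lines before calling score(). A minimal sketch, assuming the txts list from above; fits_in_model is a hypothetical helper, not part of bert_score:

from transformers import AutoTokenizer

# Pre-filter lines that tokenize past BERT's 512-position limit,
# using the same tokenizer bert_score would load for this model_type.
tok = AutoTokenizer.from_pretrained("DeepPavlov/rubert-base-cased")

def fits_in_model(text, max_len=512):
    # Count tokens including [CLS]/[SEP], the special tokens
    # bert_score also adds when encoding.
    return len(tok.encode(text, add_special_tokens=True)) <= max_len

txts = [t for t in txts if fits_in_model(t)]

Calling tok.encode(text, truncation=True, max_length=512) and decoding the result instead would keep every line at the cost of clipping the long ones.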