bert_score
Tensor size error with multiple reference tweets
I am trying to compare each of a large number of candidate sentences (N = 28,830) to a list of the same reference sentences (N = 382) and get the highest value of BERTScore for each candidate sentence (against any of the 382 reference statements) returned.
I'm doing this with the object-oriented BERTScorer interface, where cands is a list of str and refs is a list of lists of str.
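For concreteness, here is a minimal sketch of the input layout I mean. The helper name build_multi_refs and the toy sentences are hypothetical and only for illustration; the one assumption about bert_score is that multi-reference scoring expects one inner list of references per candidate, so that len(refs) == len(cands).

```python
from typing import List


def build_multi_refs(cands: List[str], references: List[str]) -> List[List[str]]:
    """Pair every candidate with the same list of reference sentences.

    Hypothetical helper: returns one inner list per candidate, so that
    len(refs) == len(cands) and len(refs[i]) == len(references).
    """
    if not cands or not references:
        raise ValueError("cands and references must both be non-empty")
    # Copy the reference list per candidate so the inner lists are independent.
    return [list(references) for _ in cands]


# Toy data standing in for the 28,830 candidates and 382 references.
cands = ["first candidate tweet", "second candidate tweet", "third candidate tweet"]
references = ["reference statement one", "reference statement two"]
refs = build_multi_refs(cands, references)

# Number of candidates, number of reference groups, references per group.
print(len(cands), len(refs), len(refs[0]))
```

With the real data, each inner list would hold all 382 reference statements.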
I've been able to run the code successfully for some subsections of the cands list, but have often gotten the following error message when running the score function:

I think it might have something to do with the ratio of len(cands) to len(refs[i]), and perhaps also the batch_size parameter?
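One way to check that suspicion would be to score the candidates in fixed-size chunks and record which chunks fail, rather than aborting on the first error. This is only a sketch: score_in_chunks, score_fn and chunk_size are hypothetical names, and score_fn is assumed to wrap a call such as scorer.score(cands, refs, batch_size=...). It catches RuntimeError because that is what PyTorch raises for tensor shape mismatches.

```python
from typing import Any, Callable, List, Sequence, Tuple


def score_in_chunks(
    score_fn: Callable[[Sequence[str], Sequence[Sequence[str]]], Any],
    cands: Sequence[str],
    refs: Sequence[Sequence[str]],
    chunk_size: int = 1000,
) -> Tuple[List[Tuple[int, int, Any]], List[Tuple[int, int, str]]]:
    """Score candidates chunk by chunk, collecting failures instead of aborting.

    Returns (results, failed): results holds (start, end, scores) per chunk
    that succeeded, failed holds (start, end, error message) per chunk that
    raised a RuntimeError.
    """
    if len(cands) != len(refs):
        raise ValueError("cands and refs must have the same length")
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")

    results: List[Tuple[int, int, Any]] = []
    failed: List[Tuple[int, int, str]] = []
    for start in range(0, len(cands), chunk_size):
        end = min(start + chunk_size, len(cands))
        try:
            scores = score_fn(cands[start:end], refs[start:end])
            results.append((start, end, scores))
        except RuntimeError as exc:
            # Keep the index range so the failing candidates can be inspected.
            failed.append((start, end, str(exc)))
    return results, failed


# Assumed usage with an existing BERTScorer instance named `scorer`:
# results, failed = score_in_chunks(
#     lambda c, r: scorer.score(c, r, batch_size=16), cands, refs, chunk_size=500
# )
```

Shrinking chunk_size on the failing ranges should narrow the error down to individual candidates, and varying batch_size in the wrapped call would show whether that parameter matters at all.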
The model I'm using is:
scorer = BERTScorer(lang = 'en', model_type="vinai/bertweet-covid19-base-uncased", num_layers=12, idf=True, idf_sents = idf_sents_corpus)
Many thanks in advance!