Tensor size error with multiple refs tweets

Open tierney6 opened this issue 2 years ago • 0 comments
I am trying to compare each of a large number of candidate sentences (N = 28,830) against the same list of reference sentences (N = 382) and return, for each candidate, the highest BERTScore it achieves against any of the 382 references.

I'm doing this with the object-oriented BERTScorer interface, where cands is a list of str and refs is a list of list of str.
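For concreteness, here is a minimal sketch of the input shapes described above. The helper name `build_multi_ref_inputs` and the sample sentences are hypothetical and only illustrate the layout (one list of references per candidate); they are not my actual data.

```python
from typing import List, Tuple


def build_multi_ref_inputs(
    cands: List[str], shared_refs: List[str]
) -> Tuple[List[str], List[List[str]]]:
    """Pair every candidate with the same full list of references.

    Hypothetical helper: returns (cands, refs) where refs[i] is the
    list of references that candidate i is compared against.
    """
    if not shared_refs:
        raise ValueError("need at least one reference sentence")
    # One independent copy of the reference list per candidate.
    refs = [list(shared_refs) for _ in cands]
    return cands, refs


# Placeholder sentences, not real data.
cands = ["stay home and stay safe", "vaccines are now available"]
shared_refs = [
    "please stay at home",
    "vaccines are available now",
    "wash your hands",
]
cands, refs = build_multi_ref_inputs(cands, shared_refs)
print(len(cands), len(refs), len(refs[0]))
```

In my case `len(cands)` would be 28,830 and every `refs[i]` would hold the same 382 sentences.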

I've been able to run the code successfully for some subsections of the cands list, but I have often gotten the following error message when running the score function:

[Screenshot of the error message: Screen Shot 2023-04-20 at 5 51 16 pm]

I think it might have something to do with the ratio of len(cands) to len(refs[i]), and perhaps also the batch_size parameter?
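One way to test this hypothesis is to score the candidates in fixed-size chunks, each paired with the full reference list, so that a failing chunk can be isolated. The following is only a sketch: `best_score_per_candidate`, `chunked`, and `toy_overlap` are hypothetical names, and `toy_overlap` is a stand-in scorer so the example runs without downloading a model. Catching `RuntimeError` assumes the tensor size error surfaces as that exception type.

```python
from typing import Callable, Iterator, List, Sequence

# score_fn(cands, refs) returns one score per candidate;
# refs is a list of list of str, aligned with cands.
ScoreFn = Callable[[List[str], List[List[str]]], Sequence[float]]


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def best_score_per_candidate(
    cands: List[str],
    shared_refs: List[str],
    score_fn: ScoreFn,
    chunk_size: int = 256,
) -> List[float]:
    """Score candidates chunk by chunk against the same reference list."""
    results: List[float] = []
    for index, chunk in enumerate(chunked(cands, chunk_size)):
        refs = [list(shared_refs) for _ in chunk]
        try:
            scores = score_fn(chunk, refs)
        except RuntimeError as err:
            # Report which slice of the candidate list triggered the failure.
            first = index * chunk_size
            last = first + len(chunk) - 1
            raise RuntimeError(
                f"scoring failed for candidates {first}..{last}"
            ) from err
        results.extend(float(s) for s in scores)
    return results


def toy_overlap(cands: List[str], refs: List[List[str]]) -> List[float]:
    """Stand-in scorer (NOT BERTScore): best word overlap over the references."""
    out = []
    for cand, ref_list in zip(cands, refs):
        words = set(cand.split())
        out.append(
            max(len(words & set(r.split())) / max(len(words), 1) for r in ref_list)
        )
    return out


demo = best_score_per_candidate(
    ["a b c", "x y", "a z"], ["a b", "x y"], toy_overlap, chunk_size=2
)
print(demo)
```

With the real scorer, `score_fn` could be something like `lambda c, r: scorer.score(c, r, batch_size=32)[2]`, which takes the F1 tensor from the `(P, R, F)` tuple that `score` returns. If the same candidates fail regardless of `chunk_size`, the problem is more likely tied to specific sentences than to the ratio of list lengths.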

The model I'm using is:

scorer = BERTScorer(lang="en", model_type="vinai/bertweet-covid19-base-uncased", num_layers=12, idf=True, idf_sents=idf_sents_corpus)

Many thanks in advance!

tierney6, Apr 20 '23 21:04