OpenNIR
getting error with vbert
While using vbert, I am getting the following error. Please help.
vbert = onir_pt.reranker('vanilla_transformer', 'bert', text_field='abstract', vocab_config={'train': True})
vbert_pipeline = (
    pt.BatchRetrieve(index, wmodel='BM25', metadata=["docno", "text"]) % 1000
    >> pt.text.get_text(index, "text")
    >> vbert
)
df_res = vbert_pipeline.search("can vitamin d cure covid 19")
[2021-09-02 01:10:08,346][onir_pt][DEBUG] using GPU (deterministic)
[2021-09-02 01:10:11,481][onir_pt][DEBUG] [starting] batches
[2021-09-02 01:10:11,485][onir][CRITICAL] Uncaught exception
Traceback (most recent call last):
File "vbert_baseline.py", line 123, in
Hi @somnath-banerjee,
Sorry for the delay. It looks like the vbert model is trying to re-rank based on the "abstract" field (text_field='abstract'), whereas only a "text" field is available (metadata=["docno", "text"]). I think switching to text_field='text' should resolve your problem!
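The failure mode can be illustrated with a toy sketch (a hypothetical `ToyReranker` class, not OpenNIR's actual implementation): the reranker looks up its configured `text_field` in each retrieved document record, so a field that the retrieval stage never fetched raises an error.

```python
# Toy illustration of the field mismatch (hypothetical class, not
# OpenNIR's real code): the reranker reads its configured text_field
# from each document record it is handed.
class ToyReranker:
    def __init__(self, text_field):
        self.text_field = text_field

    def score(self, query, doc):
        text = doc[self.text_field]  # KeyError if the field is missing
        # Stand-in scoring: term overlap between query and document.
        return len(set(query.split()) & set(text.split()))

# The retrieval stage only carried "docno" and "text" metadata.
doc = {"docno": "d1", "text": "vitamin d and covid 19"}

broken = ToyReranker(text_field="abstract")
try:
    broken.score("can vitamin d cure covid 19", doc)
except KeyError as e:
    print("missing field:", e)  # the original error, in miniature

fixed = ToyReranker(text_field="text")
overlap = fixed.score("can vitamin d cure covid 19", doc)
print(overlap)  # → 4
```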
Hi @seanmacavaney,
Thanks. It worked after changing to text_field='text'.
Some of the scores I am getting are negative. I am new to IR, so I would appreciate it if you could let me know how to interpret this from a theoretical point of view.
Thanks in advance.
Yes, so the query-document relevance scores produced by the model are only valuable with respect to other query-document relevance scores. In other words, the only thing that matters is that document A's score is greater or less than document B's -- this determines the order of the two documents in the rankings.
Some other models make stronger claims about the meaning of the scores produced. For instance, probabilistic models frame the scores as a probability.
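To make the "only relative order matters" point concrete, here is a small self-contained Python sketch (toy scores, not produced by any real model) showing that shifting all scores by a constant leaves the ranking unchanged, and that a softmax over the candidate set can map raw scores onto a probability-like scale if one is wanted:

```python
import math

# Toy query-document relevance scores, as a reranker might emit.
# Negative values are fine: only the ordering carries meaning.
scores = {"docA": -1.3, "docB": 0.7, "docC": -4.2}

def rank(score_map):
    """Return doc ids sorted by descending score."""
    return sorted(score_map, key=score_map.get, reverse=True)

original = rank(scores)

# Any strictly increasing transformation (here: adding a constant so
# every score becomes positive) produces exactly the same ranking.
shifted = rank({d: s + 10.0 for d, s in scores.items()})
assert shifted == original

# If probability-like values are desired, a softmax maps raw scores to
# values in (0, 1) that sum to 1 -- again without changing the order.
exp = {d: math.exp(s) for d, s in scores.items()}
total = sum(exp.values())
probs = {d: v / total for d, v in exp.items()}
assert rank(probs) == original

print(original)  # → ['docB', 'docA', 'docC']
```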
Thanks a lot for your answer. But if the vbert model produces a negative score for a query-document pair, what does that mean? How does it differ from a query-document pair for which the model gives a positive score?