pyterrier
Excessive logging about termpipelines global defaults
Hi there, I hope this is a simple oversight on my part, but I can't seem to disable the following warning:
[main] WARN org.terrier.querying.ApplyTermPipeline - The index has no termpipelines configuration, and no control configuration is found. Defaulting to global termpipelines configuration of 'Stopwords,PorterStemmer'. Set a termpipelines control to remove this warning.
I have tried the following:
# Upon init
pt.set_property('termpipelines', 'Stopwords,PorterStemmer')
# When loading my index
indexer.setProperties(**{'termpipelines' : 'Stopwords,PorterStemmer'})
# When using BatchRetrieve (the warning doesn't appear to be triggered by this call, however)
pipeline = pt.BatchRetrieve(
    self.index_ref,
    wmodel='BM25',
    properties={'termpipelines' : 'Stopwords,PorterStemmer'}
)
I am pretty sure that my use of TextScorer is triggering this warning, but I do not see any properties / args to set termpipelines. Here is my current usage:
textscorer = pt.text.scorer(
    body_attr='text',
    wmodel='BM25',
    background_index=self.index
)
textscorer.transform(test_df)
Please let me know if there is something I am missing, or if I have stumbled across an oversight in TextScorer. Thanks!
Hi @joelrorseth
Thanks for the report.
The recent PyTerrier and Terrier releases change the way we address termpipelines for DISK indices. I'm going to leave this open, as I don't think we have addressed it well for pt.text.scorer() yet.
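One thing worth noting from the warning text: it asks for a termpipelines *control*, whereas the attempts above all set a *property*. The sketch below is a minimal illustration of that distinction. The helper name `retrieval_kwargs` is hypothetical, and it only builds keyword arguments for the `controls` parameter of `pt.BatchRetrieve`. Whether `pt.text.scorer()` forwards controls in the same way is an assumption this thread does not confirm.

```python
# Sketch only: build keyword arguments that set 'termpipelines' as a
# control (what the warning asks for) rather than as a property.
DEFAULT_TERMPIPELINES = "Stopwords,PorterStemmer"


def retrieval_kwargs(wmodel="BM25", termpipelines=DEFAULT_TERMPIPELINES):
    """Return kwargs carrying termpipelines as a control (hypothetical helper)."""
    return {
        "wmodel": wmodel,
        "controls": {"termpipelines": termpipelines},
    }


if __name__ == "__main__":
    kwargs = retrieval_kwargs()
    print(kwargs)
    # With PyTerrier initialised and an index reference available:
    #   import pyterrier as pt
    #   pipeline = pt.BatchRetrieve(index_ref, **kwargs)
```

If this removes the warning for `BatchRetrieve` but not for `pt.text.scorer()`, that would be consistent with the reply above that the scorer case has not been handled yet.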