Disable IDF (inverse document frequency) per field
{ Category: content, LuceneQuery: (hideFromSearch:0 +(__NodeTypeAlias:dtcontenttile) +(tileContentOrigination:external^31.0 tileContentOrigination:partner^32.0 tileContentOrigination:originator^33.0)) }
So I'm trying to artificially boost page scoring by type. However, because the lowest-boosted type is also the one with the fewest nodes, its score is inflated by IDF, so it ends up first in the results instead of last. Is there any way to alter that? Ta.
Sounds like you know more about this subject than I do ;) I'm not really sure so if you feel like debugging into the cause (prob easiest with a unit test in the solution, there's plenty of examples to get started with) that would be great.
Just a 20min google to try to understand how the scoring worked.. http://www.lucenetutorial.com/advanced-topics/scoring.html
Seems to suggest we can override the idf, though I'd have little idea how to do it.
Also found https://opensourceconnections.com/blog/2015/10/16/bm25-the-next-generation-of-lucene-relevation/
But as Lucene.NET 3.0.3 was released about 5 years ago, I'm not sure whether that means no BM25 support. I can't find anything that says which native Lucene version a given Lucene.NET version corresponds to (I think BM25 started in Lucene 6?).
@mistyn8 I looked into BM25: it was introduced in Lucene 4.0.0, which means it is not available in older versions of Lucene.
In Solr, which is built on Lucene, you need to define a field type with a custom similarity class and use that type for the field.
Something like:
<fieldType name="custom_txt" class="solr.TextField" positionIncrementGap="100">
  <similarity class="com.MySimilarityClass"/>
</fieldType>
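The custom similarity class:
// Package must match the class name used in the schema (com.MySimilarityClass).
package com;

import org.apache.lucene.search.similarities.ClassicSimilarity;

public class MySimilarityClass extends ClassicSimilarity {
    // Return a constant so that document frequency no longer affects the score.
    @Override
    public float idf(long docFreq, long numDocs) {
        return 1.0f;
    }
}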
The overridden similarity class can then be imported as a library in your solrconfig.xml
(build it into a Java jar file and place it in your Solr directory):
<lib dir="${solr.install.dir:../../../..}/contrib/dataimporthandler/lib/" regex=".*\.jar" />
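To see why a constant idf would put the boosts back in charge, here is a rough, Lucene-free sketch of the arithmetic. It assumes the classic TF-IDF formula, idf = 1 + ln(numDocs / (docFreq + 1)), with idf entering the score squared, and it ignores tf, field norms, coord and queryNorm (i.e. it pretends they are equal for every document). The index size and per-type document counts are made-up numbers, not taken from the index in the question; only the boosts 31 and 33 come from the query above.

```java
public class IdfSketch {
    // Classic Lucene TF-IDF idf: 1 + ln(numDocs / (docFreq + 1)).
    public static double classicIdf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (double) (docFreq + 1));
    }

    // Simplified clause score: boost * idf^2.
    // Assumption: tf, norms, coord and queryNorm are identical across documents.
    public static double score(double boost, double idf) {
        return boost * idf * idf;
    }

    public static void main(String[] args) {
        long numDocs = 10_000;        // hypothetical index size
        long externalDocs = 20;       // hypothetical: rare type, boost 31
        long originatorDocs = 5_000;  // hypothetical: common type, boost 33

        double external = score(31.0, classicIdf(externalDocs, numDocs));
        double originator = score(33.0, classicIdf(originatorDocs, numDocs));
        System.out.println("with idf:     external=" + external
                + " originator=" + originator);

        // With idf pinned to 1.0 (what the overridden similarity does),
        // only the boost differs between the two clauses.
        System.out.println("constant idf: external=" + score(31.0, 1.0)
                + " originator=" + score(33.0, 1.0));
    }
}
```

With these made-up counts the rare type wins by a wide margin despite its lower boost, and pinning idf to 1.0 restores the intended order. Real scores will differ because tf and length norms vary per document.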