Default K1 and B parameters for bm25_weight

Open fsroque opened this issue 6 years ago • 3 comments

Hi, the default values of K1 and B are set to 100 and 0.8 respectively in the function bm25_weight defined in nearest_neighbours.py. The literature suggests values between 0 and 3 for K1 and between 0 and 1 for B. Is there a specific reason for K1 to be set this high by default? These defaults might work poorly as a first approach.
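To illustrate the concern, here is a minimal sketch of the textbook BM25 term-frequency component (IDF factor omitted). It is not copied from the library; the helper name `bm25_tf` and the `length_ratio` argument (row length divided by average row length) are made up for this example, and K1=1.2 is just a commonly cited value.

```python
def bm25_tf(tf, length_ratio=1.0, K1=1.2, B=0.75):
    """Textbook BM25 term-frequency component (hypothetical helper, no IDF)."""
    # Length normalisation: equals 1.0 for a row of average length.
    length_norm = (1.0 - B) + B * length_ratio
    # Saturating function of tf; it approaches K1 + 1 as tf grows.
    return tf * (K1 + 1.0) / (tf + K1 * length_norm)


counts = [1, 2, 5, 10, 50]
for K1 in (1.2, 100.0):
    weights = [bm25_tf(tf, K1=K1, B=0.8) for tf in counts]
    # Show each weight relative to the weight of a single occurrence.
    ratios = [round(w / weights[0], 2) for w in weights]
    print(f"K1={K1}: {ratios}")
```

With a small K1 the weight saturates quickly (it can never exceed K1 + 1), while with K1=100 it grows almost linearly with the raw count over this range, so the weighting behaves much closer to plain counts times IDF.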

fsroque avatar Jun 22 '18 08:06 fsroque

I set K1 that way for the last.fm example where it works pretty well, but I agree it is probably a poor default value in general.

I'm going to leave this open for now. One thing I want to do is get better default values for all parameters (across models) by running experiments on multiple datasets and selecting defaults that usually work well. K1 etc. would be among them.

benfred avatar Jun 26 '18 18:06 benfred

Any update on this yet, @benfred? Also, can anyone point me to the literature on BM25 weights?

christopheralex avatar Jan 12 '23 00:01 christopheralex

> Any update on this yet, @benfred? Also, can anyone point me to the literature on BM25 weights?

I think you can refer to this blog post to understand BM25 more clearly: https://kmwllc.com/index.php/2020/03/20/understanding-tf-idf-and-bm-25/

singsinghai avatar Feb 15 '23 11:02 singsinghai