jekyll-tagging-related_posts
Request to change how relevance is calculated
Please consider the following example:
post | tags | count | match | score | %
---|---|---|---|---|---
my-post | cat dog | 2 | | |
post-1 | ant bee cat cow | 4 | 1 | 0.1250 | 12.5%
post-2 | ant bee cat cow dog eel fox goat | 8 | 2 | 0.2500 | 25.0%
post-3 | cat | 1 | 1 | 0.5000 | 50.0%
post-4 | dog | 1 | 1 | 0.5000 | 50.0%
post-5 | bee fox | 2 | 0 | 0.0000 | 0.0%
post-6 | cat dog | 2 | 2 | 1.0000 | 100.0%
post-7 | | 0 | 0 | 0.0000 | 0.0%
post-8 | cat cow | 2 | 1 | 0.2500 | 25.0%
post-9 | ant cat dog | 3 | 2 | 0.6667 | 66.7%
post-10 | ant bee cat cow eel fox goat | 7 | 1 | 0.0714 | 7.1%
post-11 | ant bee cat dog eel fox goat | 7 | 2 | 0.2857 | 28.6%
post-12 | cow | 1 | 0 | 0.0000 | 0.0%
I want to calculate the relevance (score) of each post relative to my-post, which has 2 tags: cat and dog.
What I can get very easily, and what your plugin already computes, is:

post.count = number of tags for that post

post.match = number of tags that post shares with my-post (that is your current score, if I'm not mistaken)
I would like to make the score more relevant by adding a basic calculation. At the moment you simply use the number of matching tags, which can be misleading when the matching tags are only a small fraction of that post's total tags.
Consider post-10 and post-3 from the table above. Counting only matching tags, both are equally relevant, since each has 1 matching tag. In practice that is not true: post-3 is much more relevant because its only tag matches my-post, while post-10 has 7 tags and only 1 of them matches. So post-10 should clearly be less relevant.
With my calculation, post-3 gets a score of 0.5 (50%), which ranks it above post-10 with a score of 0.0714 (7.1%).
The calculation is very simple:

    if my-post.count is 0 // avoid division by 0
        there are no relevant posts so set all scores to 0 and exit
    endif
    for each post do
        if post.count is 0 // avoid division by 0
            score = 0
        else
            score = ( post.match / post.count ) * ( post.match / my-post.count )
        endif
    endfor

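
To make the pseudocode above concrete, here is a minimal sketch in Python (not Ruby, and not taken from the plugin's code). The function name `relevance_score` is hypothetical, and treating tags as sets, so that duplicate tags count once, is my assumption.

```python
def relevance_score(post_tags, my_tags):
    """Score a candidate post against the reference post, from 0.0 to 1.0."""
    # Assumption: tags are treated as sets, so duplicates count once.
    post_tags, my_tags = set(post_tags), set(my_tags)
    # Avoid division by zero when either post has no tags.
    if not post_tags or not my_tags:
        return 0.0
    match = len(post_tags & my_tags)  # post.match
    # (post.match / post.count) * (post.match / my-post.count)
    return (match / len(post_tags)) * (match / len(my_tags))


my_post = ["cat", "dog"]
post_3 = ["cat"]
post_10 = ["ant", "bee", "cat", "cow", "eel", "fox", "goat"]

print(relevance_score(post_3, my_post))   # 0.5
print(relevance_score(post_10, my_post))  # roughly 0.0714
```

Both posts share exactly one tag with my-post, but the second factor is the same for both (1/2), so the difference comes entirely from the first factor: 1/1 for post-3 versus 1/7 for post-10.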
The result is a float between 0 and 1, where 0 means no relevance and 1 means an exact match (most relevant). To express the score as a percentage, just multiply it by 100.
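
As an illustration of how such scores could be used for ranking, the Python sketch below sorts a few rows from the table by score and prints them as percentages. The helper `score` and the variable names are hypothetical; the count and match values are copied from the table.

```python
def score(match, count, my_count):
    """Apply the proposed formula to precomputed count/match values."""
    if count == 0 or my_count == 0:  # avoid division by 0
        return 0.0
    return (match / count) * (match / my_count)


MY_COUNT = 2  # my-post has 2 tags: cat dog

# (post, count, match) rows copied from the table above
rows = [
    ("post-1", 4, 1),
    ("post-3", 1, 1),
    ("post-6", 2, 2),
    ("post-7", 0, 0),
    ("post-9", 3, 2),
    ("post-10", 7, 1),
]

# Highest score first.
ranked = sorted(
    ((name, score(match, count, MY_COUNT)) for name, count, match in rows),
    key=lambda item: item[1],
    reverse=True,
)

for name, value in ranked:
    print(f"{name}: {value * 100:.1f}%")
```

With these rows the order comes out as post-6, post-9, post-3, post-1, post-10, post-7, whereas counting matching tags alone would tie post-1, post-3 and post-10.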
Please consider modifying your excellent plugin, as I would really like to have more relevant tags for my project. I would fork and modify the plugin myself, but I don't "speak" Ruby, so I have no idea how to implement this. Thanks.