SimMetrics.Net

NeedlemanWunch always >= 0.5

Open nsulikowski opened this issue 9 months ago • 2 comments

NeedlemanWunch("aaa", "bbb") returns 0.5. In fact, this function always seems to return a value greater than or equal to 0.5.

static NeedlemanWunch needleman_wunch = null;

public static double NeedlemanWunch(string firstWord, string secondWord)
{
    if (needleman_wunch == null) needleman_wunch = new NeedlemanWunch();
    return needleman_wunch.GetSimilarity(firstWord, secondWord);
}

ChatGPT found this: https://chatgpt.com/share/67e58f6b-61d0-8008-bb78-6169a1cbf7a6

nsulikowski, Mar 27 '25 17:03

As I understand it, for "aaa" and "bbb" with a mismatch penalty of -1, the total cost would be 3, because all three characters mismatch.

However, the NeedlemanWunch logic in this code seems to normalize that result. The value would then be 1 - 3 / 6 = 0.5, where 6 is the maximum total cost (the case where 3 gaps are added).
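A minimal sketch of that arithmetic, assuming a cost of 0 for a match, 1 for a mismatch, and 2 for a gap (the values the 1 - 3 / 6 calculation implies). `NeedlemanWunschSketch` and its members are hypothetical names used for illustration; this is not the SimMetrics.Net source.

```csharp
using System;

// Hypothetical helper for illustration; not the SimMetrics.Net implementation.
public static class NeedlemanWunschSketch
{
    // Assumed costs: 0 for a match, 1 for a mismatch, 2 for a gap.
    private const double MismatchCost = 1.0;
    private const double GapCost = 2.0;

    // Global alignment cost, computed with the standard dynamic-programming table.
    public static double Distance(string a, string b)
    {
        var d = new double[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i * GapCost;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j * GapCost;

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                // Either align the two characters, or open a gap in one of the strings.
                double substitution = d[i - 1, j - 1] + (a[i - 1] == b[j - 1] ? 0.0 : MismatchCost);
                double gap = Math.Min(d[i - 1, j], d[i, j - 1]) + GapCost;
                d[i, j] = Math.Min(substitution, gap);
            }
        }
        return d[a.Length, b.Length];
    }

    // Normalizes by the worst cost per character, here max(MismatchCost, GapCost) = 2.
    public static double Similarity(string a, string b)
    {
        double maxCost = Math.Max(a.Length, b.Length) * Math.Max(MismatchCost, GapCost);
        return maxCost == 0 ? 1.0 : 1.0 - Distance(a, b) / maxCost;
    }
}
```

With these assumed costs, `NeedlemanWunschSketch.Similarity("aaa", "bbb")` evaluates to 1 - 3 / 6 = 0.5. In this sketch, two strings of equal length can never score below 0.5: replacing every character costs at most 1 per position, while the normalizer allows 2 per position.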

Does this make sense?

StefH, Mar 27 '25 21:03

I'm not sure about the algorithm, as I haven't spent time learning it. However, I can tell you that the code https://github.com/soenneker/soenneker.utils.string.needlemanwunsch returns 0 for the similarity between "aaa" and "bbb", and 66.66 for "aaa" and "aab" on a scale from 0 to 100. This makes sense to me.
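For comparison, here is a sketch of one scoring scheme that happens to produce those two numbers. It assumes unit costs (match 0, mismatch 1, gap 1) and divides by the length of the longer string. Whether the linked library works this way has not been checked; `UnitCostAlignmentSketch` and its members are hypothetical names.

```csharp
using System;

// Hypothetical helper for illustration; not taken from either library.
public static class UnitCostAlignmentSketch
{
    // Alignment cost with assumed unit costs: match 0, mismatch 1, gap 1.
    public static int Distance(string a, string b)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int substitution = d[i - 1, j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
                int gap = Math.Min(d[i - 1, j], d[i, j - 1]) + 1;
                d[i, j] = Math.Min(substitution, gap);
            }
        }
        return d[a.Length, b.Length];
    }

    // Score on a 0 to 100 scale; the worst case is one unit of cost
    // per character of the longer string.
    public static double Score(string a, string b)
    {
        int maxLen = Math.Max(a.Length, b.Length);
        return maxLen == 0 ? 100.0 : 100.0 * (1.0 - (double)Distance(a, b) / maxLen);
    }
}
```

Under these assumptions, `UnitCostAlignmentSketch.Score("aaa", "bbb")` is 0 and `Score("aaa", "aab")` is 100 * (1 - 1 / 3), roughly 66.7. The difference from a result of 0.5 comes down to the normalizer: dividing by the mismatch cost per character lets a full mismatch reach 0, while dividing by a larger gap cost does not.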

nsulikowski, Mar 28 '25 01:03