FlagEmbedding

Provide a normalized algorithm for computing the lexical similarity score

Open IcyTide opened this issue 11 months ago • 1 comments

Identical sentences always get a similarity score of 1, as with the dense method, rather than a score below 1 that varies with the sentence content.
Different sentences get a more even distribution of similarity scores.
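
For readers without the diff at hand, here is a minimal sketch of one way to get these properties. It is not necessarily the algorithm in this PR. It assumes lexical weights are plain `dict`s mapping token to weight, and that the original score is the sum of weight products over shared tokens. The function names `lexical_score` and `normalized_lexical_score` are hypothetical and not part of the FlagEmbedding API. The normalization is cosine-style: divide by the geometric mean of the two self-scores.

```python
import math


def lexical_score(q, d):
    """Unnormalized score (assumed form): sum of weight products over shared tokens."""
    return sum(w * d[t] for t, w in q.items() if t in d)


def normalized_lexical_score(q, d):
    """Cosine-style normalization (an assumption, not the PR's confirmed method)."""
    denom = math.sqrt(lexical_score(q, q) * lexical_score(d, d))
    # Guard against empty or all-zero weight dicts.
    return lexical_score(q, d) / denom if denom > 0 else 0.0


# Illustrative weights only; real values would come from the model.
a = {"what": 0.1, "is": 0.05, "bm25": 0.3}
b = {"define": 0.2, "bm25": 0.25}

print(lexical_score(a, a))             # below 1 and dependent on the weights
print(normalized_lexical_score(a, a))  # 1 up to floating-point rounding
print(normalized_lexical_score(a, b))  # strictly between 0 and 1
```

Because the denominator depends on each document's own weights, this kind of normalization can reorder candidates relative to the unnormalized score.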

IcyTide avatar Mar 22 '24 03:03 IcyTide

Results for identical sentences: (screenshot)

Results for different sentences: (screenshot)

IcyTide avatar Mar 22 '24 04:03 IcyTide

Thanks for your contribution! This method may change the ranking list, so we need some time to conduct experiments to evaluate its performance.

staoxiao avatar Mar 24 '24 07:03 staoxiao

> Thanks for your contribution! This method may change the ranking list, so we need some time to conduct experiments to evaluate its performance.

This approach may be more explainable for applications than the original method, so it could be offered as an additional method rather than a replacement for the original one (depending on the results of your experiments).

IcyTide avatar Mar 25 '24 01:03 IcyTide