Score gets bigger than 1, threshold not functional

Open marconett opened this issue 9 months ago • 1 comments

Describe the bug

According to the docs (https://docs.orama.com/open-source/usage/search/introduction#what-does-the-search-method-return), score should be between 0 and 1.

Using the example data from the threshold doc (https://docs.orama.com/open-source/usage/search/threshold), scores are between 0 and 1 and filtering results based on thresholds works.

But with the data I am working with, the score gets larger than 1, which also makes the threshold useless.

As an example, I used the stopwords from this library to show this behavior.

To Reproduce

import { create, insertMultiple, search } from "@orama/orama"
import { stopwords } from '@orama/stopwords/english'

const db = create({
  schema: {
    title: 'string',
  },
})

const getRandomWord = () => ' ' + stopwords[Math.floor(Math.random() * stopwords.length)];

insertMultiple(db, [
  ...stopwords.map(word => ({ title: word })),
  ...stopwords.map(word => ({ title: word + getRandomWord() }))
]);

const result = search(db, {
  term: 'her',
  threshold: 0,
});

console.log(result.hits.map(hit => {
  return {
    title: hit.document.title,
    score: hit.score,
  }
}));

Output:

[
  { title: 'hers', score: 6.584305791656615 },
  { title: 'herself', score: 6.584305791656615 },
  { title: "here's", score: 6.584305791656615 },
  { title: 'here', score: 6.227383875508277 },
  { title: 'her', score: 5.9423872295572 },
  { title: 'her her', score: 5.9423872295572 },
  { title: 'herself which', score: 3.70476794591908 },
  { title: "here's your", score: 3.70476794591908 },
  { title: "how's hers", score: 3.70476794591908 },
  { title: 'before herself', score: 3.70476794591908 }
]

Expected behavior

  • Score should be between 0 and 1.
  • Threshold should work as documented.

Environment Info

OS: MacOS 15.3.2
Node: 18.20.5
Orama: 3.1.2

Affected areas

Search

Additional context

No response

marconett Mar 12 '25 13:03

Hi @marconett, we just released Orama v3.1.6 with a fix for the threshold. Would you mind testing whether your issue is solved? About scores being > 1: I should probably update the docs. We could rescale the scores to be between 0 and 1, but I'm not sure what the advantage would be from a technical standpoint. Please tell me if I missed something!

micheleriva Apr 15 '25 15:04

FYI @micheleriva the docs still need to be updated

omonk Aug 06 '25 11:08
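
For readers who need scores in the 0 to 1 range in the meantime, the rescaling mentioned in the thread can be done on the caller's side. The following is a minimal sketch, not part of @orama/orama: `normalizeScores` and the `normalizedScore` field are hypothetical names, and dividing by the highest score in the result set is only one possible rescaling. The sample hits are copied from the output reported above.

```javascript
// Hypothetical helper, NOT an Orama API: rescales hit scores into [0, 1]
// by dividing each score by the highest score in the same result set.
function normalizeScores(hits) {
  if (hits.length === 0) return [];
  const maxScore = Math.max(...hits.map(hit => hit.score));
  // Guard against division by zero when no hit has a positive score.
  if (maxScore <= 0) return hits.map(hit => ({ ...hit, normalizedScore: 0 }));
  return hits.map(hit => ({ ...hit, normalizedScore: hit.score / maxScore }));
}

// Sample hits taken from the output reported in this issue.
const sampleHits = [
  { document: { title: 'hers' }, score: 6.584305791656615 },
  { document: { title: 'her' }, score: 5.9423872295572 },
  { document: { title: 'herself which' }, score: 3.70476794591908 },
];

const normalized = normalizeScores(sampleHits);
console.log(normalized.map(hit => ({
  title: hit.document.title,
  score: hit.normalizedScore,
})));
```

Scores rescaled this way are relative to the best hit of a single query, so they are not comparable across different queries, and they say nothing about how Orama's own `threshold` option is evaluated.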