orama
Relevance of schema fields
Is your feature request related to a problem? Please describe.
It would be great if the schema could allow specifying the relevance of individual fields. For example, if you have a schema with a title and a content field, then in many cases a match on the title implies that the result is more relevant than others where the keyword only matches the content.
Describe the solution you'd like
Here is an example API that makes title 10x more important for ranking than content:
const db = create({
  schema: {
    content: { relevance: 10, type: 'string' },
    title: { relevance: 100, type: 'string' },
  },
});
Describe alternatives you've considered
One could issue multiple calls to search, one for each property, and then manually merge the results and rank them. That's a lot of work!
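To make that workaround concrete, here is a rough sketch of the manual merge step. It assumes each per-property search returns hits shaped like { id, score }, which is an assumption for illustration and not necessarily what search returns; mergeWeightedHits and the weights object are hypothetical names.

```javascript
// Hypothetical helper: merge per-property hit lists into one ranked list.
// `hitsByProperty` maps a property name to the hits of a search restricted
// to that property. Each hit is ASSUMED to look like { id, score }.
function mergeWeightedHits(hitsByProperty, weights) {
  const totals = new Map();
  for (const [property, hits] of Object.entries(hitsByProperty)) {
    // Properties without an explicit weight count once.
    const weight = weights[property] ?? 1;
    for (const { id, score } of hits) {
      totals.set(id, (totals.get(id) ?? 0) + score * weight);
    }
  }
  // Highest combined score first.
  return [...totals.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Made-up hits standing in for two separate search calls.
const merged = mergeWeightedHits(
  {
    title: [{ id: 'a', score: 1 }],
    content: [{ id: 'a', score: 1 }, { id: 'b', score: 3 }],
  },
  { title: 100, content: 10 },
);
```

Even in this small form the caller has to run one search per property and keep the weights somewhere, which is the overhead the feature request wants to avoid.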
If this feature is taken under consideration, I suggest that specifying the relevance at search time, in the properties field, is a better option. For example:
const searchResult = search(db, {
  term: 'foo',
  properties: [{ name: 'content', relevance: 10 }, { name: 'title', relevance: 100 }]
});
For comparison's sake, this is the current way of using the properties field during the search:
const searchResult = search(db, {
  term: 'foo',
  properties: ['content', 'title']
});
This will keep the schema simple and let us change the relevancy of properties per search.
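If both forms of the properties field were accepted, they could be normalized to one shape before scoring. This is only a sketch of that idea: normalizeProperties and the default relevance of 1 are assumptions for illustration, not part of the library.

```javascript
// Hypothetical helper: accept both plain property names and
// { name, relevance } objects, and return the object form for all entries.
function normalizeProperties(properties, defaultRelevance = 1) {
  return properties.map((entry) =>
    typeof entry === 'string'
      ? { name: entry, relevance: defaultRelevance }
      : { name: entry.name, relevance: entry.relevance ?? defaultRelevance },
  );
}

// A mixed list: 'content' gets the default, 'title' keeps its own value.
const normalized = normalizeProperties(['content', { name: 'title', relevance: 100 }]);
```

With a step like this, existing calls that pass plain strings would keep working unchanged.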
Great idea @iShibi, I like that proposal!
@cpojer, I think @iShibi's solution will be doable once we get https://github.com/LyraSearch/lyra/pull/63 (or a similar plugin) merged. We could easily use a relevance value from 0.01 to <1 as a multiplication factor for TF-IDF scores.
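As a rough illustration of the multiplication-factor idea, the sketch below computes a simple TF-IDF score per field and scales it by that field's relevance. It is not the code from the linked PR; the tokenizer, the smoothed IDF variant, and the scoreDocuments name are all assumptions.

```javascript
// Term frequency: occurrences of `term` divided by the number of tokens.
function termFrequency(term, text) {
  const tokens = text.toLowerCase().split(/\W+/).filter(Boolean);
  if (tokens.length === 0) return 0;
  return tokens.filter((token) => token === term).length / tokens.length;
}

// Smoothed inverse document frequency (one common variant, chosen here
// only for illustration).
function inverseDocumentFrequency(term, texts) {
  const containing = texts.filter((text) => termFrequency(term, text) > 0).length;
  return Math.log((1 + texts.length) / (1 + containing)) + 1;
}

// Hypothetical scorer: sum of per-field TF-IDF, each scaled by the
// field's relevance factor.
function scoreDocuments(term, docs, relevance) {
  const needle = term.toLowerCase();
  const scores = docs.map(() => 0);
  for (const [field, factor] of Object.entries(relevance)) {
    const texts = docs.map((doc) => doc[field] ?? '');
    const idf = inverseDocumentFrequency(needle, texts);
    texts.forEach((text, i) => {
      scores[i] += termFrequency(needle, text) * idf * factor;
    });
  }
  return scores;
}

// Made-up documents: 'foo' is in the title of the first and the content of the second.
const docs = [
  { title: 'foo', content: 'bar baz' },
  { title: 'bar', content: 'foo baz' },
];
const scores = scoreDocuments('foo', docs, { title: 0.9, content: 0.1 });
```

With these factors the title match in the first document outranks the content match in the second, which is the behavior the original request asks for.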
Update: we have the first alpha version for a TF-IDF plugin: https://github.com/LyraSearch/plugin-token-relevance
Update: we've moved the plugin into the core of Lyra. Would appreciate reviews here: https://github.com/LyraSearch/lyra/pull/169
Released in v0.4.2