Upgrading to v3.0.4 Causing Search Failures

Open brajabi opened this issue 11 months ago • 1 comments

Describe the bug

After upgrading from Orama v2.1.0 to v3.0.4, the following issues are occurring:

1. Search functionality is not working as expected
2. Incorrect search results are being returned
3. Potential regression in core functionality

Is the issue caused by my configuration and initialization?

To Reproduce

Initialize Orama v2.1.0 with 390K movie names and types.

**Version: v2.1.0**

terms: the day of the jackal totalTime: 83ms

results: searchResult 
{
  elapsed: {
    raw: 15157208,
    formatted: "15ms",
  },
  hits: [
    {
      id: "127369",
      score: 25.30659695357677,
      document: {
        name: "Substance 2019",
        type: "movie",
        id: "127369",
        alternative_titles: [],
      },
    }, {
      id: "178804",
      score: 25.30659695357677,
      document: {
        name: "Substance 2017",
        type: "movie",
        id: "178804",
        alternative_titles: [],
      },
    }, {
      id: "231633",
      score: 25.30659695357677,
      document: {
        name: "Substance 2014",
        type: "movie",
        id: "231633",
        alternative_titles: [],
      },
    }, {
      id: "422606",
      score: 25.30659695357677,
      document: {
        name: "The Substance 2024",
        type: "movie",
        id: "422606",
        alternative_titles: [],
      },
    }, {
      id: "201079",
      score: 18.301960176081185,
      document: {
        name: "Transworld - Substance 2016",
        type: "movie",
        id: "201079",
        alternative_titles: [],
      },
    }
  ],
  count: 13,
}

terms: the substance totalTime: 17ms

searchResult {
  elapsed: {
    raw: 1301318083,
    formatted: "1s",
  },
  hits: [
    {
      id: "419171",
      score: 40.66199399138986,
      document: {
        name: "The Fall Guy 2024",
        type: "movie",
        id: "419171",
        alternative_titles: [],
      },
    }, {
      id: "250667",
      score: 34.519390266160066,
      document: {
        name: "The Fall Guys 2012",
        type: "movie",
        id: "250667",
        alternative_titles: [],
      },
    }
  ],
  count: 2,
}

terms: The.Fall.Guy.2 totalTime: 1306ms

searchResult {
  elapsed: {
    raw: 866501333,
    formatted: "866ms",
  },
  hits: [
    {
      id: "419171",
      score: 40.66199399138986,
      document: {
        name: "The Fall Guy 2024",
        type: "movie",
        id: "419171",
        alternative_titles: [],
      },
    }, {
      id: "250667",
      score: 34.519390266160066,
      document: {
        name: "The Fall Guys 2012",
        type: "movie",
        id: "250667",
        alternative_titles: [],
      },
    }
  ],
  count: 2,
}

**Version: v3.0.4 (after upgrade)**

terms: the day of the jackal totalTime: 118ms

searchResult {
  elapsed: {
    raw: 57279792,
    formatted: "57ms",
  },
  hits: [
    {
      id: "40319",
      score: 30.504768852275934,
      document: {
        name: "Daybreakers 2009",
        type: "movie",
        id: "40319",
        alternative_titles: [],
      },
    }, {
      id: "67293",
      score: 30.504768852275934,
      document: {
        name: "Dayories 2019",
        type: "tv",
        id: "67293",
        alternative_titles: [],
      },
    }, {
      id: "78201",
      score: 30.504768852275934,
      document: {
        name: "Dayaw 2015",
        type: "tv",
        id: "78201",
        alternative_titles: [],
      },
    }, {
      id: "102942",
      score: 30.504768852275934,
      document: {
        name: "Daysteps 2020",
        type: "movie",
        id: "102942",
        alternative_titles: [ "Päeva sammud" ],
      },
    }, {
      id: "131758",
      score: 30.504768852275934,
      document: {
        name: "Dayroom 2019",
        type: "movie",
        id: "131758",
        alternative_titles: [],
      },
    }
  ],
  count: 3215,
}

terms: The.Fall.Guy.2 totaltime: 2479ms

searchResult {
  elapsed: {
    raw: 2475726542,
    formatted: "2s",
  },
  hits: [
    {
      id: "423893",
      score: 42.12362013450947,
      document: {
        name: "2073 2024",
        type: "movie",
        id: "423893",
        alternative_titles: [],
      },
    }, {
      id: "26102",
      score: 42.003348994277964,
      document: {
        name: "2359 2011",
        type: "movie",
        id: "26102",
        alternative_titles: [ "23:59" ],
      },
    }, {
      id: "419523",
      score: 41.00526590045601,
      document: {
        name: "プロフェッショナル 仕事の流儀 ジブリと宮崎駿の2399日 2023",
        type: "movie",
        id: "419523",
        alternative_titles: [],
      },
    }, {
      id: "14735",
      score: 40.60424280622987,
      document: {
        name: "2DTV 2001",
        type: "tv",
        id: "14735",
        alternative_titles: [],
      },
    }, {
      id: "101009",
      score: 40.60424280622987,
      document: {
        name: "24Seven 2001",
        type: "tv",
        id: "101009",
        alternative_titles: [],
      },
    }
  ],
  count: 337833,
}

query: Spider-Man totaltime: 306ms

searchResult {
  elapsed: {
    raw: 101848417,
    formatted: "101ms",
  },
  hits: [
    {
      id: "305673",
      score: 42.07778447860593,
      document: {
        name: "Manuela & Manuel 2007",
        type: "movie",
        id: "305673",
        alternative_titles: [ "Manuela y Manuel" ],
      },
    }, {
      id: "19241",
      score: 39.11599215941477,
      document: {
        name: "Spider-Man (Spiderman) 2002",
        type: "movie",
        id: "19241",
        alternative_titles: [ "Spider-Man" ],
      },
    }, {
      id: "275591",
      score: 39.03097581422258,
      document: {
        name: "Mani Mangalsutra 2010",
        type: "movie",
        id: "275591",
        alternative_titles: [ "मणी मंगळसूत्र" ],
      },
    }, {
      id: "106229",
      score: 36.97952442330601,
      document: {
        name: "Mandibles 2020",
        type: "movie",
        id: "106229",
        alternative_titles: [ "Mandibules" ],
      },
    }, {
      id: "82404",
      score: 35.274289151838246,
      document: {
        name: "Maneater Manhunt 2012",
        type: "tv",
        id: "82404",
        alternative_titles: [],
      },
    }
  ],
  count: 5421,
}

Expected behavior

I expected the results to be the same as in v2.
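One way to make this expectation checkable is to compare the top hit IDs returned by the two versions for the same query. The following is a minimal sketch, not part of Orama: `SearchResultLike` and `missingFromTop` are hypothetical names, and the IDs are copied from the `The.Fall.Guy.2` logs above.

```typescript
// Minimal shape of a search result, mirroring the `hits[].id` fields in the logs.
// This is a hypothetical helper type, not an Orama type.
interface SearchResultLike {
  hits: { id: string }[];
}

// Returns the IDs found in the baseline's top N hits that are absent from
// the candidate's top N hits. An empty array means nothing went missing.
function missingFromTop(
  baseline: SearchResultLike,
  candidate: SearchResultLike,
  topN: number
): string[] {
  const candidateIds = candidate.hits.slice(0, topN).map((hit) => hit.id);
  return baseline.hits
    .slice(0, topN)
    .map((hit) => hit.id)
    .filter((id) => candidateIds.indexOf(id) === -1);
}

// Hit IDs taken from the "The.Fall.Guy.2" results reported for each version.
const v2Result: SearchResultLike = {
  hits: [{ id: "419171" }, { id: "250667" }],
};
const v3Result: SearchResultLike = {
  hits: [
    { id: "423893" },
    { id: "26102" },
    { id: "419523" },
    { id: "14735" },
    { id: "101009" },
  ],
};

// Both v2 hits are missing from the v3 top 5.
console.log(missingFromTop(v2Result, v3Result, 5));
```

Running the same comparison for each query in the report would give a small regression check to rerun against later releases.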

Environment Info

OS: macOS 15.2
Bun v1.1.43
Orama v3.0.4

Affected areas

Search

Additional context

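// Assumed imports (not shown in the original report); they follow the
// packages Orama documents for English stemming and stop words.
import { create, search } from '@orama/orama'
import { stemmer, language } from '@orama/stemmers/english'
import { stopwords as englishStopwords } from '@orama/stopwords/english'

// Index creation
this.movieDB = await create({
  schema: {
    name: 'string',
    type: 'string',
    id: 'string',
    alternative_titles: 'string[]',
  },
  components: {
    tokenizer: {
      stemming: true,
      language,
      stemmer,
      // 'type' and 'id' are stored as-is and not tokenized
      tokenizeSkipProperties: ['type', 'id'],
      stopWords: englishStopwords,
    },
  },
})

// and searching like this:

await search(this.movieDB, {
  term: query, // user-entered search string
  limit: take, // maximum number of hits to return
  properties: ['name', 'alternative_titles'],
  boost: {
    name: 2,
    alternative_titles: 0.5,
  },
})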

brajabi • Jan 14 '25 15:01

That looks like a tokenization issue. Will look into that (cc. @faustoq, @matijagaspar)

micheleriva • Feb 13 '25 17:02
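To illustrate how a tokenization difference could produce results like the ones reported, here is a toy sketch. It is not Orama's tokenizer or scoring code. It assumes the query is split on non-alphanumeric characters and that each token matches any indexed word it is a prefix of; `tokenize`, `prefixMatchAny`, and `prefixMatchAll` are hypothetical names, and the titles are taken from the logs in the report.

```typescript
// Toy model only: NOT Orama's implementation.
// Assumption: text is lowercased and split on anything that is not a-z or 0-9
// (ASCII only, for simplicity).
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// True when `prefix` is a prefix of `word`.
function isPrefix(prefix: string, word: string): boolean {
  return word.slice(0, prefix.length) === prefix;
}

// OR semantics: a title matches if ANY query token prefixes one of its words.
function prefixMatchAny(query: string, titles: string[]): string[] {
  const queryTokens = tokenize(query);
  return titles.filter((title) => {
    const words = tokenize(title);
    return queryTokens.some((q) => words.some((w) => isPrefix(q, w)));
  });
}

// AND semantics: a title matches only if EVERY query token prefixes some word.
function prefixMatchAll(query: string, titles: string[]): string[] {
  const queryTokens = tokenize(query);
  return titles.filter((title) => {
    const words = tokenize(title);
    return queryTokens.every((q) => words.some((w) => isPrefix(q, w)));
  });
}

// Titles copied from the search logs in the report.
const titles = [
  "Spider-Man (Spiderman) 2002",
  "Manuela & Manuel 2007",
  "Mandibles 2020",
  "The Fall Guy 2024",
  "2073 2024",
];

console.log(tokenize("Spider-Man")); // ["spider", "man"]
console.log(prefixMatchAny("Spider-Man", titles)); // also matches the "Man..." titles
console.log(prefixMatchAll("Spider-Man", titles)); // only the Spider-Man title
console.log(prefixMatchAny("The.Fall.Guy.2", titles).length); // 5: "2" prefixes every year
```

Under these assumptions, a short token such as `man` or `2` is enough to pull in unrelated titles when any single token may match. That would be consistent with the large `count` values in the v3.0.4 logs, though the actual cause in Orama may differ.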

Closed as fixed in v3.1.11

micheleriva • Aug 29 '25 01:08