meilisearch-js-plugins

Add 'fullyHighlighted', 'matchLevel' and 'matchedWords' to '_highlightResult' in autocomplete client

Open sgilberg opened this issue 1 year ago • 8 comments

Description

In the hits returned from getAlgoliaResults, the _highlightResult object includes not just the value of the attribute but also three properties that help to interpret the match: fullyHighlighted, matchLevel, and matchedWords. The getMeilisearchResults hits, in contrast, only include the value. I would like to request the addition of the other three properties, if at all possible.
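For reference, the per-attribute shape being requested can be written down as a type. This is a sketch of the Algolia-style entry as described above; the type name HighlightResultEntry and the sample values are illustrative only, and the <em> tags assume default highlight tags.

```typescript
// Illustrative type name; the four fields are the ones named in this issue.
type HighlightResultEntry = {
  value: string // attribute value with highlight tags inserted
  matchLevel: 'none' | 'partial' | 'full' // how many query words matched
  matchedWords: string[] // query words found in this attribute
  fullyHighlighted?: boolean // true when the whole value is highlighted
}

// Made-up sample entry for a hit whose full_name matched the whole query.
const example: { full_name: HighlightResultEntry } = {
  full_name: {
    value: '<em>Joseph</em>',
    matchLevel: 'full',
    matchedWords: ['joseph'],
    fullyHighlighted: true,
  },
}
```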

Basic example

We are attempting to migrate from Algolia to Meilisearch, but this is a sticking point for us. A real-world example where these properties come into play is one we have frequently: an index of entities that each have a primary name and an array of synonyms, both of which are searchable attributes. In the autocomplete results display, we would like to show just the highlighted name if it is a full match, and if not, we add on the best-matched synonym (showing a fully-highlighted synonym match if one is found, a synonym with matchLevel: full if not, and finally the synonym with the most matched words if no full match level is available).

Here's a real example of what this looks like for us using Algolia's results (showing only the templates definition here for brevity):

templates: {
    item({ item, components, html}) {
        if (item._highlightResult.full_name.matchLevel !== 'full' && item._highlightResult.synonyms) {
            var synonym_match = item._highlightResult.synonyms.find(function(synonym) {
                return synonym.matchLevel === 'full' && synonym.fullyHighlighted;
            });
            if (synonym_match === undefined) {
                synonym_match = item._highlightResult.synonyms.find(function(synonym) {
                    return synonym.matchLevel === 'full';
                });
                if (synonym_match === undefined && item._highlightResult.full_name.matchLevel === 'none') {
                    var synonyms = item._highlightResult.synonyms.filter(function (synonym) {
                        return synonym.matchLevel !== 'none';
                    }).sort(function (a, b) {
                        return b.matchedWords.length - a.matchedWords.length;
                    });
                    synonym_match = synonyms[0];
                }
            }
            if (synonym_match !== undefined) {
                synonym_match._highlightResult = { // Workaround to get the highlight component to work (bonus points if you can improve upon Algolia's current default)
                    name: {
                        value: synonym_match.value
                    }
                };
                return html`<span>${components.Highlight({
                    hit: item,
                    attribute: 'full_name',
                })} <span class="text-muted">- synonym match: ${components.Highlight({
                    hit: synonym_match,
                    attribute: 'name',
                })}</span></span>`;
            }
        }

        return html`<span>${components.Highlight({
            hit: item,
            attribute: 'full_name',
        })}</span>`;
    },
},

sgilberg avatar Oct 29 '24 21:10 sgilberg

Hello @sgilberg, thanks for opening an issue.

This seems like a welcome addition for the autocomplete client. I will start taking a look at this :)

Strift avatar Dec 19 '24 11:12 Strift

Hey, I released a new version that includes the highlights metadata. Let me know if it works for you :)

Strift avatar Dec 27 '24 10:12 Strift

Hi @Strift thank you so much for working on this! This is partly working for me, and partly not. It works pretty well for attributes that are one-to-one, and for nested arrays that don't have keys, but I'm getting errors on indexes that have nested arrays with keys (even if those arrays aren't searchable attributes).

Here's an example record to illustrate what I mean:

{
   "id": 1,
   "name": "Joseph",
   "nicknames": [
      "Joe",
      "Joey"
   ],
   "family_members": [
       {
           "relationship": "mother",
           "name": "Susan"
       },
       {
           "relationship": "father",
           "name": "John"
       }
   ]
}

In the above, the "family_members" style of data breaks the integration ("Cannot read properties of undefined (reading 'replace')" at calculateHighlightMetadata). Indexes that only have fields like id, name, or nicknames work fine.

I think there might be two parts to the issue: one is that these highlight calculations seem to be done on all fields, regardless of whether they are meant to be searchable or highlighted or not, which at best is overkill and at worst might impact performance or cause errors as in my case. The other is that, in cases where such data should be searchable, we'd want that calculation to work on those nested properties too.
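The second point, handling nested values, could look roughly like the sketch below. This is not the code in fetchMeilisearchResults.ts: describeTree, describeString, and the Leaf and Tree types are hypothetical names, and the tags are assumed to be Meilisearch's default <em> and </em> rather than whatever the client actually configures. The idea is to recurse through arrays and keyed objects and to call string methods only on actual strings, so a shape like family_members never reaches replace with an undefined value.

```typescript
// Hypothetical sketch; names and shapes are illustrative, not the library's API.
type Leaf = { value: string; matchedWords: string[]; fullyHighlighted: boolean }
type Tree = Leaf | Tree[] | { [key: string]: Tree }

const PRE_TAG = '<em>' // assumed: Meilisearch's default highlightPreTag
const POST_TAG = '</em>' // assumed: Meilisearch's default highlightPostTag

function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

// Compute highlight metadata for a single formatted string.
function describeString(value: string): Leaf {
  const pattern = new RegExp(
    escapeRegExp(PRE_TAG) + '([\\s\\S]*?)' + escapeRegExp(POST_TAG),
    'g'
  )
  const matchedWords: string[] = []
  let match: RegExpExecArray | null = pattern.exec(value)
  while (match !== null) {
    matchedWords.push(match[1] ?? '')
    match = pattern.exec(value)
  }
  // Fully highlighted when nothing but whitespace remains outside the tags.
  const remainder = value.replace(pattern, '')
  return {
    value,
    matchedWords,
    fullyHighlighted: matchedWords.length > 0 && remainder.trim() === '',
  }
}

// Walk a formatted hit: strings become leaves, arrays and objects are recursed.
function describeTree(node: unknown): Tree {
  if (typeof node === 'string') return describeString(node)
  if (Array.isArray(node)) return node.map((item) => describeTree(item))
  if (node !== null && typeof node === 'object') {
    const source = node as { [key: string]: unknown }
    const result: { [key: string]: Tree } = {}
    for (const key of Object.keys(source)) {
      result[key] = describeTree(source[key])
    }
    return result
  }
  // Numbers, booleans, null, undefined: stringify instead of assuming a string.
  return describeString(node === null || node === undefined ? '' : String(node))
}
```

Restricting the walk to searchable or highlighted attributes (the first point) is left out here, since it depends on index settings that this sketch does not have access to.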

For what it's worth, I also noticed that the match level (full vs. partial) seems to be case-dependent, and I'm not sure if that's intentional (or if it's consistent with Algolia's behavior; I'd need to dig in on that). E.g. for the example record above, if I search "joseph" it shows as partial, but if I search "Joseph" it is full.
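One way to make the match level independent of case is to lowercase both sides before comparing. The sketch below is a hypothetical helper (computeMatchLevel is not part of @meilisearch/autocomplete-client), and it makes two assumptions: query words are split on whitespace, and any highlight that does not cover every query word counts as partial. Typo-tolerant or accent-insensitive matches are not handled.

```typescript
// Hypothetical helper; not the library's implementation.
type MatchLevel = 'none' | 'partial' | 'full'

function computeMatchLevel(query: string, matchedWords: string[]): MatchLevel {
  const normalize = (text: string): string => text.trim().toLowerCase()
  const queryWords = query
    .split(/\s+/)
    .map(normalize)
    .filter((word) => word.length > 0)
  const matched = matchedWords.map(normalize).filter((word) => word.length > 0)

  // No highlight at all (or an empty query) means no match.
  if (queryWords.length === 0 || matched.length === 0) return 'none'

  // Count query words that appear among the highlighted segments, ignoring case.
  const hits = queryWords.filter((word) => matched.indexOf(word) !== -1).length
  return hits === queryWords.length ? 'full' : 'partial'
}
```

With this comparison, searching "joseph" or "Joseph" against a highlighted "Joseph" gives the same level.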

sgilberg avatar Jan 02 '25 23:01 sgilberg

Hey, I managed to get it working for the basic cases, but I want to make the logic smart enough to handle recursive scenarios. But I'm not sure if it will actually be relevant.

@meilisearch/product-team How deep in objects, arrays, or arrays of objects on average do customers have "searchable" fields? If you can share some insights with me, it would allow me to prioritize this better. Thanks 🙏

Strift avatar Jan 15 '25 08:01 Strift

Hi @Strift, after updating @meilisearch/autocomplete-client to 0.5.0 or higher, we also get the above-mentioned error:

TypeError
Uncaught (in promise) TypeError: can't access property "replace", highlightValue is undefined
    calculateHighlightMetadata fetchMeilisearchResults.ts:116
    _highlightResult fetchMeilisearchResults.ts:75
    mapOneOrMany fetchMeilisearchResults.ts:145
    _highlightResult fetchMeilisearchResults.ts:74
    hits fetchMeilisearchResults.ts:70
    fetchMeilisearchResults fetchMeilisearchResults.ts:63
    fetchMeilisearchResults fetchMeilisearchResults.ts:55
    promise callback*fetchMeilisearchResults fetchMeilisearchResults.ts:53
    createMeilisearchRequester createMeilisearchRequester.ts:6
    execute createRequester.ts:109
    values resolve.js:79
    resolve resolve.js:71
    promise callback*onInput/request< onInput.js:81
    promise callback*onInput onInput.js:72
    onFocus getPropGetters.js:131
    eventProxy setProperties.js:30
    setIsModalOpen autocomplete.js:291
    current autocomplete.js:185
    onStateChange autocomplete.js:46
    onStateChange getDefaultProps.js:57
    onStoreStateChange createAutocomplete.js:34
    dispatch createStore.js:21
    setIsOpen2 getAutocompleteSetters.js:32
    open AutocompleteSearch.vue:58
    setup AutocompleteSearch.vue:83

> Hey, I managed to get it working for the basic cases, but I want to make the logic smart enough to handle recursive scenarios. But I'm not sure if it will actually be relevant.

In our project we use the Meilisearch Integration for Statamic, which adds entries/pages with the following document structure to Meilisearch:

Meilisearch document
{
  "id": "entry---<some-id>",
  "content": [
    {
      "type": "set",
      "attrs": {
        "id": "<some-id>",
        "values": {
          "text": "Lorem Ipsum",
          "type": "<some-type>",
          "<some-key>": "<some-value>",
          "<another-key>": "<some-value>"
        }
      }
    },
    {
      "type": "set",
      "attrs": {
        "id": "<some-id>",
        "values": {
          "<some-key>": "<some-value>",
          "<another-key>": [
            {
              "id": "<some-id>",
              "<some-key>": "<some-value>",
              "<another-key>": "<some-value>"
            }
          ]
        }
      }
    }
  ],
  "title": "<some-title>",
  "uri": "<some-uri>"
}

As Statamic is a CMS with configurable fields, the above structure is only an example and can probably be nested even more.

I can reproduce the error in your test suite by adding a structure like the example provided in the Meilisearch documentation to a movie:

Test change

https://github.com/meilisearch/meilisearch-js-plugins/blob/67c947836967311775f39e55f5fe2c70369470bf/packages/autocomplete-client/tests/test.utils.ts#L20-L29

{
  id: 2,
  title: 'Ariel',
  overview: "Taisto Kasurinen is a Finnish coal miner whose father has just committed suicide and who is framed for a crime he did not commit. In jail, he starts to dream about leaving the country and starting a new life. He escapes from prison but things don't go as planned...",
  genres: ['Drama', 'Crime', 'Comedy'],
  poster: 'https://image.tmdb.org/t/p/w500/ojDg0PGvs6R9xYFodRct2kdI6wC.jpg',
  release_date: 593395200,
+  cast: [
+    { "Jack Black": "Po" },
+    { "Jackie Chan": "Monkey" }
+  ],
},

With this change, running yarn test fails with TypeError: Cannot read properties of undefined (reading 'replace').

I'd argue this issue is quite relevant, as the changes in #1347 break existing integrations and can't handle the simple example given in the Meilisearch documentation. Happy to provide further details if required.

vintagesucks avatar Jan 29 '25 14:01 vintagesucks

@Strift Checking in on the status of this, for our own planning. Do you know if it will be prioritized in the near future, or if we should continue to integrate our hacky workaround?

sgilberg avatar Mar 28 '25 17:03 sgilberg

Hi there, we still want to fix this but I have little bandwidth to do so because I am currently busy updating the SDKs to be compatible with the latest Meilisearch versions.

I can't provide a timeline for now, but would gladly review any PRs aiming to tackle this.

Strift avatar Apr 01 '25 05:04 Strift

@Strift Thanks so much for the update. I'm not fluent in TypeScript or I would jump in to help...

sgilberg avatar Apr 04 '25 15:04 sgilberg