Poor matching of accented and upper/lowercase Greek characters; autocomplete results differ from search results

Open ioweb-gr opened this issue 1 year ago • 3 comments

Preconditions

Elastic Search 7.6.2

Magento Version : 2.4.1 CE

ElasticSuite Version :"2.10.10"

Environment : Developer

Third party modules : List of enabled modules: Amasty_Base Amasty_CronScheduleList Amasty_Geoip Amasty_GdprCookie Amasty_Gdpr Ced_VivaPayments Codazon_AjaxCartPro Codazon_Core Codazon_GoogleAmpManager Codazon_ImproveBundle Codazon_Lookbookpro Codazon_MegaMenu Codazon_ProductFilter Codazon_ThemeOptions Codazon_QuickShop Codazon_ShippingCostCalculator Codazon_Shopbybrandpro Codazon_Slideshow Codazon_ProductLabel Codazon_Utility Dotdigitalgroup_Email Dotdigitalgroup_Chat Dotdigitalgroup_Sms Flurrybox_EnhancedPrivacy Fooman_PrintOrderPdf Harrigo_EverCrumbs Ho_Templatehints Ioweb_AmastyGdprAddon Ioweb_Base Ioweb_CspWhitelist Ioweb_Customizer Ioweb_Label Ioweb_MariaDbHotfix Ioweb_OrderComments Ioweb_RestrictPayment Ioweb_RestrictShipping Ioweb_TaxDisplay Ioweb_Widgets MagePal_Core MagePal_EditOrderEmail Magefan_Community Magefan_Blog Magefan_WysiwygAdvanced Mageplaza_Core Mageplaza_GoogleRecaptcha Mageplaza_Multiflatrates Mageplaza_Smtp Mageplaza_SocialLogin Mageprince_Paymentfee Magonex_DateTime Nwdthemes_Base Nwdthemes_Revslider Plumrocket_Base Plumrocket_SocialLoginFree Smile_ElasticsuiteAdminNotification Smile_ElasticsuiteCore Smile_ElasticsuiteCatalog Smile_ElasticsuiteCatalogGraphQl Smile_ElasticsuiteCatalogRule Smile_ElasticsuiteCatalogOptimizer Smile_ElasticsuiteTracker Smile_ElasticsuiteThesaurus Smile_ElasticsuiteSwatches Smile_ElasticsuiteIndices Smile_ElasticsuiteAnalytics Smile_ElasticsuiteVirtualCategory Yosto_AddressAttribute Yosto_AttributeRelation Yosto_CustomerAddress Yosto_CustomerAttribute Yosto_OrderAttribute Yotpo_Yotpo

Steps to reproduce

  1. Create a few products with Greek names containing accented characters, e.g. κοριτσίστικο, κορίτσι, κοριτσάκι, etc.
  2. Search for the non-accented version, κοριτσι

Expected result

  1. Uppercase and lowercase queries should return similar results
  2. Accented and unaccented characters should be treated the same
  3. The autocomplete results should be the same as the actual search results
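
To make expectations 1 and 2 concrete, here is a minimal Python sketch of case- and accent-insensitive folding. It only illustrates the expected behaviour; it is not how Elasticsuite or Elasticsearch implements its analysis chain, and the function name `fold_for_matching` is invented for this example.

```python
import unicodedata


def fold_for_matching(text: str) -> str:
    """Return a case- and accent-insensitive form of `text` (illustrative only)."""
    # Decompose accented letters into base letter + combining mark (NFD).
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks (Unicode category Mn), e.g. the Greek tonos.
    stripped = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    # casefold() lowercases and also maps final sigma (ς) to σ.
    return stripped.casefold()


if __name__ == "__main__":
    for query in ["κορίτσι", "κοριτσι", "ΚΟΡΙΤΣΙ"]:
        print(query, "->", fold_for_matching(query))
```

If both the indexed product names and the typed query went through a folding step of this kind, all three spellings above would reduce to the same token and therefore match the same products.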

Actual result

  1. The autocomplete results do not match the typed word well
  2. The search results differ from the autocomplete results
  3. The search finds far fewer products than actually exist in the system

By contrast, the phonetic analysis is extremely good: typing koritsi finds the single product that matches perfectly.

Here are some screenshots of searches:

image

image

image

image

image

Partial matching works better:

image

Autocomplete is a mess compared to the search page:

image

ioweb-gr avatar Jul 09 '22 08:07 ioweb-gr

Hi @ioweb-gr

Regarding the last point, "Autocomplete is a mess compared to search page":

You are typing "κορίτσι" in the autocomplete, right?

This leads the engine to suggest two particular search terms in the autocomplete box. What you need to know is that the products shown in the autocomplete are selected according to the suggested search terms, not the raw text you typed.

romainruaud avatar Jul 11 '22 13:07 romainruaud

I see, this is counterintuitive. I would never have guessed it, and I imagine customers wouldn't either.

I would expect the autocomplete to fetch results based on the typed query, and also list the suggested terms in case the user wants to pick one of those.

As you can see here, the products actually requested are not listed; products irrelevant to what the user typed are shown instead, so autocomplete seems broken.

ioweb-gr avatar Jul 11 '22 13:07 ioweb-gr

I can confirm all of the above. I have exactly the same issues in Greek and I'm searching for solutions. I'm focusing on modifying the Elasticsearch analyzer, but my knowledge of Elasticsearch is very limited at the moment. The important point is that Magento with Elasticsearch is currently not a good out-of-the-box solution for the Greek language. I'm curious what all the other Greek Magento e-shops do about this. Haven't they noticed the problem yet?

cptX avatar Sep 01 '22 06:09 cptX

Any progress on this? I'm facing the same issue with Magento 2.4.5 and Elasticsuite 2.10.11. When I search for the word "κόρνα" it returns a result; when I search for "κορνα" without the accent it returns nothing. The phonetic search also works fine: "korna", for example, returns the correct result! I'm really lost and cannot understand the cause of the issue. All I know is that in one of my installations I managed to make it work after a lot of experimentation, but I have no idea how or why, so I cannot replicate it in a new installation.

cptX avatar Oct 19 '22 13:10 cptX

Hi @romainruaud ,

At the moment, search is useless in Greek, and this is critical!!! Can you please tell us what we can modify to make it work? For example, for a product named "κόρνα" (with accent), searching for the word "κορνα" (without accent) returns no results. Around 99% of customers type without accents, so you can understand how important this issue is!!!

I want to modify some filters, and in other discussions I saw that the ICUTransformFilter filter should do the job. However, as I'm new to Magento and Elasticsearch/Elasticsuite, I don't know where to check whether this filter is used, or how to include it if it isn't!
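
One way to see what an analyzer does to a word is Elasticsearch's `_analyze` endpoint, which returns the tokens produced for a given text. Below is a minimal Python sketch using only the standard library. The helper names `build_analyze_request` and `analyze` are invented for this example, and the host, index name and analyzer name you pass in are assumptions that you would need to replace with the values of your own installation.

```python
import json
import urllib.request


def build_analyze_request(base_url: str, index: str, analyzer: str, text: str) -> urllib.request.Request:
    """Build a POST request for the `_analyze` endpoint of one index."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(base_url: str, index: str, analyzer: str, text: str) -> list:
    """Send the request and return the token strings from the response."""
    request = build_analyze_request(base_url, index, analyzer, text)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # The response has the shape {"tokens": [{"token": "...", ...}, ...]}.
    return [token["token"] for token in payload["tokens"]]
```

Calling `analyze(...)` once with "κόρνα" and once with "κορνα" against the same index and analyzer shows whether the two spellings end up as the same token. If they differ, the analyzer is not folding accents for that field.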

cptX avatar Oct 24 '22 08:10 cptX

Hi @cptX

I'll try to investigate more on this.

Actually, we don't even have a default stemmer implemented for Greek: https://github.com/Smile-SA/elasticsuite/blob/2.10.x/src/module-elasticsuite-core/etc/elasticsuite_analysis.xml

I'll try first to add one and see if this helps.

Regards

romainruaud avatar Oct 24 '22 08:10 romainruaud

Hi @romainruaud ,

I really appreciate your help and focus on this. I have found several references to the problem going back to 2017, if I'm not wrong, and no solution yet. So I think it's time to work on it!

Here is an example on the current problem:

Look at this picture: image

Autosuggest successfully finds the word "κόρνα" (with accent), but at the same time search can't find it: it shows the message "No exact results found for: 'κόρνα'." and yet still lists the product in the search results!

Now if I search for the word "κορνα" (without accent), autosuggest returns nothing at all! Search again shows the message "No exact results found for: 'κορνα'. The displayed items are the closest matches." but lists the corresponding item!

image

Really confusing situation and critical too!

cptX avatar Oct 24 '22 09:10 cptX

Hi @ioweb-gr ,

I created a store view, configured it with the Greek language in the back office, and created two products:

  • κόρνα
  • κόρνα παιδιά

In the autocomplete I get this:

  • with accent:

image

  • without accent:

image

And the search results give me:

  • with accent:

image

  • without accent:

image

So I can confirm that, although my results differ from yours, there is a problem with matching between accented and unaccented characters.

I'll have a look.

Regards

romainruaud avatar Oct 24 '22 09:10 romainruaud

I now get the same results for κόρνα and κορνα when using the fix I provided in this PR: https://github.com/Smile-SA/elasticsuite/pull/2756/files

Can you check whether it improves results on your side as well? Beware: this requires a cache clean and a reindex after applying.

Regards

romainruaud avatar Oct 24 '22 09:10 romainruaud

Yes, that finally fixed everything, both autocomplete and search results!! Thanks!!! What do you suggest? Should I use the patch from now on, or do you plan to include it in a future update? Is there any update planned where this could be included?

cptX avatar Oct 24 '22 12:10 cptX

Regarding spelling correction, I have now tested the patched setup, and I'm afraid the system returns no results when one letter is misspelled. For example, if you search for the word "κώρνα" instead of "κόρνα", you get no results. The same happens for "δαχτιλίδι" instead of "δαχτυλίδι". Any clues about that? Is it related to the problem above?
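
For reference, both misspellings above are one substituted letter away from the intended word. The Python sketch below computes the plain Levenshtein edit distance to show this. It is only an illustration of what "one letter misspelled" means; it does not reproduce how Elasticsearch or Elasticsuite scores fuzzy matches, and the function name `edit_distance` is invented for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions or substitutions turning `a` into `b`."""
    # previous[j] holds the distance between the processed prefix of `a` and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep when equal
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    for typed, intended in [("κώρνα", "κόρνα"), ("δαχτιλίδι", "δαχτυλίδι")]:
        print(typed, intended, edit_distance(typed, intended))
```

A fuzzy search that tolerates a single edit would therefore be expected to bridge both pairs, which is why the missing results look like a separate problem from the accent handling.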

cptX avatar Oct 24 '22 12:10 cptX

Hi, the fuzzy search (used in case of misspelling) works fine for me locally in the search results:

image

Fuzzy is never applied in autocomplete however.

That being said, if the provided PR fixes your issue, I'll merge it and tag a 2.10.12.1 release, because we have another issue that should not wait to be released.

I'll let you upgrade to 2.10.12.1 then.

I'll close this issue as well, because I consider the initial issue fixed by my PR, but feel free to open a new one to follow up on Greek-language issues, especially if there still seem to be problems with the fuzzy search.

Regards

romainruaud avatar Oct 24 '22 12:10 romainruaud

"Κώρνα" returns nothing for me in the full search! This is my spellchecking configuration (the default, as far as I know): image. Can you please tell me whether my configuration is normal and whether it differs from yours? If it's the same, I'll open another issue!

Also, this is my relevance configuration: image

Thanks in advance!

cptX avatar Oct 24 '22 13:10 cptX

Yes, this is the default configuration.

But it also depends on which attributes are used for spellchecking.

On my side, κόρνα is contained in the "name" attribute, and this attribute is flagged as "used for spellcheck" in the attribute configuration.

Can you please check whether that's also the case for you? If the problem still occurs, please feel free to open a new issue.

Regards

romainruaud avatar Oct 24 '22 13:10 romainruaud

Yes, the Name attribute has "Used in spellcheck" enabled in my configuration too! The name I have set for this product is "Ηλεκτρική κόρνα 120dB με φόρτιση USB", so the word "κόρνα" is the second word of the name, not the first. Could this be the issue? Should spellchecking work across the complete name?

cptX avatar Oct 24 '22 14:10 cptX

image

This just works for me with the fix provided this morning.

Please open another ticket if you are still having the issue when using the 2.10.12.1 version of Elasticsuite.

Regards

romainruaud avatar Oct 24 '22 14:10 romainruaud