Unfortunate search result in the emoji search

Open yb66 opened this issue 5 years ago • 9 comments

  • [x] I have searched open and closed issues for duplicates

Bug Description

When I search for "pirate" in the emoji search box I get 2 results: the expected skull and crossbones flag, and the flag of the United Arab Emirates.

I can see how it might happen if I squint a bit (some kind of soundex function?) but I can also see how it might offend people and/or appear deliberate.

Steps to Reproduce

  1. Click on emoji button
  2. Click on search tool in menu
  3. Type in pirate

Actual Result:

2 results, skull and crossbones flag and UAE flag.

Expected Result:

No UAE flag, and (at least) the skull and crossbones flag.

Screenshots

Screenshot 2020-05-21 at 15 11 08

Platform Info

Signal Version:

v1.33.1

Operating System:

Mac 10.14.6

Linked Device Version:

Link to Debug Log

Regards, iain

yb66 avatar May 21 '20 06:05 yb66

Looks like the emoji search uses fuse.js, and the result can be reproduced with their demo using Signal's configuration. Even the two-letter query "te" matches both pirate and flag_ae here because of the fuzzy search (IIUC, a similar cause to #4235).

jsantell avatar Jun 02 '20 20:06 jsantell

Hi @jsantell,

Thanks. Where would I get the emoji data to load in the demo? I can't find it. Perhaps one of the npm packages (https://github.com/iamcal/emoji-data)?

I'll cross post this to the fuse.js issue tracker once I've got the demo to work so they can reproduce/test it.

Regards, iain

yb66 avatar Jun 03 '20 05:06 yb66

The emoji data is at ./sticker-creator/dist/bundle.js via the emoji-datasource and emoji-datasource-apple modules, but the result can be reproduced by swapping out the strings in the fuse demo. I imagine the fuzzy search is working correctly (i.e. not an issue with fuse); this is just the outcome of lax settings in a fuzzy search.

jsantell avatar Jun 03 '20 15:06 jsantell

Thanks @jsantell, much appreciated.

Regards, iain

yb66 avatar Jun 04 '20 10:06 yb66

Can confirm that the UAE flag is pulling up on Windows 10 as well (1.36.3).

KeronCyst avatar Oct 10 '20 04:10 KeronCyst

I didn't get very far with getting the demo to work, if anyone else can that would be good.

Regards, iain

yb66 avatar Oct 15 '20 08:10 yb66

> I didn't get very far with getting the demo to work, if anyone else can that would be good.
>
> Regards, iain

You can check my example there as a starting point: https://github.com/signalapp/Signal-Desktop/issues/4235#issuecomment-738821978

hiqua avatar Dec 04 '20 14:12 hiqua

> Looks like the emoji search uses fuse.js, and can be reproduced with their demo using Signal's configuration. Even te matches pirate and flag_ae here due to the fuzzy search (IIUC similar cause to #4235)

From what I gather, fuse.js will match anything; it just gives a very bad score (close to 1.0, where 0.0 is a perfect match) when the match is not a real one.

In the case of this example, using your links (thanks!), we can see that te matches flag_ae with a score of 0.55, so a very bad match (I'm guessing anything at 0.5+ does not really match from the quick look I had).

There is a threshold parameter which is supposed to discard these bad matches. In the case of Signal, it's set to 0.2, so this result should not even appear.

So my guess is that Signal-Desktop is somehow reusing the previous results instead of recomputing the fuzzy search; a fresh search would not suggest flag_ae.
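To make the threshold idea concrete, here is a minimal sketch of that kind of filter. It is not fuse.js code: `ScoredMatch` and `applyThreshold` are hypothetical names, the 0.55 score is the figure quoted above for the "te" query, and the 0.0 score for pirate_flag is made up for illustration.

```typescript
// Hypothetical shape of a scored result. Lower score means a better match.
interface ScoredMatch {
  item: string;
  score: number; // 0.0 = perfect match, 1.0 = complete mismatch
}

// Keep only matches at or below the threshold, best match first.
function applyThreshold(matches: ScoredMatch[], threshold: number): ScoredMatch[] {
  return matches
    .filter((m) => m.score <= threshold)
    .sort((a, b) => a.score - b.score);
}

// 0.55 is the score quoted above; the 0.0 for pirate_flag is an assumption.
const candidates: ScoredMatch[] = [
  { item: "flag_ae", score: 0.55 },
  { item: "pirate_flag", score: 0.0 },
];

// With the 0.2 threshold mentioned above, flag_ae is filtered out.
const kept = applyThreshold(candidates, 0.2);
```

If a filter like this ran on every keystroke, a 0.55 match could only show up if the list being displayed came from somewhere else, such as a stale earlier result.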

hiqua avatar Dec 04 '20 14:12 hiqua

With Signal 7.43 under Microsoft Windows 11, I get 9 results for "pirate", some of which seem pretty irrelevant, but none is flag-ae. Can you still reproduce the specific result you reported?

Chealer avatar Feb 25 '25 18:02 Chealer

@Chealer No, I get just the pirate flag. Signal v7.49(654) on iOS.

yb66 avatar Mar 12 '25 01:03 yb66

Seems fixed? Can be closed

5HT2 avatar May 15 '25 01:05 5HT2

Quick explainer before I close this:

Every emoji has a bunch of different "tags" that we are searching against. For example, :ocean: 🌊 has all of these tags:

"tags": ["ocean", "kanagawa", "nature", "surf", "surfer", "surfing", "water", "water wave", "wave"]

Then we use a search library that does a "fuzzy" match against all of these tags. It will go through every one of them and assign them a "score" based on how similar the words are.

Similarity is based on how closely the order of the letters in the search term matches the order of the letters in the tags.

This can lead to some pretty ridiculous results. For example, if you search for "hater" you will get a lot of results that are tagged "water" (i.e. :watermelon:, :ocean:, :pouring_liquid:, etc.) because the order of the letters in "hater" is very similar to the order of the letters in "water".

Image
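As a rough illustration of "same letters in the same order", here is a sketch that scores two strings by their longest common subsequence. This is a stand-in technique chosen for clarity, not the algorithm the search library actually uses, and `lcsLength` and `orderSimilarity` are hypothetical names.

```typescript
// Length of the longest common subsequence of a and b (classic dynamic
// programming, keeping a single row of the table).
function lcsLength(a: string, b: string): number {
  const row: number[] = [];
  for (let j = 0; j <= b.length; j++) row.push(0);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = 0; // table value for (i - 1, j - 1)
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // table value for (i - 1, j)
      row[j] =
        a.charAt(i - 1) === b.charAt(j - 1)
          ? diagonal + 1
          : Math.max(row[j], row[j - 1]);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Similarity in [0, 1]: shared in-order letters divided by the longer length.
// Here higher is better, which is an assumption of this sketch.
function orderSimilarity(query: string, tag: string): number {
  const longest = Math.max(query.length, tag.length);
  return longest === 0 ? 1 : lcsLength(query, tag) / longest;
}

const haterVsWater = orderSimilarity("hater", "water"); // shares "ater": 4 / 5
const pirateVsUae = orderSimilarity("pirate", "united arab emirates"); // shares "irate": 5 / 20
```

Under this measure "hater" and "water" share four of five letters in order. It also shows why "pirate" touches the UAE flag at all: "emirates" contains "irate", so five of the six letters of "pirate" line up, although the long tag keeps the overall score low.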

But we do this because it saves you from having to type the name of the emoji perfectly: you can find :watermelon: even if you typed "water melon" or "melon" or "watemel".

In the case of :flag_ae: 🇦🇪, it currently has these tags:

"tags": ["flag ae", "flag", "united arab emirates", "uae"]

So when you are searching for "pirate" you'll get results like this (I had to change our current settings a little bit to get these results):

| emoji | tag | matches | score |
| --- | --- | --- | --- |
| :pirate_flag: | "pirate flag" | pirate flag | very high |
| :parrot: | "pirate" | pirate | very high |
| :flag_ae: | "united arab emirates" | united arab emirates | low |
| :lungs: | "respiration" | respiration | low |

We tweak the settings/logic that we use every now and then, and we do bias heavily towards "exact prefix matches".

For now, the logic cuts the results off before :flag_ae: or :lungs:, but I can't promise that other unfortunate results won't pop up, now or in the future, especially when you consider that there are tens of thousands of these tags, translated into every language we support.
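A minimal sketch of what "bias towards exact prefix matches, then cut off" could look like. This is not the code in Signal-Desktop: `TagMatch` and `rankMatches` are hypothetical, the similarity numbers and the 0.5 cutoff are made up, and higher similarity means a better match in this sketch.

```typescript
// Hypothetical record: one emoji tag with an already-computed fuzzy
// similarity (0 = nothing in common, 1 = identical). Values are made up.
interface TagMatch {
  emoji: string;
  tag: string;
  similarity: number;
}

// Tags that start with the query are forced to the top score; everything
// else keeps its fuzzy similarity. Results under the cutoff are dropped.
function rankMatches(query: string, matches: TagMatch[], cutoff: number): TagMatch[] {
  const q = query.toLowerCase();
  return matches
    .map((m) => {
      const isPrefix = m.tag.toLowerCase().indexOf(q) === 0;
      return { emoji: m.emoji, tag: m.tag, similarity: isPrefix ? 1 : m.similarity };
    })
    .filter((m) => m.similarity >= cutoff)
    .sort((a, b) => b.similarity - a.similarity);
}

const pirateMatches: TagMatch[] = [
  { emoji: ":flag_ae:", tag: "united arab emirates", similarity: 0.3 },
  { emoji: ":pirate_flag:", tag: "pirate flag", similarity: 0.6 },
  { emoji: ":lungs:", tag: "respiration", similarity: 0.3 },
  { emoji: ":parrot:", tag: "pirate", similarity: 1 },
];

// Only the two prefix matches survive a 0.5 cutoff.
const shown = rankMatches("pirate", pirateMatches, 0.5).map((m) => m.emoji);
```

Lowering the cutoff to 0.3 in this sketch would let :flag_ae: and :lungs: back in, which is the trade-off between forgiving typos and surfacing unrelated results.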

If anyone would ever like to play around with the settings more, you can see the code in useFunEmojiSearch.tsx and you can see the documentation for our search library here: https://www.fusejs.io/api/options.html#fuzzy-matching-options
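For anyone experimenting, these are some of the knobs described on that documentation page, written as a plain object. The `FuzzyOptions` type is a local stand-in for a subset of the documented options, and the values are illustrative assumptions, not the settings in useFunEmojiSearch.tsx; the 0.2 threshold is only the figure quoted earlier in this thread.

```typescript
// Local stand-in type covering a subset of the documented fuse.js options.
interface FuzzyOptions {
  keys: string[];             // which fields of each record to search
  includeScore: boolean;      // attach the match score to each result
  threshold: number;          // 0.0 requires a perfect match, 1.0 matches anything
  location: number;           // where in the text the pattern is expected
  distance: number;           // how far from `location` a match may be
  ignoreLocation: boolean;    // when true, `location` and `distance` are ignored
  minMatchCharLength: number; // minimum length of a match to be reported
}

// Illustrative values only (assumptions, not Signal's real configuration).
const illustrativeOptions: FuzzyOptions = {
  keys: ["tags"],
  includeScore: true,
  threshold: 0.2, // figure quoted earlier in this thread
  location: 0,
  distance: 100,
  ignoreLocation: false,
  minMatchCharLength: 2,
};
```

An object like this is what gets passed as the second argument to `new Fuse(list, options)`. Raising `threshold` is the quickest way to see results like the ones in the table above.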

jamiebuilds-signal avatar May 15 '25 16:05 jamiebuilds-signal

Thank you for the helpful info Jamie!

I forgot to attach the recording to my last reply. For context for the rest of the thread, here is a recording showing that the issue is fixed:

Emoji completion test

5HT2 avatar May 16 '25 22:05 5HT2