
Requiring 3 characters to search causes multilingual (CJK) problems

Open acidtoyman opened this issue 1 year ago • 13 comments

Requiring 3 characters to perform a search makes searching in some languages difficult. CJK languages in particular have very dense writing—one- and two-character words are extremely common.

For instance, I was trying to find a way to insert a symbol into a TeX document for an inkan (印鑑), one's official seal stamp in Japan, used similarly to signatures in the West. I need this for a document I'm going to be submitting to court. I found a site that should have a solution, but the page is quite long, so I tried to C-s C-s "印鑑". It took me a while to figure out what the problem was, because as soon as I'd finished inputting the characters, I pressed "Enter", which closes the minibuffer (so I assumed it was another multilingual input issue).

Is there a reason for the 3-character minimum? Is there a way to override it? I've never used a browser that required a minimum number of characters to search.

acidtoyman avatar Aug 18 '22 08:08 acidtoyman

We introduced the character minimum restriction so as not to overwhelm the renderer with the number of elements to search/create. But this restriction is indeed problematic. Luckily, you can lift it by adding this piece of configuration if you're on 3.*:

(define-configuration nyxt/search-buffer-mode:search-buffer-source
  ((nyxt/search-buffer-mode:minimum-search-length 1)))

or, if you're on 2.*:

(define-configuration nyxt/web-mode:search-buffer-source
  ((nyxt/web-mode:minimum-search-length 1)))

aartaka avatar Aug 18 '22 08:08 aartaka

I'm using 3 pre-release 1. Apparently "The auto-config file is now suffixed with the major version number", so it's ignoring my config file. Do I call it "init3.lisp" or something?

acidtoyman avatar Aug 18 '22 09:08 acidtoyman

Okay, I saw in another thread that it uses the "config.lisp" file now. It works, although:

a) it still says "Search buffer (3+ characters)" (even though it starts searching after 1)

b) it won't show search results unless pressing a key such as one of the down/left/up/right arrows or TAB. (this is only after CJK entry, not with Roman characters)

acidtoyman avatar Aug 18 '22 09:08 acidtoyman

Huh, that is weird 0_o

aartaka avatar Aug 18 '22 11:08 aartaka

We need a better fix for this.

What about this:

  • Start searching from 2 characters.
  • Start searching from 1 character if it's not [A-Za-z0-9]. But this rule still seems too Latin-centric...
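
For illustration, the two rules above could be sketched in plain Common Lisp roughly like this. It is only a sketch of the proposal, not Nyxt code: the names `ascii-alphanumeric-p` and `search-ready-p` are made up here, and nothing in it touches Nyxt's actual search source.

```lisp
;; Hypothetical sketch of the proposed rule; not Nyxt's implementation.
;; Both function names are invented for this example.

(defun ascii-alphanumeric-p (char)
  "Return true if CHAR is in [A-Za-z0-9]."
  (and (< (char-code char) 128)
       (alphanumericp char)))

(defun search-ready-p (input)
  "Return true if the string INPUT is long enough to start a search.
Two or more characters always qualify; a single character qualifies
only when it is outside [A-Za-z0-9]."
  (let ((len (length input)))
    (cond ((>= len 2) t)
          ((= len 1) (not (ascii-alphanumeric-p (char input 0))))
          (t nil))))
```

With this sketch, `(search-ready-p "花")` is true while `(search-ready-p "a")` is false. A single accented Latin letter such as "é" also passes, which shows how arbitrary the [A-Za-z0-9] cutoff is.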

Firefox allows single-char searches... Maybe we should just lift the restriction?

Ambrevar avatar Aug 19 '22 04:08 Ambrevar

Is the renderer really all that overwhelmed without the character minimum? I loaded up this (the full text of Volume 1 of 細雪 Sasameyuki by 谷崎潤一郎 Tanizaki Jun'ichirō), and when I searched for 花 (hana, "flower") on my 9-year-old ThinkPad with an i5-3320m, the results were near-instantaneous.

Although, as I said above, the results don't appear until I've pressed a key that doesn't insert a character: arrow keys, TAB, INSERT, HOME, etc.

acidtoyman avatar Aug 19 '22 05:08 acidtoyman

Our old search implementation was slow; maybe we can indeed remove this limitation now...

Ambrevar avatar Aug 19 '22 06:08 Ambrevar

I'm just fine with lifting the restriction!

aartaka avatar Aug 19 '22 12:08 aartaka

Done with c2354c59f1f088977106325fcd0911915f447786.

Ambrevar avatar Sep 13 '22 15:09 Ambrevar

a) it still says "Search buffer (3+ characters)" (even though it starts searching after 1)

I can't reproduce.

Ambrevar avatar Sep 13 '22 15:09 Ambrevar

b) it won't show search results unless pressing a key such as one of the down/left/up/right arrows or TAB. (this is only after CJK entry, not with Roman characters)

I don't have a CJK input at hand here.

I tried with Control-shift-u e9 to insert é and here I have to press space to turn the ue9 into é. From the moment the é appears, the search is performed.

Ambrevar avatar Sep 13 '22 15:09 Ambrevar

Well, hopefully the performance is good. It will probably be most problematic with multi-buffer search... however, the user can always change things.

jmercouris avatar Sep 13 '22 20:09 jmercouris

multi-buffer-search itself can also change this setting locally ;)

Ambrevar avatar Sep 14 '22 06:09 Ambrevar