meteor-autocomplete

Can't search with UTF-8 characters

Open ppillip opened this issue 11 years ago • 11 comments

plz

ppillip, Apr 13 '14 05:04

Please provide some more information. What exactly goes wrong when you try to use UTF-8 characters?

mizzao, Apr 13 '14 15:04

I have some sport names, for example 'baseball', 'football', '태권도', '권투'.

'태권도' and '권투' do not work (they are Korean).

ppillip, Apr 15 '14 01:04

I understand that, but "do not work" is not really going to help solve the problem. Do you see any errors in the console, etc.?

mizzao, Apr 15 '14 02:04

Unfortunately, there are no errors. I guess it is a "caretposition" problem, so I decided to use typeahead.

ppillip, Apr 15 '14 02:04

okay then.

Just so you are aware, typeahead is not backed by Meteor Collections. So you will find it quite a bit harder to use if you are pulling fetch() arrays out of Meteor.

If you decide to dig into this deeper, feel free to revisit this issue.

mizzao, Apr 15 '14 02:04

I think I have the same problem, and typeahead seems to work well so far. (Meteor Collections and MongoDB have no issues with character encoding either.)

By the way, what I want to know is whether the autocomplete module already handles UTF-8 encoding. In other words, when the module searches for characters, is there any problem with languages other than English?

Thanks for providing this module; hopefully it can support all encodings soon.

nicejwjin, Apr 17 '14 07:04

Hi @nicejwjin, I'd be happy to help if you guys would create a demo app with some Korean characters to test with. I have no idea what the issue is right now and if you could narrow it down, I would be able to help you fix it.

We're using Meteor Collections to do a search with $regex directly so I don't see any reason why it shouldn't work if it is fine with Mongo/Minimongo.

mizzao, Apr 17 '14 17:04
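
For illustration, a `$regex` selector of the kind described above might look like the sketch below. The field name `name`, the sample documents and the `matches` helper are hypothetical, and a plain `RegExp` stands in for Mongo/Minimongo so the snippet runs without Meteor. It only suggests that the query side has no trouble with Korean text; it is not the package's actual query code.

```javascript
// Hypothetical sample documents, using the sport names from earlier in the thread.
const sports = [
  { name: 'baseball' },
  { name: 'football' },
  { name: '태권도' },
  { name: '권투' },
];

// Selector in the shape that a Mongo/Minimongo find() accepts.
const selector = { name: { $regex: '권', $options: 'i' } };

// Stand-in for Collection.find(selector).fetch(): apply the same pattern with RegExp.
function matches(doc, sel) {
  const { $regex, $options } = sel.name;
  return new RegExp($regex, $options).test(doc.name);
}

const found = sports.filter((doc) => matches(doc, selector)).map((doc) => doc.name);
console.log(found);
```

If the query side behaves like this, the failure is more likely to be in how the typed text is parsed before the query is built.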

I've also run into this problem; it is caused by the regex when it parses Unicode input.

I modified this line

new RegExp('(^|\\b|\\s)' + rule.token + '([\\w.]*)$')

to

return new RegExp('(^|\\b|\\s)' + rule.token + '([\\u2E80-\\uFB00\\w.]*)$');

Now it works perfectly with Korean, Chinese and Japanese text. Hope this information helps!

Cheers!

LeePower, May 04 '15 03:05
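
A minimal sketch of why the change above helps: in JavaScript, `\w` matches only `[A-Za-z0-9_]`, so the capture group stops at the first Hangul character and the trailing `$` can no longer match. The helper names and the `'@'` token are assumptions for illustration; only the two patterns are taken from the comment above.

```javascript
// Helper names are hypothetical; only the two patterns come from the comment above.
function originalTokenRegex(token) {
  return new RegExp('(^|\\b|\\s)' + token + '([\\w.]*)$');
}

function patchedTokenRegex(token) {
  return new RegExp('(^|\\b|\\s)' + token + '([\\u2E80-\\uFB00\\w.]*)$');
}

// Returns the partial word typed after the token, or null if the rule does not fire.
function typedAfterToken(regex, text) {
  const match = regex.exec(text);
  return match ? match[2] : null;
}

const token = '@'; // assumed trigger token

const latinOriginal = typedAfterToken(originalTokenRegex(token), 'hello @base');
const koreanOriginal = typedAfterToken(originalTokenRegex(token), 'hello @태권');
const koreanPatched = typedAfterToken(patchedTokenRegex(token), 'hello @태권');
const cyrillicPatched = typedAfterToken(patchedTokenRegex(token), 'hello @бокс');

console.log({ latinOriginal, koreanOriginal, koreanPatched, cyrillicPatched });
```

The range `\u2E80-\uFB00` covers CJK and Hangul but not Cyrillic (U+0400 to U+04FF), so the patched pattern still returns `null` for the Cyrillic sample.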

May be related to #85; will look into how to generalize.

mizzao, May 04 '15 06:05

Right, it is related. At the moment the search works only with the Latin alphabet; it does not work with the Cyrillic alphabet.

superaleh, May 04 '15 08:05

What is the regex that we should use to allow for Unicode with other characters? Obviously we have suggestions for Cyrillic and CJK already, but it would be nice to support whatever language.

mizzao, Jul 09 '15 13:07
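
One possible answer, sketched under the assumption of an ES2018-capable JavaScript engine: Unicode property escapes such as `\p{L}` (any letter) together with the `u` flag avoid listing code-point ranges per script. These escapes were not part of the language when this thread was written, and this is not the package's actual fix. The helper names and the `'@'` token are hypothetical, and the token is assumed to contain no regex metacharacters.

```javascript
// Hypothetical helper: builds a token regex that accepts letters from any script.
function unicodeTokenRegex(token) {
  // \p{L} = letters, \p{M} = combining marks, \p{N} = numbers; the 'u' flag is required.
  return new RegExp('(^|\\b|\\s)' + token + '([\\p{L}\\p{M}\\p{N}_.]*)$', 'u');
}

// Returns the partial word typed after the token, or null if nothing matches.
function typedAfterToken(text, token) {
  const match = unicodeTokenRegex(token).exec(text);
  return match ? match[2] : null;
}

const samples = ['hello @base', 'hello @태권', 'hello @бокс', 'hello @日本語'];
const results = samples.map((text) => typedAfterToken(text, '@'));
console.log(results);
```

In this sketch the Latin, Korean, Cyrillic and Japanese samples all yield the text typed after the token.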