Many the fish

Results 132 comments of Many the fish

Hello @Soham1803, Thank you for your PR. I requested several changes before I can merge it; let me know if you need more information.

Hello @42plamusse, > Are the floating point numbers currently supported? In the latin segmenter tests 32.3 is expected to become ["32", ".", "3"]. Yes, you're right, but the test...
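To make the quoted expectation concrete, here is a minimal sketch of a segmenter that splits on every non-alphanumeric character. This is an illustration only: `naive_segment` is a hypothetical helper and is not charabia's Latin segmenter, but it reproduces the behavior described in the comment, where a decimal number is cut at the dot.

```rust
/// Hypothetical helper, NOT charabia's implementation: groups consecutive
/// alphanumeric characters into one segment and emits every other
/// character (punctuation, whitespace) as its own segment.
fn naive_segment(text: &str) -> Vec<&str> {
    let mut segments = Vec::new();
    // Byte offset where the current alphanumeric run started, if any.
    let mut run_start: Option<usize> = None;

    for (idx, ch) in text.char_indices() {
        if ch.is_alphanumeric() {
            // Open a new run only if we are not already inside one.
            if run_start.is_none() {
                run_start = Some(idx);
            }
        } else {
            // Close the pending run, then emit the separator by itself.
            if let Some(start) = run_start.take() {
                segments.push(&text[start..idx]);
            }
            segments.push(&text[idx..idx + ch.len_utf8()]);
        }
    }

    // Flush a run that reaches the end of the input.
    if let Some(start) = run_start {
        segments.push(&text[start..]);
    }
    segments
}
```

Under this naive rule, `naive_segment("32.3")` splits the number into three segments, `"32"`, `"."` and `"3"`, which is the result the quoted test expects.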

Hello @Soham1803, The tests don't seem to pass. Could you fix them? Thank you

Hello @hamano, What do you expect in terms of segmentation? Thank you!

Hello @hamano, I think the best way to solve this issue is to add the word SSL to [the tokenizer dictionary](https://docs.rs/charabia/0.8.11/charabia/struct.TokenizerBuilder.html#method.words_dict). It seems to be a specific case more than...

Understood, we may remove it from the default features. Thinking about it, this segmentation is mainly used for the cases you are describing, so if you want to change the...

Hey @hamano, no problem. I just wanted to say that if you want to use the feature, you can create a PR enhancing the behavior and I will accept it....

Hello @Kimeiga, I don't know if the traditional-to-simplified normalizer is relevant, because the kvariant table already relates these two. Moreover, character_converter has some performance issues...

I've created an issue on Charabia to fix this: https://github.com/meilisearch/charabia/issues/290