
Fixed incorrect Telugu normalization of vu వు to ma మ (

Open praveen-d291 opened this issue 7 months ago • 2 comments

Fixes: #14659

Remove the incorrect Telugu వు/మ conflation from Indic normalization. The two look similar, but they are distinct and carry different meanings.

Currently, the decompositions table in IndicNormalizer maps "వు" to "మ". As a result, a search for "వెంకటరావు" also matches "వెంకటరామ", even though they are different names.
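To illustrate the effect, here is a minimal, self-contained sketch. It does not use Lucene's actual code: `TeluguConflationSketch` and `normalize` are hypothetical names, and a plain string replacement stands in for the decomposition table in IndicNormalizer. It only shows why folding వు into మ makes the two names indistinguishable after normalization.

```java
// Hypothetical illustration only; this is NOT Lucene's IndicNormalizer.
final class TeluguConflationSketch {
    static final String VU = "\u0C35\u0C41"; // వు (va + vowel sign u)
    static final String MA = "\u0C2E";       // మ (ma)

    // Stand-in for the normalization step: optionally folds వు into మ.
    static String normalize(String term, boolean conflateVuWithMa) {
        return conflateVuWithMa ? term.replace(VU, MA) : term;
    }

    public static void main(String[] args) {
        // వెంకటరావు
        String rao = "\u0C35\u0C46\u0C02\u0C15\u0C1F\u0C30\u0C3E\u0C35\u0C41";
        // వెంకటరామ
        String rama = "\u0C35\u0C46\u0C02\u0C15\u0C1F\u0C30\u0C3E\u0C2E";

        // With the conflation, both names normalize to the same term.
        System.out.println(normalize(rao, true).equals(normalize(rama, true)));
        // Without it, they remain distinct terms.
        System.out.println(normalize(rao, false).equals(normalize(rama, false)));
    }
}
```

With the conflation enabled, the two names collapse to one indexed term, so a query for either matches both. With it disabled, they stay distinct.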

I am a native speaker of Telugu.

praveen-d291 · May 22 '25 09:05

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you will stop receiving this reminder on future updates to the PR.

github-actions[bot] · May 22 '25 09:05

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

github-actions[bot] · Jun 06 '25 00:06

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

github-actions[bot] · Oct 16 '25 00:10