gurmukhi-utils
Transliteration development - getting rid of extra a's
Describe the bug: Extra a's appear in words of the transliterations. I have given a couple of examples below; if more are needed, let me know.
To Reproduce: Steps to reproduce the behavior: Search: kmkp (kaljug meh keertan pardhaanaa)
Translit that you get: kalajug meh keeratan paradhaanaa |
What I would like: Get rid of the extra a's
Change to: kaljug meh keertan pardhaanaa |
Another example:
guramukh japeeai laae dhiaanaa |
change to:
gurmukh japeeai laae dhiaanaa |
Specs
- OS 2.9.0
- Database 4.7.0
What rule can you use to get rid of the extra characters? If it's something you can explain to a 5 year old we can probably program it in.
I think somewhere there is a rule that adds an "a" after certain characters. So removing that should be easy if you know the pattern or the rules. Shall I give more examples to help you see the pattern?
On Thu, 21 Jan 2021 at 12:13, Bhajneet S.K. wrote:
What rule can you use to get rid of the extra characters? If it's something you can explain to a 5 year old we can probably program it in.
The issue is that if you remove every a, you get things like
Stnaam, naank, prsaad
How does it know when to add the a between two consonants and when not to?
It has to do with compound words. We could go through our Gurbani and add hyphens for translit purposes, which then get stripped out, but that's a lot of work and a bit odd.
What do you think @Harjot1Singh and @Sarabveer ?
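To make the problem concrete, here is a minimal Python sketch. It is not the gurmukhi-utils implementation. It assumes the current behaviour is roughly "insert an inherent a after every consonant that has no vowel sign and is not word-final". The function name `naive_translit`, the tiny letter maps, and the simplified Gurmukhi spellings (without silent final vowel signs) are all assumptions made for illustration.

```python
# Toy maps covering only the letters used in this thread's examples;
# a real transliterator needs the full Gurmukhi alphabet.
CONSONANTS = {
    "ਕ": "k", "ਖ": "kh", "ਗ": "g", "ਜ": "j", "ਤ": "t", "ਧ": "dh",
    "ਨ": "n", "ਪ": "p", "ਮ": "m", "ਰ": "r", "ਲ": "l",
}
VOWEL_SIGNS = {"ਾ": "aa", "ੀ": "ee", "ੁ": "u"}


def naive_translit(word: str, inherent: str = "a") -> str:
    """Assumed rule: add `inherent` after every bare, non-final consonant."""
    out = []
    for i, ch in enumerate(word):
        if ch in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[ch])
        elif ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = word[i + 1] if i + 1 < len(word) else None
            # Bare consonant: not word-final and not followed by a vowel sign.
            if nxt is not None and nxt not in VOWEL_SIGNS:
                out.append(inherent)
        else:
            out.append(ch)  # pass through anything unmapped
    return "".join(out)


for word in ["ਕਲਜੁਗ", "ਕੀਰਤਨ", "ਪਰਧਾਨਾ", "ਗੁਰਮੁਖ", "ਨਾਨਕ"]:
    # Left: always insert the a. Right: never insert it.
    print(naive_translit(word), naive_translit(word, inherent=""))
```

Under these assumptions, "always insert" reproduces the reported kalajug / keeratan / paradhaanaa / guramukh, while "never insert" gives naank and kljug. Neither blanket rule is right, so the decision has to depend on where the consonant sits in the word.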
note to self: test a list of 4 letter words with no vowel between consonants 2 and 3.
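A rough sketch of how that test could be set up in Python, assuming a "4 letter word" means a word with exactly four consonants whose second consonant carries no vowel sign. The helper names (`consonant_slots`, `is_candidate`, `render`) and the letter maps are hypothetical, not part of gurmukhi-utils. The sketch prints both renderings of each candidate so they can be reviewed by hand.

```python
# Toy maps covering only the letters used in this thread's examples.
CONSONANTS = {
    "ਕ": "k", "ਖ": "kh", "ਗ": "g", "ਜ": "j", "ਤ": "t", "ਧ": "dh",
    "ਨ": "n", "ਪ": "p", "ਮ": "m", "ਰ": "r", "ਲ": "l",
}
VOWEL_SIGNS = {"ਾ": "aa", "ੀ": "ee", "ੁ": "u"}


def consonant_slots(word):
    """Pair each consonant with its vowel sign, or None if it is bare."""
    slots = []
    for ch in word:
        if ch in CONSONANTS:
            slots.append([ch, None])
        elif ch in VOWEL_SIGNS and slots:
            slots[-1][1] = ch
    return slots


def is_candidate(word):
    """Four consonants, with no vowel sign on the second one."""
    slots = consonant_slots(word)
    return len(slots) == 4 and slots[1][1] is None


def render(slots, drop_after=frozenset()):
    """Transliterate, skipping the inherent a after the given slot indexes."""
    out = []
    last = len(slots) - 1
    for i, (cons, sign) in enumerate(slots):
        out.append(CONSONANTS[cons])
        if sign is not None:
            out.append(VOWEL_SIGNS[sign])
        elif i != last and i not in drop_after:
            out.append("a")
    return "".join(out)


words = ["ਕਲਜੁਗ", "ਕੀਰਤਨ", "ਪਰਧਾਨਾ", "ਗੁਰਮੁਖ", "ਨਾਨਕ"]
for word in filter(is_candidate, words):
    slots = consonant_slots(word)
    # Current-style output vs. output with the a after consonant 2 dropped.
    print(word, render(slots), render(slots, drop_after={1}))
```

On the four examples from this issue, dropping the a after consonant 2 gives kaljug, keertan, pardhaanaa, and gurmukh. That is only four words, so it shows the pattern is worth testing against a larger list, not that the rule holds in general.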
This would be better served by a syllabification function, with manual handling of syllable boundaries using the interpunct (·) in the DB.
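One way the interpunct idea might work, sketched in Python under stated assumptions: the DB text carries an interpunct (U+00B7) at every syllable boundary, and the transliterator inserts an inherent a only into syllables that have no vowel sign, right after the syllable's first consonant. The function names and this exact rule are assumptions, not the project's design. The automatic syllabification function is not attempted here, so unmarked words are treated as a single syllable.

```python
INTERPUNCT = "\u00b7"  # assumed syllable-boundary marker stored in the DB

# Toy maps covering only the letters used in this thread's examples.
CONSONANTS = {
    "ਕ": "k", "ਖ": "kh", "ਗ": "g", "ਜ": "j", "ਤ": "t", "ਧ": "dh",
    "ਨ": "n", "ਪ": "p", "ਮ": "m", "ਰ": "r", "ਲ": "l",
}
VOWEL_SIGNS = {"ਾ": "aa", "ੀ": "ee", "ੁ": "u"}


def translit_syllable(syllable: str) -> str:
    """Transliterate one syllable, adding an inherent a only if it has no vowel sign."""
    parts = [CONSONANTS.get(ch) or VOWEL_SIGNS.get(ch, ch) for ch in syllable]
    has_vowel_sign = any(ch in VOWEL_SIGNS for ch in syllable)
    if syllable and syllable[0] in CONSONANTS and not has_vowel_sign:
        parts.insert(1, "a")  # inherent a goes right after the first consonant
    return "".join(parts)


def translit_marked(word: str) -> str:
    """Split on the interpunct, transliterate each syllable, drop the marker."""
    return "".join(translit_syllable(s) for s in word.split(INTERPUNCT))


marked = [
    INTERPUNCT.join(["ਕਲ", "ਜੁਗ"]),
    INTERPUNCT.join(["ਕੀਰ", "ਤਨ"]),
    INTERPUNCT.join(["ਪਰ", "ਧਾ", "ਨਾ"]),
    INTERPUNCT.join(["ਗੁਰ", "ਮੁਖ"]),
    INTERPUNCT.join(["ਨਾ", "ਨਕ"]),
]
for word in marked:
    print(translit_marked(word))
```

With every boundary marked, this prints kaljug, keertan, pardhaanaa, gurmukh, and naanak. The marker never reaches the output, which matches the earlier idea of adding a separator that gets stripped out. The cost is that the rule depends entirely on the boundaries being present, so it needs either the manual marks or a reliable syllabifier.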