gurmukhi-utils

Transliteration development - getting rid of extra a's

Open preetcharan opened this issue 4 years ago • 4 comments

Describe the bug: Extra a's are showing up in transliterated words. I have given a couple of examples below; if more are needed, let me know.

To Reproduce: Steps to reproduce the behavior: Search: kmkp (kaljug meh keertan pardhaanaa)

Translit that you get: kalajug meh keeratan paradhaanaa |

What I would like: Get rid of the extra a's

Change to: kaljug meh keertan pardhaanaa |

Another example:

guramukh japeeai laae dhiaanaa |

change to:

gurmukh japeeai laae dhiaanaa |

Specs

  • OS 2.9.0
  • Database 4.7.0

preetcharan avatar Jan 21 '21 12:01 preetcharan

What rule can you use to get rid of the extra characters? If it's something you can explain to a 5 year old we can probably program it in.

bhajneet avatar Jan 21 '21 12:01 bhajneet

I think somewhere there is a rule that adds an "a" after certain characters. So removing that should be easy if you know the pattern or the rules. Shall I give more examples to help you see the pattern?

preetcharan avatar Jan 21 '21 12:01 preetcharan

The issue is that if you remove every a, you get things like:

Stnaam, naank, prsaad

How does it know when to add the a between two consonants and when not to?

It has to do with compound words. We could go through our Gurbani and add hyphens for translit purposes, which then get stripped out, but that's a lot of work and a bit odd.

What do you think, @Harjot1Singh and @Sarabveer?

bhajneet avatar Jan 21 '21 15:01 bhajneet
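
To make the dilemma above concrete, here is a minimal Python sketch. It is not the gurmukhi-utils implementation: the mapping tables, the `toy_translit` function, and the `inherent_a` flag are all hypothetical and cover only the handful of letters needed for the examples. It assumes the current behaviour is "add an a after every bare consonant except the last one", which reproduces the outputs reported in this issue.

```python
# Hypothetical toy tables, NOT the real gurmukhi-utils mapping.
CONSONANTS = {
    "\u0A15": "k",  # ਕ
    "\u0A17": "g",  # ਗ
    "\u0A1C": "j",  # ਜ
    "\u0A24": "t",  # ਤ
    "\u0A28": "n",  # ਨ
    "\u0A30": "r",  # ਰ
    "\u0A32": "l",  # ਲ
}
VOWEL_SIGNS = {
    "\u0A3E": "aa",  # ਾ
    "\u0A40": "ee",  # ੀ
    "\u0A41": "u",   # ੁ
}


def toy_translit(word, inherent_a=True):
    """Romanize one word; optionally add 'a' after each bare, non-final consonant."""
    out = []
    for i, ch in enumerate(word):
        if ch in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[ch])
            continue
        out.append(CONSONANTS[ch])
        is_last = i == len(word) - 1
        has_sign = not is_last and word[i + 1] in VOWEL_SIGNS
        if inherent_a and not is_last and not has_sign:
            out.append("a")
    return "".join(out)


KALJUG = "\u0A15\u0A32\u0A1C\u0A41\u0A17"   # ਕਲਜੁਗ
KEERTAN = "\u0A15\u0A40\u0A30\u0A24\u0A28"  # ਕੀਰਤਨ
NAANAK = "\u0A28\u0A3E\u0A28\u0A15"         # ਨਾਨਕ

print(toy_translit(KALJUG))                     # kalajug (wanted: kaljug)
print(toy_translit(KEERTAN))                    # keeratan (wanted: keertan)
print(toy_translit(NAANAK, inherent_a=False))   # naank (wanted: naanak)
print(toy_translit(KEERTAN, inherent_a=False))  # keertn (wanted: keertan)
```

Neither setting of the flag gives the wanted result for every word, and "keertan" needs the a dropped after one consonant but kept after the next, so a blanket add or remove rule cannot work.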

note to self: test a list of 4 letter words with no vowel between consonants 2 and 3.

bhajneet avatar Jun 12 '21 22:06 bhajneet
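
One possible way to collect the word list from the note above, sketched in Python. The interpretation is an assumption: "4 letter words" is read as words with exactly four consonant letters, and "no vowel between consonants 2 and 3" as the third consonant directly following the second. The helper names are hypothetical, and the consonant check is a simplification that treats the whole range U+0A15 to U+0A39 as consonants and ignores vowel carriers and other signs.

```python
def is_consonant(ch):
    # Simplification (assumption): the main Gurmukhi consonant range in Unicode.
    return 0x0A15 <= ord(ch) <= 0x0A39


def is_candidate(word):
    """True if the word has four consonants and consonant 3 directly follows consonant 2."""
    positions = [i for i, ch in enumerate(word) if is_consonant(ch)]
    return len(positions) == 4 and positions[2] == positions[1] + 1


words = [
    "\u0A15\u0A32\u0A1C\u0A41\u0A17",        # ਕਲਜੁਗ
    "\u0A17\u0A41\u0A30\u0A2E\u0A41\u0A16",  # ਗੁਰਮੁਖ
    "\u0A28\u0A3E\u0A28\u0A15",              # ਨਾਨਕ (only three consonants)
]
candidates = [w for w in words if is_candidate(w)]
```

With the three sample words, the first two are kept and the third is filtered out. Running the same filter over a full word list would give the test set to check by hand.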

This would be better served by a syllabification function, plus manual handling of syllable boundaries using the interpunct in the DB.

bhajneet avatar Jul 21 '23 14:07 bhajneet
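
A rough Python sketch of how interpunct-marked syllable boundaries could drive the transliteration. Everything here is an assumption for illustration: the interpunct is taken to be U+00B7, the tables cover only a few letters, and `translit_chunk` and `translit_with_boundaries` are hypothetical names, not gurmukhi-utils functions. The idea is that a consonant closing a syllable gets no inherent a, and the interpunct itself never reaches the output.

```python
INTERPUNCT = "\u00B7"  # assumed boundary marker stored in the DB

# Hypothetical toy tables, NOT the real gurmukhi-utils mapping.
CONSONANTS = {
    "\u0A15": "k", "\u0A17": "g", "\u0A1C": "j", "\u0A24": "t",
    "\u0A28": "n", "\u0A30": "r", "\u0A32": "l",
}
VOWEL_SIGNS = {"\u0A3E": "aa", "\u0A40": "ee", "\u0A41": "u"}


def translit_chunk(chunk):
    """Romanize one syllable chunk; its final consonant gets no inherent 'a'."""
    out = []
    for i, ch in enumerate(chunk):
        if ch in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[ch])
            continue
        out.append(CONSONANTS[ch])
        is_last = i == len(chunk) - 1
        if not is_last and chunk[i + 1] not in VOWEL_SIGNS:
            out.append("a")
    return "".join(out)


def translit_with_boundaries(word):
    """Split on the interpunct, romanize each chunk, and join without the marker."""
    return "".join(translit_chunk(c) for c in word.split(INTERPUNCT))


print(translit_with_boundaries("\u0A15\u0A32\u00B7\u0A1C\u0A41\u0A17"))  # kaljug
print(translit_with_boundaries("\u0A15\u0A40\u0A30\u00B7\u0A24\u0A28"))  # keertan
print(translit_with_boundaries("\u0A15\u0A32\u0A1C\u0A41\u0A17"))        # kalajug (no marker)
```

Words without a marker fall back to the existing behaviour, so boundaries would only need to be added where the default output is wrong.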