Input Encoding choices

Open am4096 opened this issue 4 years ago • 4 comments

SOLVED THIS ISSUE by using Devanagari instead of DEVANAGARI. Have other problems, though. (Abhijit)

Option 1

from sanskrit_parser.base.sanskrit_base import SanskritObject, DEVANAGARI

Option 2

parser = Parser(input_encoding="DEVANAGARI", output_encoding=output_encoding, replace_ending_visarga='s')
parse_result = parser.parse(... name of input string here ...)

When using SanskritObject as in Option 1, there is no problem even when the input is in Devanagari. In the second case, however, the following error is raised:

/usr/local/lib/python3.6/dist-packages/sanskrit_parser/api.py in __init__(self, strict_io, input_encoding, output_encoding, lexical_lookup, score, split_above, replace_ending_visarga, fast_merge)
     89         self.strict_io = strict_io
     90         if input_encoding is not None:
---> 91             self.input_encoding = SCHEMES[input_encoding]
     92         else:
     93             self.input_encoding = None

KeyError: 'DEVANAGARI'

Can you please suggest how I can use the second option with Devanagari text instead of SLP1?

am4096, Aug 01 '20 03:08

SOLVED THIS ISSUE by using Devanagari instead of DEVANAGARI. Have other problems, though.

(Abhijit)
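
For anyone hitting the same KeyError: the traceback shows a plain `SCHEMES[input_encoding]` lookup, so the string has to match a key exactly, including case. Below is a minimal sketch of that behavior using a stand-in dictionary. The keys and values are assumptions for illustration, not the library's actual table.

```python
# Stand-in for the SCHEMES table referenced in the traceback.
# Keys and values are illustrative assumptions, not the library's real contents.
SCHEMES = {"Devanagari": "devanagari", "SLP1": "slp1"}


def lookup(name):
    """Mimic the failing line: a plain, case-sensitive dict lookup."""
    return SCHEMES[name]


# Capitalized first letter: the key exists, so the lookup succeeds.
ok = lookup("Devanagari")

# All upper case: there is no such key, so a KeyError is raised.
try:
    lookup("DEVANAGARI")
    failed = False
except KeyError:
    failed = True
```

In this sketch only the exact spelling "Devanagari" resolves, which matches the fix reported above.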

am4096, Aug 01 '20 16:08

Hi Abhijit,

Please report the other issues so we can look at them.

Thanks!

kmadathil, Aug 02 '20 02:08

There is an inconsistency between three naming conventions: indic_transliteration uses lowercase names (https://github.com/sanskrit-coders/indic_transliteration/blob/0a43c6e0dc852bf13d80d7cd7a94fe9050b761c7/indic_transliteration/sanscript/schemes/brahmic/northern.py#L116), sanskrit_base capitalizes the first letter of the string, and the Python variables are all upper case. I think it would be good to simplify this for users, so that they don't have to remember the right case. Maybe we should always use the Python variable in the examples and the API. E.g.

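from sanskrit_parser.api import Parser
from sanskrit_parser.base.sanskrit_base import DEVANAGARI

# Pass the Python constant rather than a case-sensitive string
parser = Parser(input_encoding=DEVANAGARI,
                output_encoding=output_encoding,
                replace_ending_visarga='s')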
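
A complementary way to spare users from remembering the case, sketched here as an assumption and not as what the library currently does, would be to resolve encoding names case-insensitively before the lookup. The table contents and the `resolve_scheme` helper below are hypothetical.

```python
# Hypothetical scheme table; names and values are assumptions for illustration.
SCHEMES = {"Devanagari": "devanagari", "SLP1": "slp1", "IAST": "iast"}

# Index the same table by lower-cased name so that any casing resolves.
_BY_LOWER = {key.lower(): value for key, value in SCHEMES.items()}


def resolve_scheme(name):
    """Return the scheme for `name`, ignoring case; None passes through."""
    if name is None:
        return None
    try:
        return _BY_LOWER[name.lower()]
    except KeyError:
        valid = ", ".join(sorted(SCHEMES))
        # Report the accepted names instead of a bare KeyError.
        raise ValueError(
            f"Unknown encoding {name!r}; expected one of: {valid}"
        ) from None
```

With a helper like this, "DEVANAGARI", "Devanagari" and "devanagari" would all resolve to the same scheme, and an unrecognized name would fail with a message listing the accepted ones.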

avinashvarna, Aug 03 '20 13:08

Agree. Capitalizing the first letter in the string may not have been a great choice. Let's go with your idea of making the examples and the API use the uppercase Python variables.

kmadathil, Aug 03 '20 18:08