sanskrit_parser Not all rules work with spaces

Not all rules work with spaces

Open kmadathil opened this issue 6 years ago • 1 comments

E.g., "ity api" doesn't work as of now.

We have two options:

1. Add (optional) spaces into the adeSa rules, as in 7be2be230c9a08526e64cd9079c4d63e576ed7f5.
2. Find a better way.

I think option 2 is feasible. We could do the following:

1. Find all spaces in the string and remember their positions.
2. Remove all spaces.
3. Build a list of forced break positions (where the spaces would have been).
4. While doing the recursive split, use this list to break at the right spots.
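
The steps above can be sketched roughly as follows. This is only an illustration, not the sanskrit_parser implementation: `strip_spaces`, `split_with_breaks`, the tiny lexicon, and the single "y before a vowel may come from i" rule are all hypothetical stand-ins for the real splitter and its sandhi rules.

```python
from typing import List, Set, Tuple

VOWELS = "aAiIuUeEoO"  # toy vowel set, an assumption for this sketch


def strip_spaces(text: str) -> Tuple[str, Set[int]]:
    """Remove spaces and return (compact string, forced break positions).

    A position p means a word boundary must fall between
    compact[:p] and compact[p:].
    """
    compact: List[str] = []
    breaks: Set[int] = set()
    for ch in text:
        if ch == " ":
            if compact:  # leading spaces force nothing
                breaks.add(len(compact))
        else:
            compact.append(ch)
    joined = "".join(compact)
    breaks.discard(len(joined))  # trailing spaces force nothing either
    return joined, breaks


def split_with_breaks(s: str, breaks: Set[int], lexicon: Set[str],
                      offset: int = 0) -> List[List[str]]:
    """Enumerate splits of s into lexicon words, honoring forced breaks."""
    if not s:
        return [[]]
    results: List[List[str]] = []
    for i in range(1, len(s) + 1):
        left, right = s[:i], s[i:]
        candidates = [left]
        # Toy reverse-sandhi rule (hypothetical): y + vowel may come from i + vowel.
        if right and left.endswith("y") and right[0] in VOWELS:
            candidates.append(left[:-1] + "i")
        for word in candidates:
            if word in lexicon:
                for rest in split_with_breaks(right, breaks, lexicon, offset + i):
                    results.append([word] + rest)
        if offset + i in breaks:
            break  # a space was here: no segment may extend past it
    return results


text, forced = strip_spaces("ity api")
print(text, forced)
print(split_with_breaks(text, forced, {"iti", "api"}))
```

The forced break only prunes the search: any segment that would span a position where a space used to be is never tried, so the rules themselves do not need to know about spaces.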

Nov 15 '18 02:11 kmadathil

Actually, a simple fix based on a minor change to sandhi.py (removing spaces while checking sandhi candidates) seems to fix things. Please review the PR.
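
For illustration only, the idea of ignoring spaces during the candidate check might look like the sketch below. It does not reproduce the actual change to sandhi.py; `TOY_RULES`, `join`, and `matches` are hypothetical names, and the single rule (final i + initial a becomes ya) is a simplified stand-in for the real rule set.

```python
from typing import Dict, List, Tuple

# Toy rule table (assumption): (last char of left, first char of right) -> joined form
TOY_RULES: Dict[Tuple[str, str], str] = {("i", "a"): "ya"}


def join(left: str, right: str,
         rules: Dict[Tuple[str, str], str] = TOY_RULES) -> List[str]:
    """Return the forms left + right may take: plain concatenation plus any rule hit."""
    forms = [left + right]
    if left and right:
        key = (left[-1], right[0])
        if key in rules:
            forms.append(left[:-1] + rules[key] + right[1:])
    return forms


def matches(left: str, right: str, observed: str,
            rules: Dict[Tuple[str, str], str] = TOY_RULES) -> bool:
    """Check a candidate pair against the observed text, ignoring spaces."""
    target = observed.replace(" ", "")  # the space removal is the whole fix here
    return target in join(left, right, rules)


print(matches("iti", "api", "ity api"))
```

Without the `replace(" ", "")` step, "ity api" would be compared against "ityapi" and the candidate pair would be rejected.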

Nov 17 '18 00:11 kmadathil
