pdfparser: Does your package support the Arabic and Persian languages?

Open mdoulabi1 opened this issue 1 year ago • 4 comments

Does your package support the Arabic and Persian languages? I tested both languages, but the extracted sentences come out incorrect. Please help, thanks.

Nov 07 '23 07:11 mdoulabi1

You have to be more specific if you want help.

Which version did you try: the latest master branch or a specific version?

I tested both languages, but the extracted sentences come out incorrect.

Please provide example code (with actual Arabic/Persian strings) or a PDF that leads to incorrect output. Be more specific about the term "incorrect": what do you expect, and what is the actual output?

Nov 07 '23 07:11 k00ni

When I read Persian/Arabic text, each word comes out starting from its end. For example:
Correct text: سلام من یک برنامه نویس هستم
Incorrect text: مالس نم همانرب سیون متسه
In English, imagine that you want to read "hello" but you get "olleh".

Nov 07 '23 08:11 mdoulabi1

I believe there are already issues about this topic, for example https://github.com/smalot/pdfparser/issues/316. Summary: PDFParser is currently not able to properly parse languages that are read from right to left. @GreyWyvern gave a good overview here: https://github.com/smalot/pdfparser/issues/316#issuecomment-1686461583

So my answer to your question is: no, it doesn't support these kinds of languages. That said, would it be practical for you to simply reverse the output again to restore the correct order of symbols?
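The reversal idea above can be sketched as follows. This is only an illustration under stated assumptions: it is written in Python rather than PHP, `reverse_each_word` is a hypothetical helper that is not part of pdfparser, and it assumes the extracted text has its characters reversed inside each word while the word order is intact, as in the example reported in this thread.

```python
import re


def reverse_each_word(text: str) -> str:
    """Hypothetical helper (not a pdfparser API): reverse the characters of
    every whitespace-separated word, keeping word order and spacing."""
    # The capturing group makes re.split keep the whitespace separators,
    # so the original spacing is reproduced exactly.
    parts = re.split(r"(\s+)", text)
    return "".join(p if p.isspace() else p[::-1] for p in parts)


# Garbled sample taken from the report in this thread.
garbled = "مالس نم همانرب سیون متسه"
print(reverse_each_word(garbled))
```

A naive reversal like this is likely to break on mixed content: digits and Latin words embedded in right-to-left text would be flipped as well, and combining marks would end up attached to the wrong letters. A robust fix would probably need proper bidirectional text handling rather than plain string reversal.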

Nov 07 '23 08:11 k00ni

I tried reversing the words, but it does not work and runs into a lot of challenges.

Nov 07 '23 09:11 mdoulabi1
