pdfparser

Does your package support the Arabic and Persian languages?

Open mdoulabi1 opened this issue 9 months ago • 4 comments

Does your package support the Arabic and Persian languages? I tested both languages, but the sentences come out incorrect. Please help, thanks.

mdoulabi1, Nov 07 '23 07:11

You have to be more specific if you want help.

Which version did you try: the latest master branch or a specific release?

> I tested both languages, but the sentences come out incorrect.

Please provide example code (with actual Arabic/Persian strings) or a PDF that leads to incorrect output. Be more specific about the term "incorrect": what do you expect, and what is the actual output?

k00ni, Nov 07 '23 07:11

When I read Persian/Arabic text, each word comes out with its letters in reverse order. For example:

Correct text: سلام من یک برنامه نویس هستم
Incorrect text: مالس نم همانرب سیون متسه

In English, imagine that you want to read "hello" but you get "olleh".

mdoulabi1, Nov 07 '23 08:11

I believe there are already issues about this topic, for example https://github.com/smalot/pdfparser/issues/316. Summary: PDFParser currently cannot properly parse languages that are read from right to left. @GreyWyvern gave a good overview here: https://github.com/smalot/pdfparser/issues/316#issuecomment-1686461583

So my answer to your question is: no, it doesn't support these types of languages. That said, would it be practical for you to simply reverse the output again to restore the correct order of symbols?

k00ni, Nov 07 '23 08:11

I tried reversing the words, but it does not work and brings a lot of challenges.

mdoulabi1, Nov 07 '23 09:11
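
The workaround raised in this thread (reversing the output again) can be sketched as a post-processing step on the extracted text. The following is a minimal sketch in Python, not part of PDFParser; the function name `fix_reversed_words` and the chosen character ranges are assumptions made for illustration. It reverses each run of Arabic-script letters in place and leaves word order, Latin text, and digits untouched, which matches the per-word reversal shown in the example above.

```python
import re

# Runs of Arabic-script letters (Arabic and Persian). The ranges are an
# assumption for this sketch: they deliberately skip Arabic-Indic digits
# (U+0660-0669, U+06F0-06F9) and combining diacritics (U+064B-065F).
RTL_LETTERS = re.compile(r"[\u0621-\u064A\u066E-\u06D3\u06FA-\u06FF]+")


def fix_reversed_words(text: str) -> str:
    """Reverse every run of Arabic-script letters; leave all other text as-is."""
    return RTL_LETTERS.sub(lambda m: m.group(0)[::-1], text)


# Hypothetical usage with the reversed string reported in this thread.
broken = "مالس نم همانرب سیون متسه"
print(fix_reversed_words(broken))
```

This also hints at why plain reversal "brings a lot of challenges": diacritics and the zero-width non-joiner (U+200C, common in Persian) split a word into several runs that are then reversed separately, digits and embedded Latin text must keep their left-to-right order, and some PDFs may reverse whole lines rather than single words. A robust fix would likely need the Unicode Bidirectional Algorithm rather than a regular expression.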