flutter-pdf-text

Can't extract proper local-language (e.g. Kannada, Tamil) text from PDF

Open rustiever opened this issue 5 years ago • 2 comments

I only tested on Android, so the error might come from PdfBox.

W/PdfBox-Android( 6641): No Unicode mapping for CID+222 (222) in font TAUElangoArunthathi
W/PdfBox-Android( 6641): No Unicode mapping for CID+254 (254) in font TAUElangoArunthathi
I/chatty  ( 6641): uid=10281(com.example.text_audio) Thread-5 identical 2 lines
W/PdfBox-Android( 6641): No Unicode mapping for CID+254 (254) in font TAUElangoArunthathi
W/PdfBox-Android( 6641): No Unicode mapping for CID+270 (270) in font TAUElangoArunthathi
W/PdfBox-Android( 6641): No Unicode mapping for CID+270 (270) in font TAUElangoArunthathi
W/PdfBox-Android( 6641): No Unicode mapping for CID+262 (262) in font TAUElangoArunthathi
W/PdfBox-Android( 6641): No Unicode mapping for CID+262 (262) in font TAUElangoArunthathi
W/PdfBox-Android( 6641): No Unicode mapping for CID+223 (223) in font TAUElangoArunthathi
W/PdfBox-Android( 6641): No Unicode mapping for CID+223 (223) in font TAUElangoArunthathi

I think specifying the font when calling the method might solve this, but I'm not sure.

rustiever, Jun 18 '20 08:06

Apparently some characters in the font TAUElangoArunthathi have no Unicode mapping, so I guess PdfBox can't turn them into plain text. Unfortunately I couldn't reproduce the error: I tried with a Tamil PDF and PdfBox didn't complain. A similar error might show up on iOS too.
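To narrow down which glyphs are affected, the warnings can be tallied per font. Below is a minimal sketch that only parses logcat output in the format shown in this issue; it does not call PdfBox or flutter-pdf-text. The function name `unmapped_cids` and the `sample_log` excerpt are illustrative assumptions, not part of either library.

```python
import re
from collections import defaultdict

# Matches PdfBox-Android warnings in the format quoted in this issue.
WARNING = re.compile(r"No Unicode mapping for CID\+(\d+) \(\d+\) in font (\S+)")


def unmapped_cids(log_text):
    """Return {font name: sorted distinct CIDs that had no Unicode mapping}."""
    found = defaultdict(set)
    for line in log_text.splitlines():
        match = WARNING.search(line)
        if match:
            found[match.group(2)].add(int(match.group(1)))
    return {font: sorted(cids) for font, cids in found.items()}


# Hypothetical excerpt, copied from the log lines reported above.
sample_log = """\
W/PdfBox-Android( 6641): No Unicode mapping for CID+222 (222) in font TAUElangoArunthathi
W/PdfBox-Android( 6641): No Unicode mapping for CID+254 (254) in font TAUElangoArunthathi
I/chatty  ( 6641): uid=10281(com.example.text_audio) Thread-5 identical 2 lines
W/PdfBox-Android( 6641): No Unicode mapping for CID+270 (270) in font TAUElangoArunthathi
"""

print(unmapped_cids(sample_log))
# prints {'TAUElangoArunthathi': [222, 254, 270]}
```

If every warning points at the same font, that suggests the problem lies in how that particular font is embedded in the PDF rather than in the extraction call itself.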

AlessioLuciani, Jun 19 '20 14:06

Which font does PdfBox use when parsing a Tamil PDF?

rustiever, Jun 20 '20 10:06