PdfBox-Android

Bengali Font OCR not displayed correctly!

Open • bosoxBrowser opened this issue 1 year ago • 1 comment

Sorry to say, my English is not good.

Bengali text extracted from a PDF is not displayed correctly. Here are my code and the result.

To reproduce

Code snippet to reproduce the behavior:

    // PdfBox-Android classes used below:
    //   com.tom_roush.pdfbox.android.PDFBoxResourceLoader
    //   com.tom_roush.pdfbox.pdmodel.PDDocument
    //   com.tom_roush.pdfbox.text.PDFTextStripper

    // Initialize PdfBox-Android resources before any other PDFBox call
    PDFBoxResourceLoader.init(getApplicationContext());

    TextView textView = findViewById(R.id.textView);

    // try-with-resources closes both the stream and the document,
    // even if text extraction throws
    try (InputStream inputStream = getAssets().open("selina.pdf");
         PDDocument document = PDDocument.load(inputStream)) {

        // Extract the text from the document
        PDFTextStripper stripper = new PDFTextStripper();
        String text = stripper.getText(document);

        // Display the text in the TextView
        textView.setText(text);

    } catch (IOException e) {
        e.printStackTrace();
    }

Expected behavior

মোছাঃ সেলিনা বেগম কুড়িগ্রাম কিছমত আলী

Actual behavior

মাছাঃ সিলনা বগম কুিড়াম কছমত আলী
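
To narrow this down, it may help to compare the two strings code point by code point rather than by eye. The sketch below is plain Java and is not part of PdfBox-Android; the class name `CodePointDump` and the method `dump` are hypothetical helpers introduced only for illustration. It prints each word of the expected and actual text as a list of `U+XXXX` values, which should show which vowel signs or conjuncts were dropped or reordered during extraction.

```java
import java.util.StringJoiner;

public class CodePointDump {

    // Formats every Unicode code point of the input as U+XXXX, separated by spaces.
    public static String dump(String text) {
        StringJoiner joiner = new StringJoiner(" ");
        text.codePoints().forEach(cp -> joiner.add(String.format("U+%04X", cp)));
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Strings copied from the "Expected behavior" and "Actual behavior" lines of this issue.
        String expected = "মোছাঃ সেলিনা বেগম কুড়িগ্রাম কিছমত আলী";
        String actual = "মাছাঃ সিলনা বগম কুিড়াম কছমত আলী";

        String[] expectedWords = expected.split(" ");
        String[] actualWords = actual.split(" ");

        // Print the words pairwise so that differences line up.
        int count = Math.min(expectedWords.length, actualWords.length);
        for (int i = 0; i < count; i++) {
            System.out.println("expected: " + dump(expectedWords[i]));
            System.out.println("actual:   " + dump(actualWords[i]));
            System.out.println();
        }
    }
}
```

On Android, the same `dump` method could be applied to the string returned by `stripper.getText(document)` and written to the log. As far as I know, `String.codePoints()` requires API 24 or later.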

Environment details:

  • PdfBox-Android version: [e.g. 2.0.27.0]
  • Android API version: [e.g. API 33]

bosoxBrowser, Apr 27 '23 08:04

Please attach your PDF

THausherr, May 20 '23 14:05