RTL and Script Unicode Font (ttf), and ligatures not supported

Open sh0umik opened this issue 4 years ago • 103 comments

Describe the bug Bangla Unicode fonts get broken after the PDF is generated. The Unicode font works well on the web, but it gets broken when

I made a repository to demonstrate the problem (font included). I have tried 12 different Bangla Unicode fonts, with the same result for all of them. Just run the program and you will see that the string in the Text widget does not get rendered properly.

https://github.com/sh0umik/flutter_pdf_test.git

Expected behaviour The PDF output should look exactly like the text in the Text widget.

Screenshots Got this (broken) image

Expected this

image

Flutter Doctor Paste the output of running flutter doctor -v here.

[✓] Flutter (Channel beta, v1.12.13+hotfix.6, on Mac OS X 10.14.6 18G103, locale en-BD)
    • Flutter version 1.12.13+hotfix.6 at /Users/diablo/flutter
    • Framework revision 18cd7a3601 (3 weeks ago), 2019-12-11 06:35:39 -0800
    • Engine revision 2994f7e1e6
    • Dart version 2.7.0

[✓] Android toolchain - develop for Android devices (Android SDK version 29.0.2)
    • Android SDK at /Users/diablo/Library/Android/sdk
    • Android NDK location not configured (optional; useful for native profiling support)
    • Platform android-29, build-tools 29.0.2
    • Java binary at: /Users/diablo/Library/Application Support/JetBrains/Toolbox/apps/AndroidStudio/ch-0/191.6010548/Android Studio.app/Contents/jre/jdk/Contents/Home/bin/java
    • Java version OpenJDK Runtime Environment (build 1.8.0_202-release-1483-b49-5587405)
    • All Android licenses accepted.

[✓] Xcode - develop for iOS and macOS (Xcode 11.1)
    • Xcode at /Applications/Xcode.app/Contents/Developer
    • Xcode 11.1, Build version 11A1027
    • CocoaPods version 1.8.1

[✓] Chrome - develop for the web
    • Chrome at /Applications/Google Chrome.app/Contents/MacOS/Google Chrome

[✓] Android Studio (version 3.5)
    • Android Studio at /Users/diablo/Library/Application Support/JetBrains/Toolbox/apps/AndroidStudio/ch-0/191.6010548/Android Studio.app/Contents
    • Flutter plugin version 42.1.1
    • Dart plugin version 191.8593
    • Java version OpenJDK Runtime Environment (build 1.8.0_202-release-1483-b49-5587405)

[!] IntelliJ IDEA Ultimate Edition (version 2019.3)
    • IntelliJ at /Users/diablo/Applications/JetBrains Toolbox/IntelliJ IDEA Ultimate.app
    ✗ Flutter plugin not installed; this adds Flutter specific functionality.
    ✗ Dart plugin not installed; this adds Dart specific functionality.
    • For information about installing plugins, see
      https://flutter.dev/intellij-setup/#installing-the-plugins

[✓] VS Code (version 1.41.1)
    • VS Code at /Applications/Visual Studio Code.app/Contents
    • Flutter extension version 3.7.1

[✓] Connected device (3 available)
    • ONEPLUS A6010 • 192.168.0.100:5555 • android-arm64  • Android 10 (API 29)
    • Chrome        • chrome             • web-javascript • Google Chrome 79.0.3945.88
    • Web Server    • web-server         • web-javascript • Flutter Tools

Desktop (please complete the following information):

  • [*] iOS
  • [*] Android
  • [*] Browser

Additional context Many things in my Flutter web app depend on this plugin, so I need to find a solution ASAP. Also, I am converting the result of pdf.save() into a Uint8List so that it can be sent through Firebase. Could the List to Uint8List conversion be the problem?

sh0umik avatar Dec 30 '19 18:12 sh0umik

Sorry, I don't really see what's wrong in the rendering. I don't read Bengali and can't really spot the missing features. Or is it just the bold face?

DavBfr avatar Dec 30 '19 18:12 DavBfr

image

There is! Look at the picture. The broken text is joined text, where two letters are joined to form a single one. The joined form is broken.

The Flutter Text widget renders it correctly on web, Android, and iOS, but your pdf library's Text widget does not.

sh0umik avatar Dec 30 '19 18:12 sh0umik

A similar problem is mentioned here, but it is for a PHP PDF generation library.

https://stackoverflow.com/questions/32421564/bangla-unicode-font-not-rendering-correctly-in-tcpdf

sh0umik avatar Dec 30 '19 18:12 sh0umik

Ok, I guess it's about GSUB support for OTF fonts: https://docs.microsoft.com/en-us/typography/opentype/spec/gsub

DavBfr avatar Dec 30 '19 18:12 DavBfr

I guess so.

https://docs.microsoft.com/en-us/typography/script-development/bengali

image

This is the problem: I am getting what is on the left side of the ->, but I need the output on the right side of the ->.

sh0umik avatar Dec 30 '19 19:12 sh0umik

@DavBfr I tried the arabic-fonts branch. It didn't work for me :(

sh0umik avatar Dec 30 '19 19:12 sh0umik

No, the arabic-fonts branch uses some specific character replacement for Arabic only. GSUB is not yet implemented.

If you're willing to implement it in ttf_parser.dart I'll be happy to help.

Try something like this:

diff --git a/pdf/lib/src/ttf_parser.dart b/pdf/lib/src/ttf_parser.dart
index 3108a48..fc7cce0 100644
--- a/pdf/lib/src/ttf_parser.dart
+++ b/pdf/lib/src/ttf_parser.dart
@@ -54,6 +54,9 @@ class TtfParser {
     _parseCMap();
     _parseIndexes();
     _parseGlyphs();
+    if (tableOffsets.containsKey(gsub_table)) {
+      _parseGsub();
+    }
   }

   static const String head_table = 'head';
@@ -64,6 +67,7 @@ class TtfParser {
   static const String maxp_table = 'maxp';
   static const String loca_table = 'loca';
   static const String glyf_table = 'glyf';
+  static const String gsub_table = 'GSUB';

   final UnmodifiableByteDataView bytes;
   final Map<String, int> tableOffsets = <String, int>{};
@@ -368,4 +372,20 @@ class TtfParser {
       components,
     );
   }
+
+  void _parseGsub() {
+    print(fontName);
+    print(tableOffsets);
+
+    final int basePosition = tableOffsets[gsub_table];
+    print('GSUB Version: ${bytes.getUint32(basePosition).toRadixString(16)}');
+    final int scriptListOffset =
+        bytes.getUint16(basePosition + 4) + basePosition;
+    final int featureListOffset =
+        bytes.getUint16(basePosition + 6) + basePosition;
+    final int lookupListOffset =
+        bytes.getUint16(basePosition + 8) + basePosition;
+    print(
+        'GSUB Offsets: $scriptListOffset $featureListOffset $lookupListOffset');
+  }
 }

And see if your font contains any useful information.

DavBfr avatar Dec 30 '19 20:12 DavBfr

I am ready to implement it. Just tell me what to write. It would be better if you could show me an example for just one character and where I can find the positions for the rest; then I can complete it. I am totally new to this.

sh0umik avatar Jan 03 '20 16:01 sh0umik

@DavBfr this is what I found by running the code above. What's next?

I/flutter (10254): SiyamRupali
I/flutter (10254): {EBDT: 393284, EBLC: 393460, GDEF: 399344, GPOS: 398924, GSUB: 393740, LTSH: 3756, OS/2: 504, VDMX: 4552, cmap: 399388, cvt : 25408, fpgm: 24280, gasp: 393268, glyf: 25416, hdmx: 6056, head: 380, hhea: 436, hmtx: 600, kern: 381088, loca: 377928, maxp: 472, name: 381112, post: 385900, prep: 25396}
I/flutter (10254): GSUB Version: 10000
I/flutter (10254): GSUB Offsets: 393750 393832 394122
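
A note for readers on the log above: the version prints in hexadecimal, so 10000 is 0x00010000 (GSUB version 1.0), and the three offsets are absolute file positions because the diff adds basePosition to each. The sketch below recovers the table-relative values from the logged numbers. The helper names describeGsubVersion and relativeOffsets are hypothetical and not part of dart_pdf.

```dart
/// Splits a 32-bit GSUB version field into "major.minor".
/// The major version sits in the high 16 bits, the minor in the low 16 bits.
String describeGsubVersion(int raw) {
  final major = (raw >> 16) & 0xFFFF;
  final minor = raw & 0xFFFF;
  return '$major.$minor';
}

/// Converts absolute file offsets (as printed by the diff above) back to
/// offsets relative to the start of the GSUB table.
List<int> relativeOffsets(int basePosition, List<int> absolute) =>
    [for (final offset in absolute) offset - basePosition];
```

With the logged base of 393740, the relative offsets come out as 10, 92 and 382. The first value matches the 10-byte size of a version 1.0 GSUB header, which suggests the ScriptList starts directly after the header in this font.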

sh0umik avatar Jan 03 '20 16:01 sh0umik

Using a site called FontDrop, I can see the following:

image

Could it help implement the parser? If so, can you guide me on how to implement it?

sh0umik avatar Jan 03 '20 17:01 sh0umik

This site is really useful, thanks!

So the first step is to parse this GSUB table to find all the lookups.

Then I think when you have this:

{
  "ligGlyph": 237,
  "components": [
    102,
    86
  ]
}

If we want to draw the glyphs 102 and 86 next to each other, we replace them with 237.
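
The replacement step described here can be sketched as a pass over a list of glyph IDs. This is only an illustration, assuming the rules have already been extracted from the font: the Ligature class and applyLigatures function are hypothetical names, and trying longer rules first is a simplification (a font lists ligatures in its own preference order).

```dart
/// One ligature rule: the glyph sequence [components] is replaced by the
/// single glyph [ligGlyph]. Field names mirror the JSON shown above.
class Ligature {
  const Ligature(this.ligGlyph, this.components);

  final int ligGlyph;
  final List<int> components;
}

/// Returns a copy of [glyphs] in which every run matching a rule is
/// replaced by that rule's ligature glyph. Longer rules are tried first
/// (an assumption made for this sketch).
List<int> applyLigatures(List<int> glyphs, List<Ligature> rules) {
  final sorted = [...rules]
    ..sort((a, b) => b.components.length.compareTo(a.components.length));
  final out = <int>[];
  var i = 0;
  while (i < glyphs.length) {
    Ligature? match;
    for (final rule in sorted) {
      if (_matchesAt(glyphs, i, rule.components)) {
        match = rule;
        break;
      }
    }
    if (match == null) {
      // No rule starts here: keep the glyph and move on.
      out.add(glyphs[i]);
      i++;
    } else {
      // Emit the ligature and skip all of its components.
      out.add(match.ligGlyph);
      i += match.components.length;
    }
  }
  return out;
}

/// True if [components] occurs in [glyphs] starting at index [start].
bool _matchesAt(List<int> glyphs, int start, List<int> components) {
  if (components.isEmpty || start + components.length > glyphs.length) {
    return false;
  }
  for (var k = 0; k < components.length; k++) {
    if (glyphs[start + k] != components[k]) return false;
  }
  return true;
}
```

For example, applyLigatures([5, 102, 86, 7], [Ligature(237, [102, 86])]) yields the glyph list 5, 237, 7. Complex scripts such as Bengali generally need more than this single pass (for instance glyph reordering and several feature stages), so this covers only the ligature part.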

DavBfr avatar Jan 03 '20 17:01 DavBfr

I think the table to read for your issue is the Ligature Substitution Subtable:

https://docs.microsoft.com/en-us/typography/opentype/spec/gsub#lookuptype-4-ligature-substitution-subtable

Only one format, that should not be too difficult.
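
As a rough sketch of what reading that subtable could look like, the code below parses a LookupType 4, format 1 subtable from a ByteData using the field layout in the linked specification. It is written against dart:typed_data only, not against the TtfParser class. LigatureRule, readCoverage and readLigatureSubst are hypothetical names, and the code assumes the subtable is not wrapped in an extension lookup.

```dart
import 'dart:typed_data';

/// A parsed ligature: [components] (first glyph included) become [ligGlyph].
class LigatureRule {
  LigatureRule(this.ligGlyph, this.components);

  final int ligGlyph;
  final List<int> components;
}

/// Reads the glyph IDs covered by a Coverage table, in coverage-index order.
List<int> readCoverage(ByteData data, int offset) {
  final format = data.getUint16(offset);
  final count = data.getUint16(offset + 2);
  final glyphs = <int>[];
  if (format == 1) {
    // Format 1: a plain array of glyph IDs.
    for (var i = 0; i < count; i++) {
      glyphs.add(data.getUint16(offset + 4 + 2 * i));
    }
  } else if (format == 2) {
    // Format 2: records of (startGlyphID, endGlyphID, startCoverageIndex).
    for (var i = 0; i < count; i++) {
      final record = offset + 4 + 6 * i;
      final start = data.getUint16(record);
      final end = data.getUint16(record + 2);
      for (var g = start; g <= end; g++) {
        glyphs.add(g);
      }
    }
  } else {
    throw FormatException('Unknown coverage format $format');
  }
  return glyphs;
}

/// Parses a Ligature Substitution subtable (format 1) starting at [offset].
List<LigatureRule> readLigatureSubst(ByteData data, int offset) {
  final format = data.getUint16(offset);
  if (format != 1) throw FormatException('Unexpected substFormat $format');
  // The Coverage table lists the first glyph of each LigatureSet.
  final firstGlyphs = readCoverage(data, offset + data.getUint16(offset + 2));
  final setCount = data.getUint16(offset + 4);
  final rules = <LigatureRule>[];
  for (var s = 0; s < setCount; s++) {
    // LigatureSet offsets are relative to the start of the subtable.
    final setOffset = offset + data.getUint16(offset + 6 + 2 * s);
    final ligCount = data.getUint16(setOffset);
    for (var l = 0; l < ligCount; l++) {
      // Ligature offsets are relative to the start of the LigatureSet.
      final lig = setOffset + data.getUint16(setOffset + 2 + 2 * l);
      final componentCount = data.getUint16(lig + 2);
      rules.add(LigatureRule(data.getUint16(lig), [
        firstGlyphs[s],
        // Only componentCount - 1 glyphs are stored; the first is implied.
        for (var c = 0; c < componentCount - 1; c++)
          data.getUint16(lig + 4 + 2 * c),
      ]));
    }
  }
  return rules;
}
```

ByteData reads big-endian by default, which matches the byte order of font files. The parser would call this with the absolute offset of each subtable belonging to a type 4 lookup.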

DavBfr avatar Jan 03 '20 17:01 DavBfr

Thank you for your answer. I am starting to understand how this works, but I am totally new to this and have no idea about the variables in the parser. Can you write a function or a block of code just to parse this as an example?

{
  "ligGlyph": 237,
  "components": [
    102,
    86
  ]
}

Maybe after that I can follow that code and implement the rest?

sh0umik avatar Jan 03 '20 18:01 sh0umik

If you look at _parseGlyphs() in the same file, it will look the same: read the binary file content using bytes.getInt16(offset); or getInt32 and friends. Just follow the MS documentation to know what to read.

In the function _parseGsub(), your start offset is basePosition, which gives you access to the offsets of the next tables: scriptList, featureList, lookupList.

The first to parse is the featureList described here: https://docs.microsoft.com/en-us/typography/opentype/spec/chapter2#flTbl

and get all the features with the right tag, maybe https://docs.microsoft.com/en-us/typography/opentype/spec/features_ae#blws

Then the lookupList will give the right glyphs to replace.
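
The featureList step can be sketched as follows, using the FeatureList and Feature table layouts from the linked chapter of the specification. This is a standalone illustration over a ByteData rather than code for ttf_parser.dart, and readFeatureList is a hypothetical name.

```dart
import 'dart:typed_data';

/// Reads the FeatureList at [offset] and returns, for each feature tag,
/// the lookup indices it references. A tag can appear in several
/// FeatureRecords, so the indices are merged per tag.
Map<String, List<int>> readFeatureList(ByteData data, int offset) {
  final featureCount = data.getUint16(offset);
  final features = <String, List<int>>{};
  for (var i = 0; i < featureCount; i++) {
    // Each FeatureRecord is a 4-byte tag followed by a 16-bit offset.
    final record = offset + 2 + 6 * i;
    final tag = String.fromCharCodes([
      for (var b = 0; b < 4; b++) data.getUint8(record + b),
    ]);
    // The Feature table offset is relative to the start of the FeatureList.
    final feature = offset + data.getUint16(record + 4);
    // Skip featureParamsOffset (2 bytes), then read the lookup indices.
    final lookupIndexCount = data.getUint16(feature + 2);
    final indices = features.putIfAbsent(tag, () => <int>[]);
    for (var k = 0; k < lookupIndexCount; k++) {
      final index = data.getUint16(feature + 4 + 2 * k);
      if (!indices.contains(index)) indices.add(index);
    }
  }
  return features;
}
```

The returned indices point into the lookupList, so a lookup such as features['blws'] would give the lookups to read next, if the font defines that feature at all. Which tags actually matter for a given script is a separate question that this sketch does not answer.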

DavBfr avatar Jan 03 '20 18:01 DavBfr

Is there any chance you can add this feature soon? I badly need this :( Since I have no idea about the variables used in the parsing and how it works, I don't think I will be able to help much.

Eagerly waiting for your answer on this. Could you please find some spare time to add this feature, or set a milestone for it?

sh0umik avatar Feb 27 '20 19:02 sh0umik

@DavBfr which functions should I use to find and replace the glyphs?

if we want to draw the glyphs 102 and 86 next to each other, we replace with 237

I could not find any substitution function in ttf_parser. Is it in the charToGlyphIndexMap map?

sh0umik avatar Feb 28 '20 04:02 sh0umik

How can I help you guys?

Peet-A avatar Mar 07 '20 16:03 Peet-A

I think all the needed info is on this ticket. I don't have time to work on this now. Unless some of you are willing to pay for the feature. Then I could take a look at crowdfunding. This also includes Arabic and other languages.

DavBfr avatar Mar 07 '20 21:03 DavBfr

Please use this link: https://www.bountysource.com/issues/86211023-bangla-unicode-font-ttf-gets-broken-during-pdf-generation

DavBfr avatar Mar 23 '20 17:03 DavBfr

@DavBfr Hi, I'm able to prepare the featureList. What should I do next? Should I prepare the lookup table?

ashutosh1211 avatar Jun 19 '20 15:06 ashutosh1211

Yes, the lookup table is next. I already have some code for that. Let me push it to the arabic-fonts branch.

DavBfr avatar Jun 19 '20 15:06 DavBfr

Sure :) I'm stuck at subtable parsing in the lookup table.
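
For readers at the same step, here is a minimal sketch of walking the LookupList down to the subtable offsets, based on the LookupList and Lookup table layouts in the OpenType specification. It is not the code from the arabic-fonts branch; LookupInfo and readLookupList are hypothetical names.

```dart
import 'dart:typed_data';

/// A lookup's type, flags, and the absolute offsets of its subtables.
class LookupInfo {
  LookupInfo(this.type, this.flag, this.subtableOffsets);

  final int type;
  final int flag;
  final List<int> subtableOffsets;
}

/// Reads every Lookup table referenced by the LookupList at [offset].
List<LookupInfo> readLookupList(ByteData data, int offset) {
  final lookupCount = data.getUint16(offset);
  final lookups = <LookupInfo>[];
  for (var i = 0; i < lookupCount; i++) {
    // Lookup offsets are relative to the start of the LookupList.
    final lookup = offset + data.getUint16(offset + 2 + 2 * i);
    final type = data.getUint16(lookup);
    final flag = data.getUint16(lookup + 2);
    final subTableCount = data.getUint16(lookup + 4);
    lookups.add(LookupInfo(type, flag, [
      // Subtable offsets are relative to the start of the Lookup table.
      for (var s = 0; s < subTableCount; s++)
        lookup + data.getUint16(lookup + 6 + 2 * s),
    ]));
  }
  return lookups;
}
```

Each subtable is then parsed according to the lookup type; type 4 is the ligature substitution discussed earlier in the thread. One thing to watch for is type 7 (extension substitution), where the subtable only holds the real lookup type and a 32-bit offset to the actual subtable.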

ashutosh1211 avatar Jun 19 '20 15:06 ashutosh1211

I am also facing the same problem while generating a PDF: Bangla Unicode characters are broken.

shofizone avatar Nov 16 '20 14:11 shofizone

Did you find any solution?

pluzmedia avatar Jan 07 '21 14:01 pluzmedia

Hi @pluzmedia, JFYI: I achieved PDF output for complex script layout with the following steps:

  1. Created an HTML template layout using mustache5 standards
  2. Generated HTML filled with data using the Mustache package
  3. Used the following code from the printing package, which directly opens the print dialog on supported platforms:
await Printing.layoutPdf(
    onLayout: (PdfPageFormat format) async => await Printing.convertHtml(
          format: format,
          html: '<html><body><p>Hello!</p></body></html>', // pass generated html here
        ));

pratikmmohite avatar Jan 17 '21 10:01 pratikmmohite

Thank you, I am doing the same.

pluzmedia avatar Jan 17 '21 15:01 pluzmedia

Same problem here. Any update on this?

Ya-seeen avatar Apr 21 '21 20:04 Ya-seeen

@Ya-seeen no, I don't think anyone is working on it. You can look at the Arabic shaper class and implement the same, that would be much appreciated!

DavBfr avatar Apr 21 '21 20:04 DavBfr

Could you please provide me a link to this?

Ya-seeen avatar Apr 21 '21 20:04 Ya-seeen

Yes, it's this file: https://github.com/DavBfr/dart_pdf/blob/master/pdf/lib/src/pdf/arabic.dart You will have some replacements on Unicode codepoints to do, but we'll have to find a way to enable it, as this PdfArabic class is used only for right-to-left text direction.

Another more generic way is to implement proper font ligature parsing in https://github.com/DavBfr/dart_pdf/blob/master/pdf/lib/src/pdf/ttf_parser.dart. Ashutosh started this here: https://github.com/ashutosh1211/dart_pdf/commits/master

DavBfr avatar Apr 21 '21 22:04 DavBfr