Implement CJK support

Open ProgramFan opened this issue 9 years ago • 3 comments

Due to a lack of fonts, the current asciidoctor-fopub does not support CJK documents. CJK characters are rendered as "#" in the output PDFs. This is annoying for CJK users.

This pull request fixes the above problem using open source fonts and automatic language detection. To be specific, it implements the following functionalities:

  1. Download and install the KaiGenGothic fonts when the user requests them. KaiGenGothic is the TTF version of the Adobe Source Han Sans fonts, created by akiratw? and released as open source. Since the fonts are open-sourced by their legal owners, there are no copyright problems. Here I use the enhanced version (+bold, +italic, +bold_italic) created by chloerei at https://github.com/chloerei/asciidoctor-pdf-cjk-kai_gen_gothic.
  2. Add static font definitions to the original fop-config.xml. This makes sure the installed fonts are selected, and selected by their correct names.
  3. Add language detection in fopub to automatically detect the language code and pass it to fop. The detection just scans the input XML for "xml:lang" attributes and defaults to "en".
  4. Add automatic font switching in fo-pdf.xsl. KaiGenGothic is used for CJK documents and the original fonts are used for all other languages. No user action is needed.
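
As a rough illustration of items 3 and 4, the detection and font choice could look something like the sketch below. This is not the code from the pull request: the function names, the set of CJK language prefixes, and the font family strings are all assumptions made for the example, and the real change does the font switching in fo-pdf.xsl rather than in Python.

```python
import re

# Hypothetical sketch; names and font family strings are illustrative only.
# Matches xml:lang="..." or xml:lang='...' anywhere in the input text.
LANG_ATTR = re.compile(r'xml:lang\s*=\s*["\']([^"\']+)["\']')

# Assumed primary language subtags that should trigger the CJK font.
CJK_PRIMARY_TAGS = ("zh", "ja", "ko")


def detect_lang(xml_text, default="en"):
    """Return the value of the first xml:lang attribute, or the default."""
    match = LANG_ATTR.search(xml_text)
    return match.group(1) if match else default


def is_cjk(lang):
    """Return True for codes such as zh_CN, zh-TW, ja, or ko."""
    primary = re.split(r"[-_]", lang.lower())[0]
    return primary in CJK_PRIMARY_TAGS


def pick_font(lang, cjk_font="KaiGenGothic", default_font="original-fonts"):
    """Choose a font family name based on the detected language code."""
    return cjk_font if is_cjk(lang) else default_font
```

A plain text scan like this mirrors the "just scans the input XML" description; a real XML parser would be more robust but is not what the description above claims to use.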

I have tested it with README.adoc by setting the lang attribute to zh_CN, zh_TW, ja and ko. The chapter titles and figure captions work fine. I have used it for full Chinese documents for a while and it just works.

Caveat: The 'fopub' script checks and modifies fop-config.xml for the font path on every invocation. It would be nice if there were another way to do it. I tried to use a post-install hook but failed. One could still use a gradle job to change the font path on the first fopub invocation, but I am not very familiar with gradle. Help is welcome.

ProgramFan avatar May 07 '16 02:05 ProgramFan

@nicorikken @mojavelinux Would you please review this?

ProgramFan avatar May 11 '16 15:05 ProgramFan

Thanks for asking me to review, because it is sad to see PRs just hanging. Never having used this package, not clearly understanding the use case, and not having any experience with CJK language support, I find it difficult to review this. But, based on your comments and the code, this seems like a good PR:

  • Proper method for language detection/selection.
  • Code structure improvement by moving all the font definitions into a separate source file.

I'm not sure how the other fonts are handled, but I would opt for including the fonts rather than providing a manual GitHub download to execute. Maybe a proper CJK font gem is already available, which would save you from writing and maintaining your own download solution?

Is there some way to check for CJK fonts present on the system, to prevent having to manually call specific directories? Another solution might be to load the fonts locally in a specific fonts directory, such that the paths in fop-config.xml can be hardcoded. I don't know if manual edits to fop-config.xml are needed or typical, but having dynamically expanded content in a file that users may edit by hand can introduce unexpected behavior.

I hope this helps you out for now, otherwise give me another heads up.

nicorikken avatar May 21 '16 11:05 nicorikken

Thanks for the comments, @nicorikken. It would be nice to be able to build a separate CJK font gem. However, since the font part belongs to docbook-xsl, which does not fit in any gem, I could not find a proper way to distribute the fonts. The docbook-xsl part is installed by fopub at the first run.

I don't quite like the dynamically expanded @FONTBASE@, either. Maybe I can use gradle to modify it at the first run of fopub. But I don't quite understand the gradle system and specifically how to modify a file as a gradle task. Help is welcome.
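
For what it is worth, the one-time expansion could be sketched roughly as follows. This is not a gradle task and not the code in this PR: it assumes a separate template file that still contains @FONTBASE@, and the function names and file layout are hypothetical.

```python
from pathlib import Path

PLACEHOLDER = "@FONTBASE@"


def expand_font_base(template_text, font_dir):
    """Replace every @FONTBASE@ with the absolute path of the font directory."""
    return template_text.replace(PLACEHOLDER, Path(font_dir).resolve().as_posix())


def write_config_once(template_path, output_path, font_dir):
    """Generate the config from the template only if it does not exist yet.

    Returns True if the file was written, False if it was already present.
    """
    output = Path(output_path)
    if output.exists():
        return False
    text = Path(template_path).read_text(encoding="utf-8")
    output.write_text(expand_font_base(text, font_dir), encoding="utf-8")
    return True
```

Keeping the template separate from the generated file would mean later fopub runs only pay for an existence check, and a user who edits the generated fop-config.xml by hand would not have those edits overwritten.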

As for the review, I am afraid we may have to ask @mojavelinux for help.

ProgramFan avatar May 21 '16 13:05 ProgramFan

I don't want this program to be downloading fonts from unofficial sources, especially since the origin of those fonts isn't even clear.

I'd be okay with the Noto Fonts described here: https://docs.asciidoctor.org/pdf-converter/latest/theme/cjk/#obtain-the-ttf-fonts But I understand more fonts may still be needed. That is just outside the scope of this project. If you need this functionality, fork the project and add it to your own copy.

mojavelinux avatar Oct 26 '23 10:10 mojavelinux