{nb} breaks if text shaping is turned on with certain fonts

Open catsclaw opened this issue 1 year ago • 9 comments

The special {nb} code fails with some fonts when text shaping is turned on.

Minimal code reproducing the issue:

from fpdf import FPDF

pdf = FPDF(format='letter')
pdf.add_font('gentium', style='', fname='GenBkBasR.ttf')
pdf.add_page()
pdf.set_font('gentium', '', 24)
# Text shaping on: {nb} is not replaced with this font
pdf.set_text_shaping(True)
pdf.write(text='Pages {nb}')
pdf.ln()
# Text shaping off: {nb} is replaced by the page count
pdf.set_text_shaping(False)
pdf.write(text='Pages {nb}')
pdf.output('test.pdf')

Result: image

Environment:

  • Operating System: Ubuntu
  • Python version: 11
  • fpdf2 version used: 2.7.7

catsclaw avatar Jan 12 '24 19:01 catsclaw

Thank you for the report @catsclaw

You are right, those two features are currently incompatible.

The reason is that with text shaping, each character is rendered individually in the PDF (with a dedicated Tj operator for each letter). But FPDF._substitute_page_number() expects the sequence {nb} to be present inside a single "PDF string" (rendered by a single Tj operator).
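A rough illustration of that mismatch, not fpdf2's actual code: the function name `substitute_alias` and the simplified content streams below are assumptions made for the example (real streams for TrueType fonts encode the text differently), but the principle is the same. A plain search-and-replace only finds the alias when its characters sit next to each other in one string.

```python
def substitute_alias(content: bytes, alias: bytes, total_pages: int) -> bytes:
    """Replace every literal occurrence of `alias` in a page content stream."""
    return content.replace(alias, str(total_pages).encode("latin-1"))


# Hypothetical stream where the whole text is one PDF string shown by one Tj.
single_tj = b"BT (Pages {nb}) Tj ET"

# Hypothetical stream where every character has its own string and Tj.
split_tj = (
    b"BT (P) Tj (a) Tj (g) Tj (e) Tj (s) Tj ( ) Tj "
    b"({) Tj (n) Tj (b) Tj (}) Tj ET"
)

print(substitute_alias(single_tj, b"{nb}", 3))  # b'BT (Pages 3) Tj ET'
# The byte sequence "{nb}" never appears contiguously, so nothing changes.
print(substitute_alias(split_tj, b"{nb}", 3) == split_tj)  # True
```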

As a consequence, this is currently a limitation in fpdf2. We should mention it in our documentation (in docs/PageBreaks.md). And PRs are welcome to implement this feature also when text shaping is enabled!

Would you be interested in contributing to this @catsclaw? (docs improvement and/or implementation)

Lucas-C avatar Jan 12 '24 20:01 Lucas-C

The characters are rendered as a single sequence as long as each one only advances along the x axis by its own width. If there is any offset (kerning, etc.), we need to adjust the text matrix and emit individual Tj operators. That's why only some fonts have this problem.
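A minimal sketch of that grouping rule, under stated assumptions: `ShapedGlyph`, `split_into_runs` and all the metric values are hypothetical and do not come from fpdf2 or from Gentium. A glyph with an offset is drawn on its own, and a glyph whose advance differs from the default ends the current run.

```python
from dataclasses import dataclass


@dataclass
class ShapedGlyph:
    char: str
    default_advance: int  # width from the font metrics (made-up units)
    x_advance: int        # advance reported by the shaper
    x_offset: int = 0
    y_offset: int = 0


def split_into_runs(glyphs):
    """Group glyphs into strings that could each be shown by one Tj."""
    runs, current = [], ""
    for g in glyphs:
        has_offset = bool(g.x_offset or g.y_offset)
        if has_offset and current:
            # The glyph must be repositioned before drawing: close the run.
            runs.append(current)
            current = ""
        current += g.char
        if has_offset or g.x_advance != g.default_advance:
            # The pen is not where a plain string would leave it,
            # so the next glyph has to start a new run.
            runs.append(current)
            current = ""
    if current:
        runs.append(current)
    return runs


# No adjustments at all: one run, the alias stays intact.
plain = [ShapedGlyph(c, 500, 500) for c in "Pages {nb}"]
# Hypothetical kerning after "{": the alias ends up in two runs.
kerned = (
    [ShapedGlyph(c, 500, 500) for c in "Pages "]
    + [ShapedGlyph("{", 500, 480)]
    + [ShapedGlyph(c, 500, 500) for c in "nb}"]
)

print(split_into_runs(plain))   # ['Pages {nb}']
print(split_into_runs(kerned))  # ['Pages {', 'nb}']
```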

andersonhc avatar Jan 12 '24 20:01 andersonhc

Adding to this issue:

  • When you have the alias ({nb}) in the text of a multi cell with justified alignment, the line width is calculated with the width of the alias instead of the final number, so your text won't be correctly justified
  • If multi cell splits the alias across 2 lines, it won't be replaced by the number of pages.

Example:

from fpdf import FPDF
text="Lorem ipsum dolor sit amet, {nb} {nb} {nb} {nb} {nb} {nb} consectetur adipiscing elit. {nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}Mauris sit amet lacus ut ex tincidunt vulputate non nec mauris. Lorem ipsum dolor sit amet, consectetur adipiscing elit."
pdf = FPDF()
pdf.add_page()
pdf.set_font("helvetica", "", 24)
pdf.multi_cell(w=pdf.epw, text=text, align="J", new_x="LEFT")
pdf.output('test_nb.pdf')

Result: issue_nb

The problem is that the replacement is done directly in the page content after all the rendering is done. I don't see an obvious way to fix it, and it will probably require a lot of rework of how output works.
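Both effects can be reproduced with a toy model. This is only a sketch: the fixed `CHAR_WIDTH`, the helper names and the character-based wrapping are assumptions for illustration, not how fpdf2 measures or breaks text.

```python
CHAR_WIDTH = 10  # hypothetical: every character is 10 units wide
ALIAS = "{nb}"


def text_width(s: str) -> int:
    return len(s) * CHAR_WIDTH


def justified_gap(line: str, target_width: int) -> float:
    """Width each space needs so the line fills target_width."""
    words = line.split(" ")
    return (target_width - text_width("".join(words))) / (len(words) - 1)


def rendered_width(line: str, gap: float) -> float:
    words = line.split(" ")
    return text_width("".join(words)) + gap * (len(words) - 1)


# 1) Justification: the gap is computed while the alias is still in the text.
line = "total pages: {nb}"
target = 300
gap = justified_gap(line, target)
final_line = line.replace(ALIAS, "7")  # substitution happens after layout
shortfall = target - rendered_width(final_line, gap)
print(shortfall)  # 30.0, the three characters that disappeared


# 2) Line breaking: a break inside the alias hides it from the replacement.
def wrap_by_chars(text: str, max_chars: int) -> list:
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


wrapped = wrap_by_chars("{nb}{nb}{nb}", 6)
print(wrapped)                                   # ['{nb}{n', 'b}{nb}']
print([part.replace(ALIAS, "7") for part in wrapped])  # ['7{n', 'b}7']
```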

andersonhc avatar Jan 27 '24 14:01 andersonhc

The underlying problem here is that an otherwise legitimate sequence of text characters is given a special meaning under certain circumstances. This was bound to result in conflicts somewhere down the line.

The clean solution would be to use a reserved Unicode character for this purpose, which can't otherwise appear in renderable text. A practical approach might be to convert self.str_alias_nb_pages into a special Glyph() subtype (say NbGlyph()) during text parsing. When rendering, NbGlyph() then inserts a sequence of three or four of this reserved Unicode character. And before writing the file, the reserved character sequences get replaced with the right sequence of digit glyphs.
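One possible reading of this idea, as a sketch only: the code point U+E000 (from the Private Use Area), the slot count of four, and the names `mark_alias` and `fill_placeholders` are all assumptions chosen for illustration, and nothing here is existing fpdf2 API. The output is modelled as a list of glyphs so that the replacement still works when every glyph is emitted separately.

```python
PLACEHOLDER = "\uE000"  # assumed reserved character (Private Use Area)
SLOTS = 4               # assumed number of placeholders per alias


def mark_alias(text: str, alias: str = "{nb}") -> str:
    """Text parsing step: swap the alias for a run of reserved characters."""
    return text.replace(alias, PLACEHOLDER * SLOTS)


def fill_placeholders(glyphs, total_pages: int) -> list:
    """Final step: turn each run of placeholders into the page-count digits."""
    digits = str(total_pages)
    if len(digits) > SLOTS:
        raise ValueError("page count needs more digits than reserved slots")
    out, slot = [], 0
    for g in glyphs:
        if g == PLACEHOLDER:
            pos = slot % SLOTS  # position inside the current alias
            if pos < len(digits):
                out.append(digits[pos])
            # Unused slots are dropped.
            slot += 1
        else:
            slot = 0
            out.append(g)
    return out


glyphs = list(mark_alias("Page 2 of {nb}"))  # one entry per rendered glyph
print("".join(fill_placeholders(glyphs, 12)))  # Page 2 of 12
```

Note that this only removes the ambiguity of the alias. The width of the final number is still unknown at layout time, so the justification problem described earlier in the thread would remain.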

Or am I missing some basic obstacle here? Yes, various places in the code need to learn about this special case, but that is kind of inevitable if we want to avoid conflicts.

gmischler avatar Feb 22 '24 14:02 gmischler

Just relaying @andersonhc's comment from https://github.com/py-pdf/fpdf2/issues/71#issuecomment-2123516795:

I am going to work on this issue soon and we will hopefully have a final solution before the next release.

In the meantime I can suggest 2 workarounds:

  • Using a different font on your footer
  • Using a different page number alias - i.e. calling self.alias_nb_pages(alias="####") and writing f"{self.page_no()}/####" on your trailer

Lucas-C avatar May 24 '24 08:05 Lucas-C

Would it be possible in the meantime to (loudly) call this out in the docs for both set_text_shaping() and alias_nb_pages()? I just spent quite a while chasing down the reason for {nb} failing to replace in unpredictable ways.

The underlying cause makes perfect sense in retrospect now that I understand what's happening, but in the moment, man that was a weird problem to chase down. :sweat_smile:

smitelli avatar Jun 13 '24 14:06 smitelli

> Would it be possible in the meantime to (loudly) call this out in the docs for both set_text_shaping() and alias_nb_pages()?

Yes, I think that is a good idea 🙂 Would you like to submit a PR?

Lucas-C avatar Jun 13 '24 18:06 Lucas-C

~~This fixup has been included in the new release published today:~~ ~~https://github.com/py-pdf/fpdf2/releases/tag/2.8.2~~

Lucas-C avatar Dec 16 '24 12:12 Lucas-C

My mistake: the fix in PR https://github.com/py-pdf/fpdf2/pull/1288 did not solve the underlying issue, it just re-enabled the workaround.

Re-opening this now.

Lucas-C avatar Feb 24 '25 16:02 Lucas-C