Non-breaking spaces have different size
In this rendering:
The first two words (o kliknutí) have an NBSP between them (Czech typography rules). However, IMHO this should only forbid line breaks; there is no reason for this space to have a different size than other inter-word spaces on the same line.
You’re right. The current justification code adds extra space to space characters only. There are many places in the code where we assume that spaces are only "normal" spaces.
In this case, the problem is in: https://github.com/Kozea/WeasyPrint/blob/1aae1452da384de0160f599d4e2500aa1f3bebfc/weasyprint/layout/inline.py#L1119-L1125
We count spaces and use the count to set justification_spacing. We should change our space detection here:
https://github.com/Kozea/WeasyPrint/blob/1aae1452da384de0160f599d4e2500aa1f3bebfc/weasyprint/layout/inline.py#L1131
And fix our spacing adjustment here: https://github.com/Kozea/WeasyPrint/blob/1aae1452da384de0160f599d4e2500aa1f3bebfc/weasyprint/text/line_break.py#L182-L196
For now we don’t have to support all justification opportunities as we don’t support text-justify, but we can at least support word separators. I’m not sure that a real list is actually defined in Unicode, as there are exceptions such as punctuation and fixed-width spaces. We can at least start with the list given by the specification.
I’m not sure that a real list is actually defined in Unicode, as there are exceptions such as punctuation and fixed-width spaces.
Pango provides is_expandable_space, so we have everything we need to support this correctly.
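To make the idea concrete, here is a minimal sketch of how the extra width could be shared among word separators once NBSP is counted too. It is not WeasyPrint's actual code: the helper name `justification_spacing`, its parameters, and the `WORD_SEPARATORS` set are hypothetical, and the set only holds the two characters discussed in this thread rather than the full list from the specification.

```python
# Hypothetical helper for illustration only, not WeasyPrint's implementation.
# Assumption: only SPACE and NO-BREAK SPACE are treated as word separators.
WORD_SEPARATORS = {"\u0020", "\u00a0"}


def justification_spacing(text, line_width, available_width):
    """Return the extra width to add after each word separator in text."""
    count = sum(1 for char in text if char in WORD_SEPARATORS)
    if count == 0:
        # Nothing to stretch: leave the line as it is.
        return 0.0
    return (available_width - line_width) / count


# The NBSP after "o" is counted like the two normal spaces, so all three
# gaps receive the same extra width.
spacing = justification_spacing("o\u00a0kliknutí na text", 70, 100)
```

With this counting, a narrow no-break space (\u202F) would not be stretched, which matches the intent that fixed-width spaces keep their size.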
Hello. I committed a fix for the issue. Should I write a test for it? If yes, how?
I did not use Pango's is_expandable_space, as it just searches for \u0020 and \u0040.
Hi @luca-vercelli,
Thanks for your commit.
I did not use Pango's is_expandable_space, as it just searches for \u0020 and \u0040.
You got the idea. It actually looks for \u0020 and \u00A0, that’s what we want, but you’re right, we can look for these characters ourselves. I think we can even use a regular expression instead of iterating over the bytestring.
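As a rough illustration of the regular-expression idea, the sketch below finds \u0020 and \u00A0 directly in a UTF-8 bytestring. The names `EXPANDABLE_SPACE` and `expandable_space_offsets` are hypothetical, and it assumes the text is valid UTF-8, so it should be read as a starting point rather than the code that ended up in the fix.

```python
import re

# In UTF-8, U+0020 is the single byte 0x20 and U+00A0 is the pair 0xC2 0xA0.
# Neither pattern can match inside another valid multi-byte sequence.
EXPANDABLE_SPACE = re.compile(b"\x20|\xc2\xa0")


def expandable_space_offsets(utf8_text):
    """Return a (byte offset, byte length) pair for each expandable space."""
    return [
        (match.start(), match.end() - match.start())
        for match in EXPANDABLE_SPACE.finditer(utf8_text)
    ]


# A normal space, an NBSP, then a narrow NBSP (\u202F) that must be ignored.
sample = "a b\u00a0c\u202fd".encode("utf-8")
offsets = expandable_space_offsets(sample)  # [(1, 1), (3, 2)]
```

Returning the byte length alongside the offset matters because the NBSP takes two bytes, so any code that walks byte indices has to skip past both.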
Could you please open a PR so that I can tweak a couple of things in your commit?
Should I write a test for it? If yes, how?
That would be great! You can add a test in tests/draw/test_text.py, copy test_text_align_justify and try different spaces (\u0020, \u00a0, \u202F…) and check that only the first two spaces get extra space when justified.
If you don’t get the logic of these tests, don’t worry, I’ll add one for you.
Fixed by #2390.