Support border collapse in bidi context
Attempt to pass the following tests in WeasyPrint test suites:
- http://test.weasyprint.org/suite-css21/chapter17/section18/test1/
- http://test.weasyprint.org/suite-css21/chapter17/section18/test2/
- http://test.weasyprint.org/suite-css21/chapter17/section18/test3/
- http://test.weasyprint.org/suite-css21/chapter17/section18/test4/
The WeasyPrint test states that:
The two tables should be identical:
The 2 rendered tables in each of the following 2 tests are not exactly identical:
- test_border_collapse_bidi_border_top
- test_border_collapse_bidi_border_bottom
There is a 1 pixel difference in the cell background and horizontal border depending on the drawing direction. If necessary it can be fixed by reversing the drawing direction for rtl:
diff --git a/weasyprint/draw.py b/weasyprint/draw.py
index 6f678358..9f38ee23 100644
--- a/weasyprint/draw.py
+++ b/weasyprint/draw.py
@@ -826,7 +826,11 @@ def draw_table(stream, table):
draw_background(stream, row_group.background)
for row in row_group.children:
draw_background(stream, row.background)
- for cell in row.children:
+ if table.style['direction'] == 'rtl':
+ cells = reversed(row.children)
+ else:
+ cells = row.children
+ for cell in cells:
if (table.style['border_collapse'] == 'collapse' or
cell.style['empty_cells'] == 'show' or
not cell.empty):
@@ -951,11 +955,15 @@ def draw_collapsed_borders(stream, table):
score, style, width, color, 'top',
(pos_x1, pos_y, pos_x2 - pos_x1, 0)))
- for x in range(grid_width):
+ if table.style['direction'] == 'rtl':
+ grid_width_range = range(grid_width - 1, -1, -1)
+ else:
+ grid_width_range = range(grid_width)
+ for x in grid_width_range:
add_horizontal(x, 0)
for y in range(grid_height):
add_vertical(0, y)
- for x in range(grid_width):
+ for x in grid_width_range:
add_vertical(x + 1, y)
add_horizontal(x, y + 1)
Thanks for the pull request.
This one is really complicated! There’s at least one big problem with your PR: box position can’t depend on the direction, coordinates are always relative to the left and the top of the layout. It means that, with rtl direction, the left position has to be set on the box, even if its layout depends on its right side and the right side of its parent.
Maybe the best way to handle this correctly is to handle CSS Logical first, so that we can use left/right and start/end correctly. And maybe it would require to support CSS Writing Modes first.
That’s definitely a looooot of work! 😱
If you find a way to handle rtl border collapse without changing *_box_x functions, we’ll merge this PR with pleasure. But you’ve been warned: it may be "a bit" complex 😄.
(Note that I’ve also tried to fix #2055, but not pushed my code yet. I’ll clean my code and push it soon.)
I’ve open #2058 so that you can see what I’ve managed to get. It’s far from perfect, and as explained above I’m not sure that we can easily get a "clean" solution.
(I should have pushed this code earlier.)
Feel free to "steal" some of my code if you want!
If you find a way to handle rtl border collapse without changing
*_box_xfunctions
Currently changing content_box_x to:
https://github.com/Kozea/WeasyPrint/blob/2b0930c601b7f4fd9a9130e5881f3326bbe8a9a7/weasyprint/formatting_structure/boxes.py#L172-L179
appears to "solve" #2055 and has not failed any test which gives the impression that there are no problems with the PR.
Appreciate if you can point out available test cases that fail it or some reference so that relevant test cases can be written.
Appreciate if you can point out available test cases that fail it or some reference so that relevant test cases can be written.
Here’s an example:
<div style="float: left; border-right: 100px solid black; direction: rtl">abc</div>
x is by definition the distance from the left, it doesn’t depend on the text direction. Believe me, that’s a really bad idea to change this. 😄
If you find a way to handle rtl border collapse without changing
*_box_xfunctions
Didn't change *_box_x functions but handled rtl for TableCellBox in avoid_collisions as special case.
Commit 8ef14df also added test case based on https://github.com/Kozea/WeasyPrint/pull/2058#issuecomment-1932436107
Thanks for your work on this.
Could you please avoid to change draw.py and boxes.py? What we want is to fix the layout. Changing what x means is a workaround, and adding some code in draw is a workaround to fix the workaround. I’m afraid to maintain this. 😄
I can help with that if you prefer!
adding some code in
drawis a workaround to fix the workaround
I would disagree that the code added to draw is a workaround but rather a deliberate attempt to always draw from left to right when boxes are arranged from right to left.
When leftmost cell is drawn first, the rightmost cell will "overlap" the leftmost cell if the position_x and/or width is not absolute numbers. Likewise, when rightmost cell is drawn first, the leftmost cell will "overlap" the rightmost cell.
This is not noticed in pdf as I think it's rendered in "vector" but when rendered as pixel, the one pixel difference, bounded in red, can be seen when zoom in:
- The
column_positionsandcolumn_widthsvalues are similar but in reverse order: a.[3.5, 28.78125]and[25.28125, 86.015625]for top table b.[28.78125, 3.5]and[86.015625, 25.28125]for bottom table - The top table is drawn from left-to-right, the yellow background (from right cell) overlaps the aqua background.
- The bottom table is drawn from right-to-left, the aqua background (from left cell) overlaps the yellow background.
This is not noticed in pdf as I think it's rendered in "vector" but when rendered as pixel, the one pixel difference, bounded in red, can be seen when zoom in:
Oh, OK, I understand now. Fighting against rasterization problems is a well-known problem in WeasyPrint. 😄
For tests, we usually build our cases in a way that positions and sizes are are integers. That’s sometimes a bit frustrating, but we prefer to have complexity in the way tests are built than in the library’s code.
For real-life documents, we usually don’t care about rasterization artifacts. The different problems we can get depend on the rendering library, OS, screen size, zoom level, and other crazy reasons (including hardware, true story 😁). So, we just can’t handle this correctly and reliably. We tried hard in the past, and it’s often caused more harm than we thought…
If you think that your code is ready to review, I’ll take care of cleaning tests!
If you think that your code is ready to review
After putting in more thoughts, I would like to appeal to you to reconsider commit bb14490 or something along that line instead of the latest commit. The latest commit is merely replicating this logic:
https://github.com/Kozea/WeasyPrint/blob/bb144907c070a88fd120f518712aa8345067c951/weasyprint/formatting_structure/boxes.py#L57-L64
to layout/table.py:
https://github.com/Kozea/WeasyPrint/blob/a7f355c83fb3775a65efa3a6252e3fc77efaffb5/weasyprint/layout/table.py#L35-L57
and layout/float.py:
https://github.com/Kozea/WeasyPrint/blob/a7f355c83fb3775a65efa3a6252e3fc77efaffb5/weasyprint/layout/float.py#L158-L168
Although you had made it clear to avoid changing *_box_x functions, I feel putting the logic in content_box_x conveys the intention more clearly as opposed to repeating the logic in layouts.
I was naive to assume this logic applies to all boxes until you pointed out https://github.com/Kozea/WeasyPrint/pull/2056#issuecomment-1932554075. Thus, I tried a mixin approach to make it clear that only relevant boxes (TableBox and TableCellBox) handle rtl. If there are other boxes that need to handle rtl moving forward, then just inherit BidiAware.
I have finally found a solution that is cleaner from my point of view. Both the tree building and the drawing parts are now independent from direction, the rtl logic is only in the table layout. The drawing step always gets left-to-right structures for borders and column widths/positions, where x is the horizontal position from the left edge of the page, according to the common drawing libraries.
Even if this PR is finally not merged, thanks a lot for your work and for your ideas, it’s been very useful to fix this problem. ❤️
The drawing step always gets left-to-right structures for borders and column widths/positions, where x is the horizontal position from the left edge of the page, according to the common drawing libraries.
I think the cells drawing step for rtl table still don't get left-to-right structures, probably due to:
https://github.com/Kozea/WeasyPrint/blob/e7c44e5fcdd5c7a1bf9b453c772d52c9a9ac51c4/weasyprint/formatting_structure/build.py#L945-L947
Using test_tables_2_rtl and tracing x and y values at:
https://github.com/Kozea/WeasyPrint/blob/e7c44e5fcdd5c7a1bf9b453c772d52c9a9ac51c4/weasyprint/draw/border.py#L655
we will get the following drawing order:
x=15.5 y=2.0 w=5.0 h=10.5
x=10.0 y=2.0 w=5.5 h=5.5
x=15.5 y=7.5 w=10.5 h=5.0
x=10.0 y=7.5 w=5.5 h=5.0
x=20.5 y=12.5 w=5.5 h=5.5
x=15.5 y=12.5 w=5.0 h=5.5
From the above x values, cells are still being drawn from right-to-left, thus the one pixel difference problem mentioned in https://github.com/Kozea/WeasyPrint/pull/2056#issuecomment-1941522665 still persists.
Although I am aware this goes against the objective of drawing parts being independent from direction, if it's any help, I would like to point out that to overcome this problem previously, I had to reversed the cells in drawing parts:
for row in row_group.children:
draw_background(stream, row.background)
if table.style['direction'] == 'rtl':
cells = reversed(row.children)
else:
cells = row.children
for cell in cells:
From the above x values, cells are still being drawn from right-to-left, thus the one pixel difference problem mentioned in #2056 (comment) still persists.
As explained above, trying to solve rasterization problems is often out of WeasyPrint’s scope. Using Ghostscript or Pango give different results, and even with the same tool, antialiasing options give very different renderings.
Here’s what I get with Poppler, with default options:
And without vector antialiasing:
The bug you mention doesn’t appear. Maybe it’s not a "bug" in WeasyPrint, but a bug in Ghostscript?
With antialiasing:
Without:
That’s the problem you’re talking about. It appears with the ltr table, so it’s not related to rtl or bidi, and not actually related to this PR.
You can find many other problems in the other renderings, depending on the raster resolution, the position of the table, the rounding of floats, etc. Even if it’s possible to improve the rendering in some cases, for some tools, for some output resolutions, it’s logically impossible to solve them all. WeasyPrint only generates vector PDF, and that’s already really complicated to get it right :). We don’t want to think about rasterization problems, we want to avoid workarounds and extra code for that, because we know that it’s a rabbit hole we can’t afford to fall into, even a little bit.
Actually it's not so much the rasterization "bug" I wish to highlight, but the claim that:
The drawing step always gets left-to-right structures for borders and column widths/positions
I see this more as a side effect due to different drawing direction rather than a "bug". So what I am trying to highlight is that the drawing step doesn't strictly "always gets left-to-right structures for borders and column widths/positions" when table is rtl.
For comparison, the collapsed cell borders in the same test case don't seem to have this side effect, probably due to the border segments being sorted first at:
https://github.com/Kozea/WeasyPrint/blob/e7c44e5fcdd5c7a1bf9b453c772d52c9a9ac51c4/weasyprint/draw/init.py#L457-L458
For this particular test case (not sure if it applies to other more complicated cases), the sorting step causes the cell borders to be drawn in the same order - whether ltr or rtl table.
WeasyPrint only generates vector PDF, and that’s already really complicated to get it right :). We don’t want to think about rasterization problems
Indeed, rasterization is not really the problem I would like to see solve because the vector coordinates are already "perfect".
But if the objective is to always draw left-to-right regardless of the direction (which I see as one way to ease the complication), then I wish to highlight that it's not strictly being met because cells in rtl table are actually being drawn from right-to-left.
Note: I used the rasterization "bug" and "solution" merely to show that by always rendering left-to-right may simplify something already really complicated to get right
But if the objective is to always draw left-to-right regardless of the
direction(which I see as one way to ease the complication), then I wish to highlight that it's not strictly being met because cells in rtl table are actually being drawn from right-to-left.
I wasn’t clear enough, sorry. The goal was to draw from left to right for collapsed borders, not cells. It means that the drawing algorithm doesn’t care about direction anymore (and that’s good news). A better solution would have been to store borders coordinates instead of column directions / positions, but let’s keep this for another version :).
Cells are drawn in the same order as in HTML, just as any other blocks, and that’s what we should do.
The whole drawing algorithm code is now direction agnostic (see the changes in draw/__init__.py), there’s no direction checked in the whole draw module, and that’s a nice achievement. It’s also almost true for formatting_structure/build.py, except for this terrible workaround for markers, and that’s a good thing too. Keeping the direction handling in the layout step is simpler to maintain from my point of view.
So what I am trying to highlight is that the drawing step doesn't strictly "always gets left-to-right structures for borders and column widths/positions" when table is rtl.
I was talking about this:
https://github.com/Kozea/WeasyPrint/blob/e7c44e5fcdd5c7a1bf9b453c772d52c9a9ac51c4/weasyprint/layout/table.py#L593-L596
https://github.com/Kozea/WeasyPrint/blob/e7c44e5fcdd5c7a1bf9b453c772d52c9a9ac51c4/weasyprint/layout/table.py#L966-L980
The borders, column widths and column positions are now always stored left-to-right after the layout.
Note: I used the rasterization "bug" and "solution" merely to show that by always rendering left-to-right may simplify something already really complicated to get right
👍🏽