
Table not printed as shown in browser: weasyprint overrides "table width" with "cell width"

Open andul opened this issue 1 year ago • 9 comments

If the table (table) has a set width and the individual cells (td) have their own widths that are greater than the table width(!), WeasyPrint interprets the widths differently than browsers do:

A table with "table width 500px" and "td width 3000px" will be shown as follows:

  • Browsers take the table width as the relevant width -> the table will be 500px wide
  • WeasyPrint takes the td width as the relevant width -> the table will be 3000px wide (the table will be cut off!)

[Image: table-bug]

Example code

<!DOCTYPE html><html class="view  chrome"  lang="de">
<p> Table Width: 500px </p>
<div>    
<div style="overflow:auto">      
<div class="text-content">            
<table border="1" cellpadding="1" cellspacing="1" style="width:500px">					
<tbody>			
<tr>				
<td style="width:3000px">Width: 3000px</td>				
<td style="width:609px">2</td>			
</tr>			
<tr>				
<td style="width:927px">Width: 927px</td>				
<td style="width:609px">2</td>			
</tr>			
<tr>				
<td style="width:927px">Width: 927px</td>				
<td style="width:609px">2</td>			
</tr>		
</tbody>	     
</table><p>&nbsp;</p>              	
</div>    
</div>  
</div>
</html> 

andul avatar Sep 01 '23 12:09 andul

Hi!

I can see that Chrome and Firefox give this result, but the CSS2 specification seems to say something different:

If the 'table' or 'inline-table' element's 'width' property has a computed value (W) other than 'auto', the used width is the greater of W, CAPMIN, and the minimum width required by all the columns plus cell spacing or borders[…].

Here the "minimum width required by columns" is:

For each column, determine a maximum and minimum column width from the cells that span only that column. The minimum is that required by the cell with the largest minimum cell width (or the column 'width', whichever is larger). […]

And minimum cell width is defined by:

Calculate the minimum content width (MCW) of each cell: the formatted content may span any number of lines but may not overflow the cell box. If the specified 'width' (W) of the cell is greater than MCW, W is the minimum cell width. […]

Here, the first cell’s min width is 3000px (plus borders), so the first column’s min width is 3000px (plus borders), so the table’s width should be 3000px (plus borders plus second column).
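As a rough illustration of the reasoning above, the quoted CSS 2.1 rules can be sketched in a few lines of Python. This is not WeasyPrint's code: the function names and the minimum content widths (60px and 10px) are made-up assumptions, and borders and cell spacing are left out.

```python
def css21_min_cell_width(min_content_width, specified_width=None):
    """CSS 2.1: if the specified width W is greater than MCW, W is the minimum."""
    if specified_width is not None and specified_width > min_content_width:
        return specified_width
    return min_content_width


def css21_table_width(table_width, columns, capmin=0):
    """Used width: the greater of W, CAPMIN and the sum of the column minimums."""
    column_minimums = [
        max(css21_min_cell_width(mcw, width) for mcw, width in cells)
        for cells in columns
    ]
    return max(table_width, capmin, sum(column_minimums))


# Each column is a list of (hypothetical MCW, specified width) pairs, one per row.
example_columns = [
    [(60, 3000), (60, 927), (60, 927)],  # first column of the example
    [(10, 609), (10, 609), (10, 609)],   # second column of the example
]

print(css21_table_width(500, example_columns))
```

With these assumed numbers the table comes out at 3609px rather than 500px, which matches the cut-off table in the report.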

Even in this draft, we can see that:

The used min-width of a table is the greater of the resolved min-width, CAPMIN, and GRIDMIN.

Note that the layout of tables is known to be loosely specified and to give different results depending on the browsers, mainly because browsers were using different algorithms before the specification was written (and nobody knew how it was working in both Netscape and Internet Explorer). Read: nobody cares :/.

So… Is this a "bug" in all browsers? It’s really rare for me to think this (because WeasyPrint is often wrong), but as far as I can tell, right now it is.

If someone finds anything about this in the specification, I’ll be happy to fix the bug. But until then, we can assume that there’s nothing to fix (in WeasyPrint :smile:).

liZe avatar Sep 10 '23 21:09 liZe

Hi @liZe, thanks for taking the time to evaluate this issue! You are right, from the perspective of the specification the behavior of WeasyPrint is right, and all browsers are doing it wrong. But they are all doing it wrong in a consistent way, which somehow overrules the specification. To be realistic, browsers will never fix this just because it is written in the specification, as doing so would change existing content and layouts from the last decades. The question rather is, what kind of HTML print engine WeasyPrint is?

  1. WeasyPrint does it "right in the specification way of having it right"?
  2. WeasyPrint does it "right in the real world usage way of doing it right"?

If WeasyPrint claims to print HTML the way users see it in their browsers, WeasyPrint should definitely orient itself towards the browsers' interpretation, even if this means not following the official specification, as browsers consistently determine how content is created and displayed.

andul avatar Sep 11 '23 14:09 andul

Hi!

The question rather is, what kind of HTML print engine WeasyPrint is?

This question has been answered many times in this bug tracker, and unfortunately for you the answer is 1.

WeasyPrint’s team is not Google/Apple/Mozilla and blindly follows the specification, because each time we didn’t, it was a nightmare afterwards. Following the specification is hard. Following what other browsers do and don’t document is much harder; it doesn’t look like a good idea to me, and in the long term it would be the first of a long list of related issues.

And of course, the first one will be "My document was rendered correctly and is now broken, could you please restore the previous behaviour?" 😄

When browsers, created and maintained by multimillionaire companies, don’t follow a specification, I think that it’s fairer to ask them to fix the "bug" than to ask a two-person company to reverse-engineer what others do wrongly. The other solution is to ask the W3C to change their specification draft, but I think that it won’t happen: David Baron is probably the only one on Earth who cares about that, and in my opinion he knows much better than browsers themselves how tables should be rendered 😁.

But to be honest, I’m surprised to see that browsers all do the same, and I may have missed something in the draft. If it’s the case, I’ll be happy to fix the bug.

If WeasyPrint claims to print HTML the way users see it in their browsers […]

Did it ever claim that? 😄

"We’re not smart enough to be smarter than the specification": that’s what we always claim!

liZe avatar Sep 11 '23 15:09 liZe

@andul Let’s find a solution to your problem instead of discussing the specification :smile:.

Why do you have these widths set on the table and on its cells, and is there a way to change them / remove them? With some context, we can probably find a solution that works well for you without changing WeasyPrint’s algorithm.

liZe avatar Sep 11 '23 15:09 liZe

@liZe thanks for your response and explanation - I understand your point.

The reason why these widths are the way they are: users resize tables with CKEditor 4, which messes up those table cell properties. You do not notice this beforehand, as all browsers show the table correctly, until you print it with WeasyPrint...

I don't know whether all browsers do it wrong or whether the specification matches what they implement. Unfortunately we cannot fix the issue in CKEditor.
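If the editor output cannot be changed, one option might be to post-process the HTML before rendering and drop the `width` declarations from the cells' inline styles. The sketch below is only an assumption about how that could look: `strip_cell_widths` is a hypothetical helper, it only handles double-quoted `style` attributes on `td`/`th` tags, and a regex-based approach like this is not a robust HTML parser.

```python
import re

# Matches a `width: ...` declaration, but not `min-width`, `max-width` or `border-width`.
_WIDTH_DECL = re.compile(r"(?<![\w-])width\s*:[^;\"']*;?\s*", re.IGNORECASE)
# Matches opening <td ...> and <th ...> tags only (not <table>, <tbody>, <thead>, <tr>).
_CELL_TAG = re.compile(r"<t[dh]\b[^>]*>", re.IGNORECASE)
# Matches a double-quoted inline style attribute.
_STYLE_ATTR = re.compile(r'style\s*=\s*"([^"]*)"', re.IGNORECASE)


def strip_cell_widths(html):
    """Remove `width` declarations from the inline styles of table cells."""

    def fix_style(style_match):
        cleaned = _WIDTH_DECL.sub("", style_match.group(1)).strip()
        return f'style="{cleaned}"'

    def fix_tag(tag_match):
        return _STYLE_ATTR.sub(fix_style, tag_match.group(0))

    return _CELL_TAG.sub(fix_tag, html)


source = '<table style="width:500px"><tr><td style="width:3000px">a</td></tr></table>'
print(strip_cell_widths(source))
```

The cleaned string could then be handed to WeasyPrint as usual, for example with `HTML(string=cleaned).write_pdf(...)`. Note that removing the cell widths also removes whatever column proportions the user intended, so the result may not match the browser rendering exactly.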

andul avatar Sep 29 '23 05:09 andul

@andul The more I think about this problem, the more I feel that there’s something wrong in WeasyPrint: when we get the same result with different browsers, and when this result is different in WeasyPrint, it generally means that there’s something wrong.

There may be a reason why we get this result: in CSS 2, as explained above, the minimum width of the cell follows the width property if it’s greater than the minimum content width. But in level 3, it doesn’t depend on width:

The outer min-content width of a table-cell is max(min-width, min-content width) adjusted by the cell intrinsic offsets.

Browsers don’t follow what’s in CSS 2 for tables, but they are often closer to what’s in level 3. I can try to ignore the width property when calculating the min width of the cell, as proposed in level 3, and see if it gives the same result as in browsers.
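To make the difference concrete, here is a small sketch of the two cell rules side by side. The function names and the 60px/10px content widths are assumptions for illustration, "adjusted by the cell intrinsic offsets" is modelled as a plain addition (also an assumption), and none of this is WeasyPrint's actual implementation.

```python
def css2_min_cell_width(min_content, width=None):
    """CSS 2.1: a specified width larger than the content minimum wins."""
    if width is not None and width > min_content:
        return width
    return min_content


def level3_outer_min_content_width(min_content, min_width=0, offsets=0):
    """Level 3: max(min-width, min-content width) plus offsets; width is not used."""
    return max(min_width, min_content) + offsets


# First row of the example: (hypothetical min-content width, specified width).
cells = [(60, 3000), (10, 609)]
table_width = 500

css2_columns = [css2_min_cell_width(mc, w) for mc, w in cells]
level3_columns = [level3_outer_min_content_width(mc) for mc, _ in cells]

# The table is at least as wide as the sum of its column minimums.
css2_table = max(table_width, sum(css2_columns))
level3_table = max(table_width, sum(level3_columns))
print(css2_table, level3_table)
```

Under these assumptions the CSS 2 rule gives a 3609px table, while the level 3 rule keeps it at the specified 500px, which is what browsers show.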

liZe avatar Oct 02 '23 09:10 liZe

Hi @andul and @liZe, we are having the same problem in a similar scenario: HTML coming from the TinyMCE editor. Did you come up with a workaround?

minorum avatar Nov 23 '23 10:11 minorum

@minorum I don’t think that there’s a simple workaround for now.

liZe avatar Dec 17 '23 20:12 liZe

I would like to make the argument that WeasyPrint currently more closely follows CSS 2.1, whereas Firefox and Chrome implement the newer CSS Table Module Level 3. A clue lies in one of the changes that is relevant to this issue: CSS 2.1 does not consider min-width/max-width in the context of tables and essentially treats width as a minimum width. CSS Tables 3 changes that.

Let's take a look at the spec (https://drafts.csswg.org/css-tables-3/#computing-the-table-width). In our case the resolved table width is not auto, leading to:

The used min-width of a table is the greater of the resolved min-width, CAPMIN, and GRIDMIN.

This is what @liZe also quoted above. In our example we can omit the resolved min-width (our table doesn't have a min-width specified) and CAPMIN, which concerns captions (which our example doesn't have).

What's GRIDMIN then?

The row/column-grid width minimum (GRIDMIN) width is the sum of the min-content width of all the columns plus cell spacing or borders.

Leaving spacing and borders aside for now, this means that our table width is essentially max(resolved table width, sum of min-content width of the columns). Resolved table width in our case corresponds to the specified table width at 500px. min-content width of a column is specified here: https://drafts.csswg.org/css-tables-3/#computing-column-measures. In the basic case (based on cells with span 1) it roughly boils down to:

The largest of:

  • the width specified for the column:
    • the outer min-content width of the table-column
    • the outer min-content width of the table-column-group
  • the outer min-content width of each cell

Where the outer min-content widths are defined in https://drafts.csswg.org/css-tables-3/#computing-cell-measures.

Now this is important:

  • The outer min-content width of a table-cell is max(min-width, min-content width) adjusted by the cell intrinsic offsets.
  • The outer min-content width of a table-column or table-column-group is max(min-width, width).

This means that the width of a cell does not directly contribute to the column's actual width. This basically explains the browsers' behavior. min-width however does, as does width on a column. Let's try this out:

<table border="1" style="width: 500px">
  <colgroup>
    <col>
    <col>
    <col style="width: 700px">
    <col style="min-width: 700px">
  </colgroup>

  <tr>                          
    <td style="width: 700px">width</td>
    <td style="min-width: 700px">min-width</td>
    <td>col width</td>
    <td>col min-width</td>
  </tr>   
</table>

Browser-rendered: [screenshot]

The first two columns behave as expected, however I expected the third one to stretch according to the col width. Not sure what's up with that; I suspect that this is actually a bug in the spec - I opened an issue on the csswg repo about it (https://github.com/w3c/csswg-drafts/issues/9829).
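For reference, this is roughly what the quoted spec text predicts for the four-column test, written as a small Python sketch. It only encodes the rules as quoted above, not what browsers actually do; the helper names and the 50px content width per cell are assumptions, and spacing and borders are ignored.

```python
def cell_outer_min_content(min_content, min_width=0):
    """Table cell: max(min-width, min-content width); `width` plays no role."""
    return max(min_width, min_content)


def col_outer_min_content(width=0, min_width=0):
    """table-column / table-column-group: max(min-width, width)."""
    return max(min_width, width)


def column_min_content(cells, col=None):
    """Largest of the <col> measure (if any) and every cell's measure."""
    candidates = [cell_outer_min_content(**cell) for cell in cells]
    if col is not None:
        candidates.append(col_outer_min_content(**col))
    return max(candidates)


CONTENT = 50  # hypothetical min-content width of each cell's text

columns = [
    column_min_content([{"min_content": CONTENT}]),                      # td width: 700px
    column_min_content([{"min_content": CONTENT, "min_width": 700}]),    # td min-width: 700px
    column_min_content([{"min_content": CONTENT}], col={"width": 700}),  # col width: 700px
    column_min_content([{"min_content": CONTENT}], col={"min_width": 700}),  # col min-width
]

gridmin = sum(columns)
print(columns, max(500, gridmin))
```

By this reading only the first column stays at its content width, and the third column would be expected to stretch to 700px, which is the part that did not match the browser rendering.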

alexandergitter avatar Jan 22 '24 07:01 alexandergitter

Thanks a lot to everybody here 🙏🏽 and to @kygoh for the pull request!

liZe avatar Mar 06 '24 12:03 liZe