HTML table parser produces wrong column alignment in some cases.

Open tarleb opened this issue 2 years ago • 0 comments

The HTML reader appears to inspect only the first row of the table body when calculating a column's default alignment. Example:

pandoc -f html -t native
<table>
  <tr><td align="center">center</td></tr>
  <tr><td>default</td></tr>
</table>
^D
[ Table
    ( "" , [] , [] )
    (Caption Nothing [])
    [ ( AlignCenter , ColWidthDefault ) ]
    (TableHead ( "" , [] , [] ) [])
    [ TableBody
        ( "" , [] , [] )
        (RowHeadColumns 0)
        []
        [ Row
            ( "" , [] , [] )
            [ Cell
                ( "" , [] , [] )
                AlignCenter
                (RowSpan 1)
                (ColSpan 1)
                [ Plain [ Str "center" ] ]
            ]
        , Row
            ( "" , [] , [] )
            [ Cell
                ( "" , [] , [] )
                AlignDefault
                (RowSpan 1)
                (ColSpan 1)
                [ Plain [ Str "default" ] ]
            ]
        ]
    ]
    (TableFoot ( "" , [] , [] ) [])
]

The ColSpec should be ( AlignDefault , ColWidthDefault ) instead of ( AlignCenter , ColWidthDefault ), because pandoc writers fall back to the column-wide alignment if a cell has AlignDefault alignment. As a result, the second cell is also centered when the table is written back to HTML:

<table>
<tbody>
<tr class="odd">
<td style="text-align: center;">center</td>
</tr>
<tr class="even">
<td style="text-align: center;">default</td>
</tr>
</tbody>
</table>
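
The writer fallback described above can be illustrated with a small Python model. This is only a sketch, not pandoc's code: the `Alignment` enum mirrors the constructor names in the native output, `effective_alignment` encodes the fallback rule, and `column_alignment` is a hypothetical rule (an assumption, not something pandoc is known to implement) that assigns a column alignment only when every cell in the column agrees.

```python
from enum import Enum


class Alignment(Enum):
    # Names mirror the alignment constructors seen in the native output.
    LEFT = "AlignLeft"
    RIGHT = "AlignRight"
    CENTER = "AlignCenter"
    DEFAULT = "AlignDefault"


def effective_alignment(column, cell):
    """Writer fallback: a cell with AlignDefault inherits the column alignment."""
    return column if cell is Alignment.DEFAULT else cell


def column_alignment(cells):
    """Hypothetical rule (assumption): align the column only if all cells agree."""
    distinct = set(cells)
    return distinct.pop() if len(distinct) == 1 else Alignment.DEFAULT


# Cell alignments of the single column in the example table.
cells = [Alignment.CENTER, Alignment.DEFAULT]

# With the reported ColSpec (AlignCenter), the second cell inherits "center".
reported = [effective_alignment(Alignment.CENTER, c) for c in cells]

# With a ColSpec of AlignDefault, the second cell keeps its default alignment.
expected = [effective_alignment(column_alignment(cells), c) for c in cells]

print([a.value for a in reported])  # ['AlignCenter', 'AlignCenter']
print([a.value for a in expected])  # ['AlignCenter', 'AlignDefault']
```

Under this model, the reported ColSpec reproduces the HTML output above, while a ColSpec of AlignDefault leaves the second cell unaligned.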

tarleb, Jun 12 '22 09:06