
[Bug] `type="pipe"` does not work at all; `type="multiline"` does not honor `aligns="L"`

Open pdfkungfoo opened this issue 10 years ago • 8 comments

Consider the following Markdown:

$ cat csvtables.md | gsed "s#^#    #"


# Simple Tables

```` {.table caption="Does this result in a `simple_table`?" type="simple" aligns="RCRLCRC" header="yes"}
**Fruit(R)**, *Quantity(C)*, ***Price(R)***, **`Origin`(L)**, `Quality`(C), packed(R), sold-out?(C)
apples, 15,"3,24", Spain, excellent, ***yes***, yes 
oranges, 12,"2,22", Germany, **sour**, no, soon 
````

```` {.table caption="Does this result in a `simple_table`?" type="simple" aligns="RCRLCRC" header="no"}
**Fruit(R)**, *Quantity(C)*, ***Price(R)***, **`Origin`(L)**, `Quality`(C), packed(R), sold-out?(C)
apples, 15,"3,24", Spain, excellent, ***yes***, yes 
oranges, 12,"2,22", Germany, **sour**, no, soon 
````

# Grid Tables

```` {.table caption="Does this result in a `grid_table`?" type="grid" aligns="RCRLCRC" header="yes"}
**Fruit(R)**, *Quantity(C)*, ***Price(R)***, **`Origin`(L)**, `Quality`(C), packed(R), sold-out?(C)
apples, 15,"3,24", Spain, excellent, ***yes***, yes 
oranges, 12,"2,22", Germany, **sour**, no, soon 
````

```` {.table caption="Does this result in a `grid_table`?" type="grid" aligns="RCRLCRC" header="no"}
**Fruit(R)**, *Quantity(C)*, ***Price(R)***, **`Origin`(L)**, `Quality`(C), packed(R), sold-out?(C)
apples, 15,"3,24", Spain, excellent, ***yes***, yes 
oranges, 12,"2,22", Germany, **sour**, no, soon 
````

# Pipe Tables

```` {.table caption="Does this result in a `pipe_table`?" type="pipe" aligns="RCRLCRC" header="yes"}
**Fruit(R)**, *Quantity(C)*, ***Price(R)***, **`Origin`(L)**, `Quality`(C), packed(R), sold-out?(C)
apples, 15,"3,24", Spain, excellent, ***yes***, yes 
oranges, 12,"2,22", Germany, **sour**, no, soon 
````

```` {.table caption="Does this result in a `pipe_table`?" type="pipe" aligns="RCRLCRC" header="no"}
**Fruit(R)**, *Quantity(C)*, ***Price(R)***, **`Origin`(L)**, `Quality`(C), packed(R), sold-out?(C)
apples, 15,"3,24", Spain, excellent, ***yes***, yes 
oranges, 12,"2,22", Germany, **sour**, no, soon 
````

# Multiline Tables

```` {.table caption="Does this result in a `multiline_table`?" type="multiline" aligns="RCRLCRC" header="yes"}
**Fruit(R)**, *Quantity(C)*, ***Price(R)***, **`Origin`(L)**, `Quality`(C), packed(R), sold-out?(C)
apples, 15,"3,24", Spain, excellent, ***yes***, yes 
oranges, 12,"2,22", Germany, **sour**, no, soon 
````

```` {.table caption="Does this result in a `multiline_table`?" type="multiline" aligns="RCRLCRC" header="no"}
**Fruit(R)**, *Quantity(C)*, ***Price(R)***, **`Origin`(L)**, `Quality`(C), packed(R), sold-out?(C)
apples, 15,"3,24", Spain, excellent, ***yes***, yes 
oranges, 12,"2,22", Germany, **sour**, no, soon 
````

Running this Pandoc command, `pandoc -f markdown --filter=pandoc-csv2table -t markdown csvtables.md`, results in the following output:

Simple Tables
=============

    **Fruit(R)**  *Quantity(C)*    ***Price(R)***  **`Origin`(L)**   `Quality`(C)    packed(R)  sold-out?(C)
  -------------- --------------- ---------------- ----------------- -------------- ----------- --------------
          apples       15                    3,24       Spain         excellent      ***yes***      yes
         oranges       12                    2,22      Germany         **sour**             no      soon

  : Does this result in a `simple_table`?

  -------------- --------------- ---------------- ----------------- -------------- ----------- --------------
    **Fruit(R)**  *Quantity(C)*    ***Price(R)***  **`Origin`(L)**   `Quality`(C)    packed(R)  sold-out?(C)
          apples       15                    3,24       Spain         excellent      ***yes***      yes
         oranges       12                    2,22      Germany         **sour**             no      soon
  -------------- --------------- ---------------- ----------------- -------------- ----------- --------------

  : Does this result in a `simple_table`?

Grid Tables
===========

+-----------+------------+-------------+-------------+------------+----------+------------+
| **Fruit(R | *Quantity( | ***Price(R) | **`Origin`( | `Quality`( | packed(R | sold-out?( |
| )**       | C)*        | ***         | L)**        | C)         | )        | C)         |
+===========+============+=============+=============+============+==========+============+
| apples    | 15         | 3,24        | Spain       | excellent  | ***yes** | yes        |
|           |            |             |             |            | *        |            |
+-----------+------------+-------------+-------------+------------+----------+------------+
| oranges   | 12         | 2,22        | Germany     | **sour**   | no       | soon       |
+-----------+------------+-------------+-------------+------------+----------+------------+

: Does this result in a `grid_table`?

+-----------+------------+-------------+-------------+------------+----------+------------+
| **Fruit(R | *Quantity( | ***Price(R) | **`Origin`( | `Quality`( | packed(R | sold-out?( |
| )**       | C)*        | ***         | L)**        | C)         | )        | C)         |
+-----------+------------+-------------+-------------+------------+----------+------------+
| apples    | 15         | 3,24        | Spain       | excellent  | ***yes** | yes        |
|           |            |             |             |            | *        |            |
+-----------+------------+-------------+-------------+------------+----------+------------+
| oranges   | 12         | 2,22        | Germany     | **sour**   | no       | soon       |
+-----------+------------+-------------+-------------+------------+----------+------------+

: Does this result in a `grid_table`?

Pipe Tables
===========

    **Fruit(R)**  *Quantity(C)*    ***Price(R)*** **`Origin`(L)**    `Quality`(C)    packed(R)  sold-out?(C)
  -------------- --------------- ---------------- ----------------- -------------- ----------- --------------
          apples       15                    3,24 Spain               excellent      ***yes***      yes
         oranges       12                    2,22 Germany              **sour**             no      soon

  : Does this result in a `pipe_table`?

|---------------:|:----------------:|------------------:|:-------------------|:---------------:|-------------:|:---------------:|
| **Fruit(R)** | *Quantity(C)* | ***Price(R)*** | **`Origin`(L)** |
`Quality`(C) | packed(R) | sold-out?(C) | | apples | 15 | 3,24 | Spain |
excellent | ***yes*** | yes | | oranges | 12 | 2,22 | Germany | **sour**
| no | soon |

Table: Does this result in a `pipe_table`?

Multiline Tables
================

  ---------------------------------------------------------------------------
  **Fruit(R *Quantity( ***Price(R) **`Origin`(L `Quality`( packed(R sold-out?
        )**    C)*             ***     )**          C)            )    (C)
  --------- ---------- ----------- ------------ ---------- -------- ---------
     apples     15            3,24    Spain     excellent  ***yes**    yes
                                                                  * 

    oranges     12            2,22   Germany     **sour**        no   soon
  ---------------------------------------------------------------------------

  : Does this result in a `multiline_table`?

  --------- ---------- ----------- ------------ ---------- -------- ---------
  **Fruit(R *Quantity( ***Price(R) **`Origin`(L `Quality`( packed(R sold-out?
        )**    C)*             ***     )**          C)            )    (C)

     apples     15            3,24    Spain     excellent  ***yes**    yes
                                                                  * 

    oranges     12            2,22   Germany     **sour**        no   soon
  --------- ---------- ----------- ------------ ---------- -------- ---------

  : Does this result in a `multiline_table`?

Running this Pandoc command, `pandoc -f markdown --filter=pandoc-csv2table -o csvtables.pdf csvtables.md -V geometry:"margin=0.5cm, paperwidth=595pt, paperheight=35cm"`, results in this PDF (screenshot):

PDF output

So for the case of type="pipe", two things do not work:

  • With header="yes", a table is generated. However, it is not a pipe_table, but a simple_table.
  • With header="no", the output looks like there was indeed an attempt to create a pipe_table, but it is b0rken: the separator line comes first, and the header and body rows are run together as wrapped text.

For the case of type="multiline" one thing doesn't work as expected:

  • column no. 4 ("Origin(L)") is not left-aligned, but centered.
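
For comparison, here is a minimal sketch of the pipe_table one would expect for this CSV. It is not the filter's implementation: `render_pipe_table` and the `SEPARATORS` mapping are hypothetical names, and treating any letter other than L, R, C as "default alignment" is an assumption. Because Pandoc's pipe_table syntax cannot omit the header row, the header="no" case is emulated with a row of blank header cells.

```python
import csv
import io

# Separator cell for each alignment letter, in pipe_table syntax.
# "D" (default alignment) is an assumption made for this sketch.
SEPARATORS = {"L": ":---", "R": "---:", "C": ":---:", "D": "---"}


def render_pipe_table(csv_text, aligns, header=True):
    """Render CSV text as a pipe_table string (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    rows = [[cell.strip() for cell in row] for row in reader if row]
    if not rows:
        return ""
    ncols = len(rows[0])
    # Pad or truncate the alignment string to the number of columns.
    letters = aligns.upper().ljust(ncols, "D")[:ncols]
    # A pipe_table always needs a header row; use blank cells if there is none.
    head = rows[0] if header else [""] * ncols
    body = rows[1:] if header else rows
    lines = [
        "| " + " | ".join(head) + " |",
        "|" + "|".join(SEPARATORS.get(c, "---") for c in letters) + "|",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)
```

Called with the CSV from above and aligns="RCRLCRC", the fourth separator cell comes out as `:---`, i.e. left-aligned, and every table row stays on its own line.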

BTW, if I replace my above ```` {.table ....} fences (4 backticks, a blank before the curly brace) by ```{.table ....} fences (3 backticks, no blank before the curly brace), then the multiline_table output is missing the blank lines in between table rows. Though I'm currently not sure whether that syntax violates the spec or is "legal"...


Observations about the line lengths...

Also, the grid_table output looks a bit funny with the line breaks within all the header cells and some of the table body cells. It seems to be "legal", though.

But this wrapping is not required: the longest line of the Markdown output (the b0rken pipe_table separator line with the colons, for header="no") uses 130 characters, while the grid_table uses only 92.
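
Such line lengths can be checked with a few lines of Python. This is only a sketch: `longest_line` is a hypothetical helper, and it counts characters rather than display columns, which is the same thing for the ASCII-only output shown here.

```python
def longest_line(markdown_text):
    """Return (length, line) for the longest line, or (0, "") if there is none."""
    lines = markdown_text.splitlines()
    if not lines:
        return 0, ""
    widest = max(lines, key=len)
    return len(widest), widest
```

Feeding it the Markdown produced by Pandoc shows which line is the widest, which helps when picking a value for --columns.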

Adding --columns=110 to the Pandoc command line to produce Markdown tables, results in...

  • ...the following (better) grid_table output, which does not apply line-breaks within cells;
  • ...the following (worse) multiline_table output, which no longer contains the blank lines in between table rows. (However, this may mean that these blank lines are only really required _IF_ a "multiline" is indeed used for some cell. Since the 110-column width for the Markdown output grants enough space for all cell contents of the given table to fit on 1 line, they may be superfluous...)

Grid Tables
===========

+----------------+------------------+-------------------+--------------------+-----------------+--------------+-----------------+
| **Fruit(R)**   | *Quantity(C)*    | ***Price(R)***    | **`Origin`(L)**    | `Quality`(C)    | packed(R)    | sold-out?(C)    |
+================+==================+===================+====================+=================+==============+=================+
| apples         | 15               | 3,24              | Spain              | excellent       | ***yes***    | yes             |
+----------------+------------------+-------------------+--------------------+-----------------+--------------+-----------------+
| oranges        | 12               | 2,22              | Germany            | **sour**        | no           | soon            |
+----------------+------------------+-------------------+--------------------+-----------------+--------------+-----------------+

: Does this result in a `grid_table`?

+----------------+------------------+-------------------+--------------------+-----------------+--------------+-----------------+
| **Fruit(R)**   | *Quantity(C)*    | ***Price(R)***    | **`Origin`(L)**    | `Quality`(C)    | packed(R)    | sold-out?(C)    |
+----------------+------------------+-------------------+--------------------+-----------------+--------------+-----------------+
| apples         | 15               | 3,24              | Spain              | excellent       | ***yes***    | yes             |
+----------------+------------------+-------------------+--------------------+-----------------+--------------+-----------------+
| oranges        | 12               | 2,22              | Germany            | **sour**        | no           | soon            |
+----------------+------------------+-------------------+--------------------+-----------------+--------------+-----------------+

: Does this result in a `grid_table`?

[....]

Multiline Tables
================

    **Fruit(R)**  *Quantity(C)*      ***Price(R)***  **`Origin`(L)**    `Quality`(C)      packed(R)  sold-out?(C)
  -------------- ---------------- ----------------- ------------------ --------------- ------------ --------------
          apples        15                     3,24       Spain           excellent       ***yes***      yes
         oranges        12                     2,22      Germany          **sour**               no      soon

  : Does this result in a `multiline_table`?

  -------------- ---------------- ----------------- ------------------ --------------- ------------ --------------
    **Fruit(R)**  *Quantity(C)*      ***Price(R)***  **`Origin`(L)**    `Quality`(C)      packed(R)  sold-out?(C)
          apples        15                     3,24       Spain           excellent       ***yes***      yes
         oranges        12                     2,22      Germany          **sour**               no      soon
  -------------- ---------------- ----------------- ------------------ --------------- ------------ --------------

  : Does this result in a `multiline_table`?

pdfkungfoo avatar Jun 16 '15 11:06 pdfkungfoo

@Baig: I hope my bug reporting (even if it is resulting from a mistake on my part :-) is not too annoying for you. I only do this because I think your csv2table filter is one of the most useful and awesome external filter contributions to Pandoc! Thanks for that work.

I would even hope that this could sometime end up inside Pandoc proper, so one could get rid of calling the extra --filter=pandoc-csv2table command line parameter. :-)

pdfkungfoo avatar Jun 16 '15 12:06 pdfkungfoo

I am totally booked for the next three days, so I won't be able to look into it until later this weekend. Sorry for the wait.

> @Baig: I hope my bug reporting (even if it is resulting from a mistake on my part :-) is not too annoying for you.

Not at all.

> I only do this because I think your csv2table filter is one of the most useful and awesome external filter contributions to Pandoc!

I am glad that you found it useful.

> Thanks for that work.

My pleasure.

> I would even hope that this could sometime end up inside Pandoc proper, so one could get rid of calling the extra --filter=pandoc-csv2table command line parameter. :-)

Maybe it will. Wait and see.

baig avatar Jun 16 '15 12:06 baig

No problem. :-)

After all, you are working on this in your spare time, and giving it away under a FOSS license :-)

pdfkungfoo avatar Jun 16 '15 15:06 pdfkungfoo

First of all, thanks for the filter! Mostly, it's working for me. However, two points:

  1. Is there a reason you produce markdown instead of Pandoc native AST format? That way, we have to call pandoc twice, like `pandoc --filter pandoc-csv2table file.md | pandoc`. Or am I doing this wrong? Just `pandoc --filter pandoc-csv2table file.md` inserted the markdown into an html <p> instead of generating an html <table>. There should probably be a line about how to apply the filter in the usage section of the README.
  2. I'm also having trouble with the generated newlines with broad tables. When I use --no-wrap in the first call to pandoc, the whole table is on one line; when I don't use --no-wrap, there are (seemingly) random newlines inserted. Both make this an invalid grid table for me (pandoc 1.15.0.5), so the only way to get it working with CSV with long lines is to hardcode the needed table width with --columns, which is suboptimal to say the least. Again, this problem would be easy to sidestep if we could call pandoc only once, with --filter pandoc-csv2table, which would then insert the table as native pandoc AST, right?
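
To illustrate the second point, here is a minimal sketch of a grid table built directly from CSV with each column sized to its widest cell, so that no text wrapping is involved and no --columns value has to be guessed. It is an illustration only, not how pandoc-csv2table works internally; `render_grid_table` is a hypothetical name, and alignment and captions are left out.

```python
import csv
import io


def render_grid_table(csv_text, header=True):
    """Render CSV text as a grid table, one line per row (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    rows = [[cell.strip() for cell in row] for row in reader if row]
    if not rows:
        return ""
    ncols = max(len(row) for row in rows)
    # Pad short rows so every row has the same number of cells.
    rows = [row + [""] * (ncols - len(row)) for row in rows]
    # Each column is as wide as its widest cell, so nothing needs wrapping.
    widths = [max(len(row[i]) for row in rows) for i in range(ncols)]

    def rule(char):
        return "+" + "+".join(char * (w + 2) for w in widths) + "+"

    def line(row):
        cells = (" " + cell.ljust(w) + " " for cell, w in zip(row, widths))
        return "|" + "|".join(cells) + "|"

    out = [rule("-")]
    for index, row in enumerate(rows):
        out.append(line(row))
        # The header row is closed with "=" instead of "-".
        out.append(rule("=" if header and index == 0 else "-"))
    return "\n".join(out)
```

The table can still become very wide, but it remains a valid grid table because every row occupies exactly one line.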

mb21 avatar Jul 11 '15 15:07 mb21

On Sat, Jul 11, 2015 at 12:36 PM, mb21 wrote:

> First of all, thanks for the filter! Mostly, it's working for me. However, two points:
>
> 1. Is there a reason you produce markdown instead of Pandoc native AST format?

The filter ultimately produces native AST, though as an intermediate step it pipes the CSV contents through Pandoc's Markdown Reader.

> That way, we have to call pandoc twice, like `pandoc --filter pandoc-csv2table file.md | pandoc`. Or am I doing this wrong? Just `pandoc --filter pandoc-csv2table file.md` inserted the markdown into an html <p> instead of generating an html <table>.

When I run the above command I get everything wrapped in a <table>. I am using Pandoc 1.5.0.5.

> There should probably be a line about how to apply the filter in the usage section of the README.

I've updated the README on how to use this filter.

> 2. I'm also having trouble with the generated newlines with broad tables. When I use --no-wrap in the first call to pandoc, the whole table is on one line; when I don't use --no-wrap, there are (seemingly) random newlines inserted. Both make this an invalid grid table for me (pandoc 1.15.0.5), so the only way to get it working with CSV with long lines is to hardcode the needed table width with --columns, which is suboptimal to say the least. Again, this problem would be easy to sidestep if we could call pandoc only once, with --filter pandoc-csv2table, which would then insert the table as native pandoc AST, right?

It does produce native AST in the end. It would help if you could quote a specific example that didn't work for you, preferably as a separate ticket.


baig avatar Jul 11 '15 16:07 baig

@baig thanks for the quick reply. Now that I see how the filter is properly used, I could narrow my complaints/misunderstanding down to a bug: see #13.

However, it is then somewhat of a mystery to me how you enforce the different table types (grid, pipe, etc.): am I guessing correctly that you just play with the [Alignment] [Double] properties in the pandoc AST to force the markdown writer to choose the right table syntax?
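
For readers unfamiliar with these two properties: in the pandoc AST of that time, a table carries one `Alignment` value per column (`AlignLeft`, `AlignRight`, `AlignCenter` or `AlignDefault`) and one relative width per column as a `Double`. The sketch below only shows how an aligns string such as "RCRLCRC" could be translated into those two lists. Whether the filter does anything like this is not confirmed in this thread; `table_properties` is a hypothetical name, and the "D" letter and the all-zero default widths are assumptions.

```python
# Constructor names of pandoc's Alignment type, keyed by alignment letter.
# The letter "D" for the default alignment is an assumption of this sketch.
ALIGNMENT_NAMES = {
    "L": "AlignLeft",
    "R": "AlignRight",
    "C": "AlignCenter",
    "D": "AlignDefault",
}


def table_properties(aligns, ncols, widths=None):
    """Return (alignments, widths) lists for a table with ncols columns."""
    # Pad or truncate the alignment string to the number of columns.
    letters = aligns.upper().ljust(ncols, "D")[:ncols]
    alignments = [ALIGNMENT_NAMES.get(c, "AlignDefault") for c in letters]
    # Assumption: with no explicit widths, every column gets 0.0.
    col_widths = [0.0] * ncols if widths is None else list(widths)
    if len(col_widths) != ncols:
        raise ValueError("need exactly one width per column")
    return alignments, col_widths
```

For the example table, `table_properties("RCRLCRC", 7)` puts `AlignLeft` in the fourth position, which is the column that comes out centered in the multiline_table output reported above.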

mb21 avatar Jul 11 '15 18:07 mb21

@KurtPfeifle Sorry for the delay in addressing this ticket. I am finalizing my dissertation and can't spare much time right now. However, this issue and the other proposed enhancements are on my radar.

baig avatar Jul 13 '15 19:07 baig

@baig: No problem, take your time. Meanwhile, I wish you all the best for your dissertation efforts :-)

pdfkungfoo avatar Jul 13 '15 20:07 pdfkungfoo