Simpler table markup?

Open dereuromark opened this issue 1 month ago • 10 comments

Currently, tables always require an extra separator row after the header:

| Header 1 | Header 2 |
|----------|----------|
| Cell 1   | Cell 2   |

Otherwise, the header row is not rendered as a header.

What if this was simpler - more intuitive?

Simple Tables

|= Name     |= Age |= City     |
| Alice     | 28   | New York  |
| Bob       | 34   | London    |

|= marks header cells (from Creole). No separator row needed.
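To illustrate the idea, here is a minimal Python sketch of how a parser might detect header cells. The helper name `parse_row` is hypothetical and not part of any djot implementation; it assumes a cell counts as a header when `=` follows the pipe directly, and it ignores escaped pipes.

```python
def parse_row(line: str) -> list[tuple[str, bool]]:
    """Split one table row into (text, is_header) pairs.

    Assumption: a cell is a header when its raw content starts with '='
    immediately after the pipe, as in '|= Name'. Escaped pipes and inline
    markup are not handled in this sketch.
    """
    stripped = line.strip()
    if not (stripped.startswith("|") and stripped.endswith("|")):
        raise ValueError("not a table row")
    cells = []
    # Drop the outer pipes, then split on the inner ones.
    for raw in stripped[1:-1].split("|"):
        is_header = raw.startswith("=")
        text = raw[1:] if is_header else raw
        cells.append((text.strip(), is_header))
    return cells


print(parse_row("|= Name     |= Age |= City     |"))
print(parse_row("| Alice     | 28   | New York  |"))
```

Because the marker is per cell, no look-ahead to a following separator row is needed to decide whether a row contains headers.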

Alignment

|= Name     |= Age |= City      |
| Alice     |   28 | New York   |
| Bob       |   34 | London     |

Alignment is inferred from whitespace:

  • More space on left = right-aligned
  • More space on right = left-aligned
  • Equal space = centered
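The three rules above can be sketched in Python as follows. The function name `infer_alignment` and the `"default"` result for blank cells are assumptions made for illustration only.

```python
def infer_alignment(raw_cell: str) -> str:
    """Guess a cell's alignment from the padding around its content.

    `raw_cell` is the text between two pipes, padding included.
    Assumption: a cell with no content gets no alignment ("default").
    """
    if not raw_cell.strip(" "):
        return "default"
    left = len(raw_cell) - len(raw_cell.lstrip(" "))
    right = len(raw_cell) - len(raw_cell.rstrip(" "))
    if left > right:
        return "right"   # more space on the left
    if right > left:
        return "left"    # more space on the right
    return "center"      # equal space on both sides


for cell in [" Alice     ", "   28 ", " New York   "]:
    print(repr(cell), infer_alignment(cell))
```

Note that under these rules an ordinarily padded cell such as `| Cell |` has equal space on both sides and would count as centered, which may be worth spelling out in the proposal.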

Explicit alignment:

|=< Name  |=> Age |=~ City   |
| Left    | Right | Center   |

Column Spans

|= Name     |= Contact Info          ||
| Alice     | [email protected] | 555-1234 |

Empty cells at row end merge with previous.
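A rough Python sketch of this merge rule follows. It assumes that "empty" means no characters at all between two pipes, so a cell holding only a space stays a real empty cell. `merge_colspans` is a hypothetical helper, and header markers are left in the cell text for brevity.

```python
def merge_colspans(line: str) -> list[tuple[str, int]]:
    """Return (text, colspan) pairs for one table row.

    Assumption: a cell with nothing between its pipes extends the cell to
    its left by one column; a cell containing only spaces is kept as a
    normal empty cell.
    """
    raw_cells = line.strip()[1:-1].split("|")
    merged: list[tuple[str, int]] = []
    for raw in raw_cells:
        if raw == "" and merged:
            text, span = merged[-1]
            merged[-1] = (text, span + 1)   # widen the previous cell
        else:
            merged.append((raw.strip(), 1))
    return merged


print(merge_colspans("|= Name     |= Contact Info          ||"))
print(merge_colspans("| a | |"))
```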

Multi-line Cells

|= Feature |= Description                      |
| Complex  | This cell continues \             |
|          | on the next line.                 |
| Simple   | This is a single line.            |

Backslash at end continues cell. Leading | | continues previous cell.

Headerless Tables

| Cell | Cell |
| Cell | Cell |

No special syntax needed - absence of |= means no headers.

Questions

  • Is there any issue with this kind of approach?
  • Could this be supported as a shorthand variant alongside the current syntax, for backwards compatibility (BC)?

dereuromark avatar Nov 29 '25 23:11 dereuromark

Empty cells at row end merge with previous.

For the record, I do find interesting the approach of Typst's 3rd-party package tablem for column and row merging (with explicit |< etc.) over "empty cells" (which might be legit in some context).

Omikhleia avatar Nov 29 '25 23:11 Omikhleia

Note that "empty" means literally empty (not even a space). So as soon as you use a single space, it stays a real "empty cell" as before.

That said, the only interesting part for me right now is saving the extra row by using |=. The alignment is sugar, especially if we allow {.class} to be used also per table row/column, which I think another issue/PR wanted to address, and which would complete full attribute support once the open PR for list items ( https://github.com/jgm/djot/pull/262 ) is merged.

dereuromark avatar Nov 30 '25 01:11 dereuromark

The alignment is sugar, especially if we allow {.class}

Nit: Not all renderers are HTML + CSS enabled, and a class name is not the most obvious semantic way to mark alignment.

Omikhleia avatar Nov 30 '25 01:11 Omikhleia

I went ahead and implemented the basics in https://github.com/php-collective/djot-php/pull/8 for PHP. It shows that this can be added with full BC for the old syntax.

I don't want to go ahead without @jgm approving it at this point, though, as I think it is important to stay in sync with the main feature scope.

dereuromark avatar Nov 30 '25 02:11 dereuromark

Forgot to mention:

Backwards Compatibility

The traditional separator row syntax continues to work unchanged. If both are defined, the header markers align only the header cells themselves, and the separator (meta) row defines the alignment for the cells below:

|= Name  |=< Age |                    // none + left
|:------|----:|                       // left + right
| Alice | 28  |                       // left + right
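As a sketch of how the two alignment sources could be read side by side, assuming the marker characters proposed in this thread (`<`, `>`, `~`) and the existing colon convention in separator rows. Both function names are hypothetical, and the `//` annotations from the example are not part of the input.

```python
HEADER_MARKS = {"<": "left", ">": "right", "~": "center"}


def header_alignments(line: str) -> list[str]:
    """Alignment of each header cell, taken from its own '|=<' style marker."""
    aligns = []
    for raw in line.strip()[1:-1].split("|"):
        # The character right after '=' is the optional alignment mark.
        mark = raw[1:2] if raw.startswith("=") else ""
        aligns.append(HEADER_MARKS.get(mark, "default"))
    return aligns


def separator_alignments(line: str) -> list[str]:
    """Alignment of each body column, taken from a row like '|:---|---:|'."""
    aligns = []
    for raw in line.strip()[1:-1].split("|"):
        cell = raw.strip()
        left, right = cell.startswith(":"), cell.endswith(":")
        if left and right:
            aligns.append("center")
        elif left:
            aligns.append("left")
        elif right:
            aligns.append("right")
        else:
            aligns.append("default")
    return aligns


print(header_alignments("|= Name  |=< Age |"))   # applies to the header cells only
print(separator_alignments("|:------|----:|"))   # applies to the cells below
```

Keeping the two lookups separate mirrors the rule stated above: the header never inherits from the separator row, and the body never inherits from the header markers.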

@jgm What's your take on this? Would the proposed syntax make sense to simplify djot here? To stay in sync with the rule of one clear way: would we be able to deprecate the old verbose style in favor of the shorthand?

dereuromark avatar Dec 03 '25 00:12 dereuromark

Currently, tables always require an extra separator row after the header:

While not precisely defined nor supported by the playground, the existing syntax could allow multi-line headers (= everything before the separator row)... If some syntax (as in the tablem example mentioned above) makes it for column spanning/merging at least, this would make even more sense. I'd be interested in such an option, rather than a redefinition of an alternate syntax that doesn't seem (at a glance) it would allow it easily...

Omikhleia avatar Dec 03 '25 01:12 Omikhleia

While not precisely defined nor supported by the playground, the existing syntax could allow multi-line headers (= everything before the separator row)...

Sure, that wouldn't collide. Those are mainly just rendered as th instead of td ;)

If some syntax (as in the tablem example mentioned above) makes it for column spanning/merging at least, this would make even more sense. I'd be interested in such an option

That would be another feature/topic that can be discussed separately IMO; it does not conflict with the improved syntax of removing the meta row (which is quite annoying to type manually).

dereuromark avatar Dec 03 '25 01:12 dereuromark

It's interesting. Easier-to-type isn't a desideratum here, and it loses points for looking a bit less like a table. But in some ways it is simpler not to have to deal with the separator row.

Colspans and rowspans are important, but that's a separate issue from the issue about marking up the header and indicating column alignments.

jgm avatar Dec 03 '25 08:12 jgm

Maybe not a desideratum, but following your own words:

What if we tried to create a light markup syntax that keeps what is good about Markdown, while revising some of the features

Making this simpler for everyday users is a worthy goal. Descriptiveness now meets simplicity in setup and writing.

Right, the col/row span seems to be complementary indeed. We cannot use tablem here, but the proposed syntax above works fine:

Colspan (header spans multiple data columns):

  |= Name |= Price ||
  | Apples | Regular | $1.50 |
  | Oranges | Sale | $1.00 |

→ "Price" header spans over "Regular/$1.50" and "Sale/$1.00" columns

Rowspan (cell spans multiple rows):

  |= Category |= Item |= Price |
  | Fruits | Apple | $1.50 |
  |^ | Banana | $0.50 |
  |^ | Cherry | $2.00 |

→ "Fruits" spans 3 rows

Should be combinable with header as well.
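A small Python sketch of how the `^` marker could be resolved into rowspans. It assumes the rows are already split into cells with header markers stripped, and that a `^` with no cell above it is kept as literal text. `apply_rowspans` is a hypothetical name.

```python
def apply_rowspans(rows: list[list[str]]) -> list[list[tuple[str, int]]]:
    """Resolve '^' cells into rowspans on the cell above.

    Returns rows of (text, rowspan). Cells absorbed by a span are dropped
    from their row, as they would be in HTML output.
    """
    owners: dict[int, list] = {}   # column index -> cell currently covering it
    out: list[list[list]] = []
    for row in rows:
        out_row = []
        for col, raw in enumerate(row):
            text = raw.strip()
            if text == "^" and col in owners:
                owners[col][1] += 1          # extend the cell above by one row
            else:
                cell = [text, 1]
                owners[col] = cell
                out_row.append(cell)
        out.append(out_row)
    return [[(text, span) for text, span in row] for row in out]


rows = [
    ["Fruits", "Apple", "$1.50"],
    ["^", "Banana", "$0.50"],
    ["^", "Cherry", "$2.00"],
]
for row in apply_rowspans(rows):
    print(row)
```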

  Complete Marker Reference

  | Marker | Meaning                      | Example         |
  |--------|------------------------------|-----------------|
  | |=     | Header cell                  | |= Name |       |
  | |=<    | Left-aligned header          | |=< Name |      |
  | |=>    | Right-aligned header         | |=> Price |     |
  | |=~    | Center-aligned header        | |=~ Title |     |
  | |=^    | Header rowspan (from above)  | |=^ |           |
  | ||     | Colspan (empty = merge left) | |= Spans two || |
  | |^     | Data rowspan (from above)    | |^ |            |

  Multi-Row Header Example

  |=~ Company Report 2024 |||||
  |= Region |= Q1 ||= Q2 ||
  |=^ |= Units |= Rev |= Units |= Rev |
  | North | 100 | $500 | 150 | $750 |
  | South | 80 | $400 | 90 | $450 |

  ┌────────────────────────────────────┐
  │        Company Report 2024         │
  ├────────┬─────────────┬─────────────┤
  │ Region │     Q1      │     Q2      │
  │        ├───────┬─────┼───────┬─────┤
  │        │ Units │ Rev │ Units │ Rev │
  ├────────┼───────┼─────┼───────┼─────┤
  │ North  │  100  │$500 │  150  │$750 │
  │ South  │   80  │$400 │   90  │$450 │
  └────────┴───────┴─────┴───────┴─────┘

Or Header Rowspan=3 (complex):

  |= Year |= 2024 |||= 2025 |||
  |=^ |= Q1 |= Q2 |= Q3 |= Q1 |= Q2 |= Q3 |
  |=^ |= Jan-Mar |= Apr-Jun |= Jul-Sep |= Jan-Mar |= Apr-Jun |= Jul-Sep |
  | Sales | 100 | 150 | 200 | 120 | 180 | 220 |

  ┌───────┬─────────────────────────┬─────────────────────────┐
  │       │          2024           │          2025           │
  │       ├─────────┬───────┬───────┼─────────┬───────┬───────┤
  │ Year  │   Q1    │  Q2   │  Q3   │   Q1    │  Q2   │  Q3   │
  │       ├─────────┼───────┼───────┼─────────┼───────┼───────┤
  │       │ Jan-Mar │Apr-Jun│Jul-Sep│ Jan-Mar │Apr-Jun│Jul-Sep│
  ├───────┼─────────┼───────┼───────┼─────────┼───────┼───────┤
  │ Sales │   100   │  150  │  200  │   120   │  180  │  220  │
  └───────┴─────────┴───────┴───────┴─────────┴───────┴───────┘

If we can align on this, I could whip up a POC for it in my sandbox.

dereuromark avatar Dec 03 '25 13:12 dereuromark

Let's compare:

New |= syntax:

  |= Year |= 2024 |||= 2025 |||
  |=^ |= Q1 |= Q2 |= Q3 |= Q1 |= Q2 |= Q3 |
  |=^ |= Jan-Mar |= Apr-Jun |= Jul-Sep |= Jan-Mar |= Apr-Jun |= Jul-Sep |
  | Sales | 100 | 150 | 200 | 120 | 180 | 220 |

Traditional separator row syntax:

  | Year | 2024 ||| 2025 |||
  |^ | Q1 | Q2 | Q3 | Q1 | Q2 | Q3 |
  |^ | Jan-Mar | Apr-Jun | Jul-Sep | Jan-Mar | Apr-Jun | Jul-Sep |
  |------|---------|--------|--------|---------|--------|--------|
  | Sales | 100 | 150 | 200 | 120 | 180 | 220 |

Separator seems to be a bit cleaner:

  1. No need to repeat |= on every header cell
  2. Everything above separator = header, below = data
  3. Less visual noise

But separator row has downsides:

  1. Can't have header cells in data rows (e.g., row headers)
  2. Separator must match column count (tricky with colspan)
  3. Only one header section possible

Hybrid approach might be best in some multi-header-row cases:

  • Separator row defines "all cells above are headers"
  • |= used only for header cells in data rows (row headers)
  • Spanning works with both

  | Year | 2024 ||| 2025 |||
  |^ | Q1 | Q2 | Q3 | Q1 | Q2 | Q3 |
  |^ | Jan-Mar | Apr-Jun | Jul-Sep | Jan-Mar | Apr-Jun | Jul-Sep |
  |------|---------|--------|--------|---------|--------|--------|
  |= Widgets | 100 | 150 | 200 | 120 | 180 | 220 |
  |= Gadgets | 80 | 90 | 100 | 110 | 120 | 130 |

→ "Widgets" and "Gadgets" become <th> row headers, rest are <td>.
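A simplified Python sketch of the hybrid rule, for illustration only: rows above a separator row become header rows, and `|=` cells below it become row headers. Spans and alignment are left out, and `render_table` and the regular expression are assumptions rather than the behavior of any existing djot implementation.

```python
import html
import re

# A separator row consists only of cells like '---', ':---', '---:' or ':---:'.
SEPARATOR = re.compile(r"^\|(\s*:?-+:?\s*\|)+$")


def render_table(lines: list[str]) -> str:
    """Render table rows to HTML, choosing <th> or <td> per cell."""
    sep_index = next(
        (i for i, ln in enumerate(lines) if SEPARATOR.match(ln.strip())), None
    )
    out = ["<table>"]
    for i, line in enumerate(lines):
        if i == sep_index:
            continue   # the separator row itself produces no output
        above_separator = sep_index is not None and i < sep_index
        out.append("<tr>")
        for raw in line.strip()[1:-1].split("|"):
            marked = raw.startswith("=")
            text = html.escape((raw[1:] if marked else raw).strip())
            tag = "th" if (marked or above_separator) else "td"
            out.append(f"<{tag}>{text}</{tag}>")
        out.append("</tr>")
    out.append("</table>")
    return "\n".join(out)


print(render_table([
    "| Item | Q1 |",
    "|------|----|",
    "|= Widgets | 100 |",
]))
```

Without a separator row the same function falls back to the pure `|=` behavior, which is one way both approaches could coexist.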

What do you think - should both approaches coexist?

dereuromark avatar Dec 03 '25 13:12 dereuromark

I finished my proof-of-concept PR ( https://github.com/php-collective/djot-php/pull/8 ). Check it out.

I think if this were approved, it would make table usage intuitive: simple yet powerful. cc @jgm

I left out the multi-line approach for now, since it probably adds another dimension of complexity. Separate RFC: https://github.com/php-collective/djot-php/issues/26

dereuromark avatar Dec 05 '25 23:12 dereuromark