request: syntax for centering text

Open uvtc opened this issue 12 years ago • 52 comments

Now that Pandoc supports line blocks, here's some syntax that might work nicely for left-, center-, and right-justifying lines:

Hi.

| left-justified
| preserves leading space
| last line

|< also left-justified
|< does not preserve leading space
|< last line

|> right-justified
|> another line
|> last line

|<> centered
|<> another line
|<> last line

Bye.

Thoughts?

uvtc avatar Jan 22 '13 14:01 uvtc

Better:

| left-justified
| preserves leading space
| last line

|< also left-justified
|< ignores any leading space
|< last line

|>                             right-justified
|>                       ignores leading space
|>                                   last line

|<>               centered
|<>       also ignores leading space
|<>               last line
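
As a rough illustration of how the markers above could be told apart, here is a minimal Python sketch. It is not pandoc's parser; the `MARKERS` table and `parse_line` function are hypothetical names, and the rule that the plain `| ` form drops exactly one separator space is an assumption.

```python
# Hypothetical marker table for the proposed syntax: (marker, alignment,
# ignore_leading_space). Longest marker first, so "|<>" is not read as "|<".
MARKERS = [
    ("|<>", "center", True),
    ("|<", "left", True),
    ("|>", "right", True),
    ("|", "left", False),
]


def parse_line(line):
    """Return (alignment, text) for one line-block line, or None when the
    line does not start with a pipe marker."""
    for marker, align, ignore_leading in MARKERS:
        if line.startswith(marker):
            rest = line[len(marker):]
            if ignore_leading:
                # Justified forms: any amount of leading space is ignored.
                text = rest.lstrip(" ")
            else:
                # Plain form: drop only the single separator space (assumed),
                # so deeper indentation is preserved.
                text = rest[1:] if rest.startswith(" ") else rest
            return align, text.rstrip()
    return None
```

With this rule, `|<>centered`, `|<> centered` and `|<>        centered` all parse to the same result, so the source can be laid out to resemble the output without changing its meaning.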

uvtc avatar Jan 22 '13 16:01 uvtc

Reasons why I like this syntax:

  • consistent with how Pandoc already marks blocks in general (such as blockquote and indented code blocks: by using a prefix on each line)
  • nicely fits in with and extends how Pandoc now marks lineblocks
  • consistent with how line blocks are handled (all line block forms would not wrap lines)
  • shouldn't break any existing docs, since line blocks were only just added
  • looks like what it does, looks good, easy to remember
  • syntax is unlikely to appear accidentally where not intended

uvtc avatar Jan 23 '13 16:01 uvtc

  • by ignoring leading spaces, allows writers to make centered and right-aligned lines look in the source the way they will look in the output

uvtc avatar Jan 24 '13 16:01 uvtc

Background

Intermixing HTML tags (or TeX macros) with Markdown to center objects (text, images, tables) is a format-dependent solution to the centering text problem. It should be possible to write a single source file in Markdown and generate all formats that publishers require and Pandoc can export, without resorting to format-specific markup. The publishing industry, academics, technical writers, poets, and bloggers use centering extensively. Indeed, centered text alignment dates back 500 years, if not earlier.

Centering is not an arbitrary formatting feature, but integral to professionally typeset manuscripts. Elements centered in a proper manuscript format include:

  • Story Title
  • Author byline
  • Word count
  • Scene separators (e.g., # or * * *)
  • Story ending (e.g., THE END)

Additionally, screenplay cover pages are entirely centered.

Markdown should standardize text justification (left, right, center) before competing, and perhaps conflicting, syntaxes flourish.

Precedent

Centered text exists in various Markdown flavours:

  • http://www.pell.portland.or.us/~orc/Code/discount/
  • https://simplpost.com/markdown.html
  • http://tedwise.com/markdown/
  • http://stackoverflow.com/a/15370539/59087

And is actively employed by various web sites:

  • http://cl.ly/NpvK
  • https://help.madmimi.com/how-do-i-center-or-justify-my-text/
  • https://ca.godaddy.com/help/centering-or-justifying-text-15824

Syntax

This section describes several syntax mechanisms for text justification.

Justified Pipes

This justified pipe syntax builds on Pandoc's existing block syntax.

|<> Centered
|< Left-justified
|> Right-justified

Mirrored Arrows

The table colon syntax can be applied to enhance the arrows with left/right justification:

-> Centered <-
->: Left Justified <-
-> Right Justified :<-

This is compatible with countless existing documents that already use centering.

Mirrored Angle Brackets

A simpler alternative to the mirrored arrows is used by screenplay writing software:

> Centered <
>: Left Justified <
> Right Justified :<

Arrow Block

A single token to denote centering:

-> Centered

Double-colon Prefix Block

A regular block of text is centered using a prefix:

center::
Centered

Ini-style Prefix Block

A combination of Windows .ini file syntax and block environments:

[center]
| A paragraph that needs to be centered
| This is similar to blockquote, but with `|` as the
| starting character rather than `>`. The name of the block
| is at the beginning of the block.

This could be extended:

[align:center]
| This block is centered.
[align:left]
| This block is left-justified.
[align:right]
| This block is right-justified.

Cat Whiskers

A single token to denote centering:

>< Centered

Attribute Block

Use CSS-style attributes:

==== {.center #center_demo}
Centered
====

Arguments Against

This section describes the arguments against including a centered Markdown syntax.

Non-Descriptive

> Centering is a presentational instruction, and Markdown should separate semantics from presentation

This is the strongest argument against its inclusion, but still falls a bit short. Consider:

| Tables   |      Are      |  Cool |
|----------|:-------------:|------:|
| col 1 is |  left-aligned | $1600 |
| col 2 is |    centered   |   $12 |
| col 3 is | right-aligned |    $1 |

The Markdown strongly suggests the content is tabularized data and should be presented as such. How the table looks (cell padding, spacing, background colours, bold headers, borders, etc.) is definitely outside the scope of Markdown. However, everyone who types up such ASCII tables is aware that a single comma-delimited paragraph will not be generated (illegible to humans and impossible to machine read, generically). Some tables can be presented in other forms, such as pie charts, but those are likely exceptions.

By the same token, _ ... _ suggests underlining, / ... / strongly suggests italics, and ** ... ** often ends up bold, all of which are indisputably presentation. Additionally, # suggests a header, but how it finally appears is presentation (e.g., font size, face, weight, and colour; page breaks; underlines; numbering; and, yes, alignment).

No semantic meaning is attached to / ... / and ** ... **; whereas, # and ## are meaningful, semantically. Therefore we cannot argue against including centered text because it lacks semantic meaning. Semantic meanings are inapplicable to centering because it applies to many unrelated text fragments: story titles, author names, pen names, publisher names, bylines, copyright notices, warnings, poetry, word counts, endings, etc.

To say that Markdown completely separates content from presentation is slightly disingenuous because source code blocks and monospaced fonts are practically inseparable.

Adding a hint to center text can be considered a separation of content from presentation. Text marked down as centered doesn't have to be presented as centered any more than a code block must be presented in a monospace font or _ ... _ must be rendered using underlining. Centered text could be presented as a marquee, blinking (please no), underlined, embossed, offset in a box, or any number of ways that are applicable to text marked with ** ... **; it could even be centered on the page, which, like tables being rendered as tables, would be the most likely outcome.

Feature Complete

> Markdown, by its very nature, can never be feature complete.

By this reasoning, lambda expressions should not be added to existing programming languages. Nobody has proposed hard limits on what Markdown is meant to accomplish. Markdown has a stated purpose: to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid X/HTML. Yet, as Pandoc has shown, Markdown can also be a simple, human-readable, filetype-agnostic structured text format.

Slippery Slope

> Agree that centering is useful, but completely disagree that it belongs in Markdown. It's a slippery slope once you open the door to any formatting features. This is why Markdown allows HTML tags.

First, the slippery slope is a logical fallacy. Second, mixing HTML tags and Markdown eliminates the possibility of a single, pure Markdown source to automatically generate a wide variety of output formats.

Arguments For

This section describes the arguments for including a centered Markdown syntax.

Precedent

A number of precedents currently exist in the wild for a centering syntax. While this is argumentum ad populum, it certainly speaks to a practical need in the real world.

Competing Syntax

Developing a standard in Markdown itself should help prevent competing standards from emerging.

Historic Typography

Centering has been in use since before the printing press. Although this falls under argumentum ad antiquitatem, again, it speaks to the practical need to center text.

Single Source

There is no way to write a single, pure Markdown file and generate centered text in a wide variety of output formats (from XHTML to ConTeXt) to meet various industry needs. Intermingling format-specific tags or macros defeats the purpose of a single source.

Professionalism

Professionally typeset manuscripts and screenplays require certain aspects to be centered.

Usage Barrier

Most people aren't programmers; there are people who don't have the capacity or desire to even learn how to mark up text. (This is why Microsoft Word and other such WYSIWYG word processing software is so popular and widely used.) Forcing people to embed fragments of indescribably foreign code to do something as simple as center (or justify) text builds a high barrier to using Markdown.

Microsoft's Stranglehold

Currently, Microsoft Word dominates manuscript submission requirements (Analog, Lightspeed, Mothership, Fantasy Scroll, Flash Fiction, New Accelerator, Tor, Asimov, and many others strongly prefer or outright demand .doc or .docx manuscript submission formats). This will not change until a viable, simpler standard is created.

Related

Most of the content in this comment is a summary of the following threads:

  • https://groups.google.com/forum/#%21searchin/pandoc-discuss/center/pandoc-discuss/CFHQvVqsTBs/Fq6_Z-MBxNIJ
  • https://groups.google.com/forum/#!topic/pandoc-discuss/bC33efYjtzQ
  • https://groups.google.com/forum/#!topic/pandoc-discuss/uOB_Da6sEbc

ghost avatar Dec 06 '15 22:12 ghost

I would also have noted that super- and subscripts, which are formatting rather than semantics, already exist in pandoc and in various Markdown flavors.

Jmuccigr avatar Dec 09 '15 14:12 Jmuccigr

People will want to write their centered text in the center, like this:

                              -> some centered text <-

|<>                              some centered text

rather than

-> some centered text <-

|<> some centered text

because it just looks better that way.

But with the mirrored arrows syntax, how can you tell if it's meant to be centered text or verbatim text?

Another aspect of |<> I like is that since that marker is placed at column 0, it's easier to see if I've entered two or one newlines after (or before) typing in the centered text content (presuming I'm writing it centered in the page, as described above).

uvtc avatar Dec 09 '15 15:12 uvtc

> People will want to write their centered text in the center, like this:

I prefer the |<> syntax to the mirrored arrows, so long as the following produce equivalent results:

|<>Centered Text
|<> Centered Text
|<>                     Centered Text

> How can you tell if [mirrored arrows are] meant to be centered text or verbatim text?

Good point. Also, some text editors might confuse <- with HTML comments (<!--), thus making a mess of syntax highlighting.

What are the next steps? Vote for what syntax to use, get agreement from the core developers, develop a prototype integration, or something else?

ghost avatar Dec 10 '15 00:12 ghost

People like doing complex formatting, where markdown cannot

linquize avatar Dec 10 '15 10:12 linquize

> People like doing complex formatting

What "people"? That presupposes a certain type of people who want to use Markdown. Name them. Here are classes of people whose writing or industry could benefit from the simplicity of Markdown formatting:

  • Technical writers
  • Publishers
  • Novelists
  • Biographers
  • Academics
  • Bloggers
  • Poets (?)

Here are classes of people whose work sometimes requires complex formatting, which makes Markdown unsuitable:

  • Recipe authors
  • Journalists
  • Children's authors
  • Screenplay writers (?)
  • Web masters

> where markdown cannot

Nobody has drawn a line in the sand that states, "Markdown shall not do this." Markdown has no scope. Rather, subjective terms like "simple" and "human readable" attempt to rein in its extent. This will cause endless debate whenever additions are proposed.

Pluto was considered a planet until the word "planet" was given a formal definition. Markdown suffers in the same way: it lacks formal definition. That "Markdown cannot do this" doesn't mean "Markdown shall not do this."

Also... You say, "It can't." I say, "It can." That's not a debate with any technical merit or hints of rational thought: it's a pointless religious debate that cannot be resolved in any meaningful way.

ghost avatar Dec 10 '15 21:12 ghost

> What are the next steps? Vote for what syntax to use, ... {snip}

There's no voting, per se. From what I've seen, feature requests may be left open for a while to give folks time to mull them over, discuss, and possibly create patches. Of course, it's easy to add features but difficult to remove them (once folks are using them). Also, some requested features would be extremely useful, but might make the pandoc-markdown format too noisy, or clash with some other existing usage, or break backcompat, or not work with all (enough?) output formats.

> Nobody has drawn a line in the sand that states, "Markdown shall not do this." Markdown has no scope.

IMO, pandoc-markdown balances somewhere between capturing common usage (what you'd normally/naturally write in a plain text email or an online comment), and providing enough features to help you get away from Word/LaTeX/HTML. While looking good doing it. ;) A difficult task.

uvtc avatar Dec 10 '15 22:12 uvtc

What is the current status for this? Markdown centering is something I would really like to see.

It is very important for scientific content when you want to center figures for example!

choucavalier avatar Sep 17 '16 21:09 choucavalier

@tgy, as far as I remember, images accept attributes, so you can use attributes to center images.

I think it is way more important (and more useful) that paragraphs accept attributes than that they can be aligned.

ousia avatar Sep 18 '16 08:09 ousia

I just found this issue, while looking for solutions to my problems. @DaveJarvis Thank You for a thorough essay on the rationale. I hope this feature will get implemented at some stage.

v1kn avatar Jan 06 '17 00:01 v1kn

Probably this should be tagged "AST Change".

ickc avatar Jan 06 '17 01:01 ickc

@ickc It's quite a while since the issue was opened. Do You reckon there's a possibility the Devs will work on it?

v1kn avatar Jan 06 '17 01:01 v1kn

@v1kn,

I'm not one of the core developers, so I can't tell. That said, I guess the priority will be low, considering that any issue at the "AST Change" level is complicated and there are already many of them (in addition to the complexity involved, I would guess that a lot of thought is given before any AST change is made, to get it right the first time). Just to illustrate the level of complexity: the reader/writer pair for each format needs to be updated and, depending on exactly what is involved, the tools built around pandoc need to be updated too: pandoc-templates, pandoc-citeproc, any "filter framework" that parses the pandoc AST into its own, e.g. panflute, etc.

However, we can bend it so that it doesn't require an AST change, e.g. by just using a Div with class "center".

Cons:

  • uglier, generic syntax (but Div might soon receive a markdown syntax, see #168, which also means that even if we settle for a Div syntax for now, a markdown-ish syntax like those suggested above could still be implemented in the future.)

  • such a class is generic enough that existing documents might already use it, so such a change might accidentally break someone else's documents when they update pandoc.

Pros:

  • no AST change, much easier to implement, and, judging from the history of similar commits, it can start with some common, easier-to-implement output formats (e.g. LaTeX and HTML-related outputs), i.e. not all formats have to be implemented at the same time.

  • If one accepts this route of using a Div with a class rather than a tailored syntax, then the feature can be implemented through a filter, completely bypassing pandoc. It means you can start doing so right now, rather than waiting on a long discussion and approval process. Note that the filter serves as a proof of concept too, so it doesn't eliminate the possibility of pandoc supporting it natively.
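
The filter route can be sketched with nothing but pandoc's JSON filter interface. The following Python script is a minimal proof of concept, not an official implementation: the function name `center_divs` and the file name `center.py` are made up for illustration, it only looks at top-level blocks and at Divs nested inside other Divs, and it only acts when the target format is LaTeX.

```python
import json
import sys


def center_divs(blocks):
    """Replace every Div carrying the class "center" with its contents
    wrapped in a raw LaTeX center environment. Only Divs are descended
    into; block quotes, lists and tables are left alone in this sketch."""
    out = []
    for block in blocks:
        if block.get("t") == "Div":
            (ident, classes, attrs), inner = block["c"]
            inner = center_divs(inner)
            if "center" in classes:
                out.append({"t": "RawBlock", "c": ["latex", "\\begin{center}"]})
                out.extend(inner)
                out.append({"t": "RawBlock", "c": ["latex", "\\end{center}"]})
                continue
            block = {"t": "Div", "c": [[ident, classes, attrs], inner]}
        out.append(block)
    return out


def main():
    # pandoc passes the target format as the first argument to JSON filters.
    target = sys.argv[1] if len(sys.argv) > 1 else ""
    raw = sys.stdin.read()
    if not raw.strip():
        return  # nothing was piped in
    doc = json.loads(raw)
    if target == "latex":
        doc["blocks"] = center_divs(doc["blocks"])
    json.dump(doc, sys.stdout)


if __name__ == "__main__":
    main()
```

Assuming the script is saved as center.py and made executable, something like `pandoc input.md --filter ./center.py -t latex` should run it. For other output formats the document passes through unchanged, so an HTML build can still style the class with CSS.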

ickc avatar Jan 06 '17 02:01 ickc

@ickc Thank You for a detailed explanation. Yes, I can see now that this is no easy task with a fast solution.

I will try some workarounds using templates for the time being, but I will be sure to keep an eye on both this issue and #168.

v1kn avatar Jan 06 '17 02:01 v1kn

We might implement this at some point, but I'd advise using filters for now. There are other AST changes, like colspans in tables, that are higher priority.

jgm avatar Jan 06 '17 11:01 jgm

If issue #168 is resolved, then this issue can be closed, as an additional syntax for centering would be superfluous:

; --- div {.centered}
; Lorem ipsum dolor sit amet, consectetur adipiscing
; elit, sed do eiusmod tempor incididunt ut labore et
; dolore magna aliqua. Ut enim ad minim veniam.

ghost avatar Jan 07 '17 04:01 ghost

> If issue #168 is resolved, then this issue can be closed, as an additional syntax for centering would be superfluous

Not necessarily. It depends on whether one accepts the suggestion of using a generic Div and a class for centering, or wants a dedicated syntax for centering, similar to LineBlock (the original suggestions draw a comparison to LineBlock, and consider centering/LineBlock to be special cases of justification).

In terms of the AST, while one could imagine LineBlock syntax being parsed as a Div with a certain class, pandoc actually assigns a special element to it, which by the way cannot carry attributes at the moment. So to solve the centering problem in terms of the AST, one can try at least these methods:

  1. A dedicated centering element (how about right-justified then?)
  2. A Div with a class to indicate that it is centered
  3. Overload the LineBlock element, and let it carry attributes to indicate whether it is left-justified (the old LineBlock), centered, or right-justified.

So when I suggest using a Div, it is the 2nd approach. But the 3rd approach could just as well be a viable option, especially if the centering syntax will be similar to LineBlock's.

So there're 2 related considerations:

  1. philosophically, does one want to generalize LineBlock to include centering/right-justification? (i.e. how one categorizes it.)

  2. syntactically, does one want to have a dedicated syntax for centering/right-justification? And if so, does one want to make them similar to LineBlocks?

If the answers are no to both, then using a Div with classes is the simplest solution, and is a solution that can happen right now.

Lastly, I have a related question: what if one wants to center/right-justify other kinds of elements, say Headings? If this is desirable, then using a class immediately generalizes to any pandoc element that can carry attributes, while overloading LineBlock will limit it to just LineBlock-like texts.

ickc avatar Jan 07 '17 05:01 ickc

> if one accepts the suggestion of using a generic Div and a class for centering

Yes, this is the reason I included an example block, to show the syntax that would obviate this feature request:

; --- div {.centered}
...

> what if one wants to center/right-justify other kinds of elements, say Headings

Headings and other elements can be styled without a specific class attribute because they can be uniquely referenced. This applies equally to HTML and TeX code. It's even possible to style the same heading level in different ways:

# First Heading
Paragraph
# Second Heading
Paragraph

Using CSS3, center the second heading as follows:

h1:nth-of-type(2) {
    text-align: center;
}

ghost avatar Jan 07 '17 20:01 ghost

@DaveJarvis,

; --- div {.centered}
...

compared to @uvtc's for example,

|<>               centered
|<>       also ignores leading space
|<>               last line

The latter definitely fits the philosophy better:

> A Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions.

I remember @jgm mentioning somewhere (probably in CommonMark, when commenting on Microsoft's attempt at another tool similar to pandoc/MultiMarkdown) that using English words to describe what something is is worse than using a syntax that shows it (e.g. for non-English documents). While we are talking about a class here (not just some syntax using an English word), to someone reading it as plain text it is the same, i.e. all they read is an English word.

Just to make it clear, I am not opposed to either approach (that's why I suggested using a Div instead earlier). I'm just pointing out that others might have other considerations. So we need to wait to see whether they have any other opinions before closing it.

And even if one settles on using classes for centering, should one do it with Div, or LineBlock (LineBlock cannot carry attributes right now, but it seems that it will eventually be supported)? That's the "philosophical" part I mentioned above.

ickc avatar Jan 07 '17 20:01 ickc

> LineBlock cannot carry attributes right now, but it seems that it will eventually be supported?

I understand, thank you for clarifying.

> Readability, however, is emphasized above all else. A Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions.

That is an issue. Perhaps @uvtc, @jgm, or @jgruber might want to chime in with their thoughts on this potential divergence from Markdown philosophy.

ghost avatar Jan 07 '17 20:01 ghost

+++ ickc [Jan 07 17 12:26 ]:

> I remember @jgm mentioning somewhere (probably in CommonMark, when commenting on Microsoft's attempt at another tool similar to pandoc/MultiMarkdown) that using English words to describe what something is is worse than using a syntax that shows it (e.g. for non-English documents). While we are talking about a class here (not just some syntax using an English word), to someone reading it as plain text it is the same, i.e. all they read is an English word.

My main objection to the use of English words is that not everyone writes in English.

One nice thing about Markdown as opposed to, say, LaTeX, is that if you're writing in (say) Swedish, your document isn't littered with English words. It's ALL in Swedish, with some punctuation. People have sometimes commented on how nice this is.

jgm avatar Jan 07 '17 23:01 jgm

; --- {.poesi}
...
; --- {.poem}
...

How the "poem" is presented (centered, right-justified, etc.) is independent of the Markdown. Any language can denote that the paragraph is a stanza. That is to say, no formatting instructions are present, though {.poem} could be considered a tag, which would go against the philosophy.

ghost avatar Jan 08 '17 00:01 ghost

I came across this when looking for line block related issues to check whether another issue I have had been reported. I have not read everything here (it's a lot!) but I thought: now that we have a nice div syntax, what about making a class like .center (and perhaps also .centre, to prevent problems due to human error) on a div magical, so that writers output the right format-dependent markup around it where possible?

In LaTeX that would presumably be the center environment, or a minipage with centering at the top (see here for the difference!)

I think that in HTML it should perhaps be up to the user to implement/link the necessary CSS based on the class rather than polluting the default template with more in-document CSS. Remember that it's hard or impossible to override in-document CSS, whether in the header or in a style attribute! The usually wanted CSS .center { text-align: center; } could be mentioned in the documentation of the feature.

bpj avatar Mar 12 '18 22:03 bpj

A workaround is to use a table with no header and a single centred column.

|     |
|:---:|
| This
| text
| is
| centered
This
text
is
centered
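
The table workaround is mechanical enough to generate. Here is a small Python sketch with a hypothetical helper name, `centered_table`, that builds such a pipe table from a list of lines. It assumes pandoc's pipe-table conventions (a blank header cell to simulate a headerless table, `:---:` for a centered column) and does not handle lines that themselves contain a literal `|`.

```python
def centered_table(lines):
    """Build a one-column pipe table whose only column is centered.
    The header cell is left blank to simulate a headerless table."""
    width = max(3, max((len(line) for line in lines), default=0))
    header = "| " + " " * width + " |"
    rule = "|:" + "-" * width + ":|"
    # Padding the cells is cosmetic; the colons in the rule do the centering.
    rows = ["| " + line.center(width) + " |" for line in lines]
    return "\n".join([header, rule] + rows)


print(centered_table(["This", "text", "is", "centered"]))
```

The padding only makes the source look like the output; the renderer takes the alignment from the separator row.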

sjackman avatar Jan 15 '19 19:01 sjackman

Any updates on this? (how to center text on LaTeX via pandoc)

I wonder if these fenced_divs can be used somehow: https://github.com/jgm/pandoc/issues/4037 Something like:

::: center
             test
:::

Or perhaps this pandoc-latex-environment, to somehow convert a div class "center" to \begin{center} ... in LaTeX: https://github.com/chdemko/pandoc-latex-environment/wiki

---
pandoc-latex-environment:
  center: [classcenter]
---
<div class="classcenter">content</div>

in LaTeX ->

\begin{center}
content
\end{center}

Sorry, I don't know much of the internals of pandoc (yet), and couldn't make any of them work... can someone give a clue on the latest advances here?

igormcoelho avatar Feb 28 '19 20:02 igormcoelho

@igormcoelho, you can create a simple Lua script to convert custom blocks to centred text. Here's an example that creates a custom inline image based on a fenced div:

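-- Applied to the contents of a matching Div: replaces each Image with raw TeX.
local lines_to_blocks = {
  Image = function( el )
    return {
      pandoc.RawInline( "tex", "\\inlineexternalfigure[" .. el.src .. "]" )
    }
  end
}

function Div( el )
  -- Find the first class on the Div that starts with "image-".
  local kls, _ = el.classes:find_if(
    function ( c )
      return string.match( c, "^image%-" )
    end
  )

  -- Only rewrite Divs carrying such a class; returning nothing leaves
  -- other Divs unchanged.
  if kls then
    return pandoc.walk_block( el, lines_to_blocks )
  end
end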

See also @jgm's https://github.com/jgm/pandoc/issues/2106#issuecomment-371508848.

Save the file as centre.lua then call pandoc using:

pandoc --lua-filter=centre.lua

This then parses the following block:

::: image-inline
![](../climate/graph)
:::

To produce:

\inlineexternalfigure[../climate/graph]

Feel free to use this example to produce centred LaTeX code. Note that RawInline may have to be changed to RawBlock.

ghost avatar Mar 01 '19 00:03 ghost

Thanks @DaveJarvis for the fast and detailed response. I can try to adapt this code, but my need involves centering an equation block on LaTeX, not an image... My block is specifically:

(@obj) $maximize \sum_{i \in X} p_{i}$

In fact, I could use \begin{equation} ... here, and it would be centered in LaTeX but totally unreadable as markdown, and I want to do this in a way that is compatible with markdown too, because we have both users editing the document. Anyway, I'll try to change your script to RawBlock and learn a little bit of Lua :)

igormcoelho avatar Mar 01 '19 16:03 igormcoelho