
Support Pandoc preprocessing / enhance MD support

Open savchenko opened this issue 5 years ago • 9 comments

- snip -

savchenko avatar Jul 02 '20 02:07 savchenko

Hi Andrew,

This project actually uses this package for generating HTML from markdown. It does indeed support the original markdown spec, rather than GFM. However, I've just enabled some more extensions and removed some elements from the HTML sanitisation operation. Namely, these are now working:

  • footnotes
  • hr
  • mark
  • sup
  • the complete list: https://github.com/sirodoht/mataroa/blob/9ddce2e3d1e23c80c72fdd6363bc21a54d910eaf/main/helpers.py#L129-L199
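For readers who want to try this locally, here is a minimal sketch of turning on an extension, assuming the package in question is Python-Markdown (`markdown` on PyPI). The sample text and the `render` helper are made up for illustration, only `footnotes` is enabled, and the HTML sanitisation step mentioned above is left out; the real list is in the linked `helpers.py`.

```python
import markdown

# Hypothetical sample input: one footnote reference and one horizontal rule.
SAMPLE = """A claim that needs a source.[^1]

---

[^1]: The source for the claim.
"""


def render(text: str) -> str:
    # "footnotes" ships with Python-Markdown. The exact extension list that
    # mataroa enables lives in the linked helpers.py and is not reproduced here.
    return markdown.markdown(text, extensions=["footnotes"])


html = render(SAMPLE)
print(html)
```

Without the extension, the `[^1]` marker would simply stay in the output as literal text.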

sirodoht avatar Jul 06 '20 23:07 sirodoht

Looks much better!

However, this doesn't solve typography/typesetting. E.g. -- isn't converted to — and text isn't aligned horizontally.

Example of well-aligned text: [image: 394px-Oscar_wilde_english_renaissance_of_art_2]

savchenko avatar Jul 07 '20 12:07 savchenko

Hmm, is -- converting to an em dash part of GFM or another markdown spec? You can always use the literal em dash character — inside your markdown text.

As for well-aligned text, are you referring to justified alignment? As in flush both left and right? I am not a fan of that style because of the irregular spacing it creates between words.

sirodoht avatar Jul 07 '20 18:07 sirodoht

Regarding the first point: https://pandoc.org/MANUAL.html#typography

On the second one, let's open pretty much any printed book and look inside... Looks rather justified and well-balanced, right? People have spent a significant amount of time figuring out how to help fellow humans read text. I just humbly suggest taking advantage of that pre-existing knowledge.

savchenko avatar Jul 08 '20 12:07 savchenko

For typography, my authority is Practical Typography by Matthew Butterick: https://practicaltypography.com/

So, on multiple-hyphens-as-dashes the book's decree is that it's a bad typewriter habit (see number 3 here: https://practicaltypography.com/typewriter-habits.html)

On justified text, what the book says is that it's a matter of preference rather than an indication of quality. I agree, justified text might look better, but I really don't like it when the spaces between the words become too coarse. And the reason is this:

> Keep in mind that the justification engine of a word processor or web browser is rudimentary compared to that of a professional page-layout program. So if I’m making a word-processor document or web page, I’ll always left-align the text, because justification can look clunky and coarse. Whereas if I’m using a professional layout program, I might justify.

From https://practicaltypography.com/justified-text.html

sirodoht avatar Aug 30 '20 12:08 sirodoht

I am not aware of Mr. Butterick, but rumour has it that Jan Tschichold's "The New Typography" still holds.

> So, on multiple-hyphens-as-dashes the book's decree is that it's a bad typewriter habit

Precisely. This is why -- should be converted to — and hyphens shall be contextually "upgraded" to en/em dash where appropriate.

savchenko avatar Aug 30 '20 13:08 savchenko

Regarding the implementation, you might consider Python MD extensions. For example, Smarty might be a good start.
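A minimal sketch of what enabling Smarty looks like, assuming Python-Markdown with its bundled `smarty` extension; the `render_smart` helper and the sample sentence are hypothetical. Note that Smarty follows the SmartyPants convention, where `--` becomes an en dash and `---` becomes an em dash, and by default it emits HTML entities rather than the literal characters.

```python
import markdown


def render_smart(text: str) -> str:
    # "smarty" is bundled with Python-Markdown. Enabling it here is only an
    # illustration, not a statement about how mataroa is configured.
    return markdown.markdown(text, extensions=["smarty"])


SAMPLE = "pages 3--5 --- roughly"

plain = markdown.markdown(SAMPLE)  # no extensions: hyphens are left alone
smart = render_smart(SAMPLE)       # smarty: hyphen runs become dash entities

print(plain)
print(smart)
```

The extension also rewrites straight quotes and `...` unless those options are switched off through its configuration.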

savchenko avatar Sep 04 '20 11:09 savchenko

> Precisely. This is why -- should be converted to — and hyphens shall be contextually "upgraded" to en/em dash where appropriate.

My problem with that is that sometimes people might want to write two or three literal hyphens in a row, -- or ---. I don't want to upgrade those when the author didn't intend a dash. I have experienced that and it was annoying.

Typing the literal em and en dashes (here's a how-to) seems to allow the user more typographic control.

sirodoht avatar Jun 15 '21 15:06 sirodoht

Thanks for pointing to the markdown package; the [TOC] marker generates a table of contents automatically.
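A small sketch of the `[TOC]` marker, assuming Python-Markdown with its bundled `toc` extension enabled; the sample document is made up for illustration.

```python
import markdown

# Hypothetical document: the [TOC] marker is replaced by a nested list of
# links to the headings that follow it.
DOC = """[TOC]

# First section

## A subsection
"""

html = markdown.markdown(DOC, extensions=["toc"])
print(html)
```

The extension also adds an `id` attribute to each heading so that the generated links have something to point at.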

ahxxm avatar May 30 '22 12:05 ahxxm