Consider support for HTML attribute list syntax

Open dwgill opened this issue 6 months ago • 2 comments

In the thread https://github.com/redimp/otterwiki/issues/215, @redimp mentioned the following:

I have never read anywhere about the markdown extension that is mentioned in the GeeksforGeeks article. The article also does not mention which markdown parsers support this feature; I guess at least their very own parser? Some Stack Overflow postings mention that kramdown supports it, but I cannot find it in their docs: https://kramdown.gettalong.org/syntax.html

For whatever it's worth, this is a very common syntax in certain communities of Markdown users, and I'd like to explain what benefits it could offer and propose its eventual support in Otterwiki. I would also be happy to help with implementing it if that moves the needle in considering its adoption.

What is this syntax, where does it come from, and where is it implemented?

I'm most familiar with it from Pandoc, where the syntax is used extensively and ends up being an invaluable tool for letting authors slightly adjust the formatting of EPUB/HTML output from their Markdown source without resorting to inline HTML. The basic gist is that you can attach a list of attributes to the end of certain Markdown elements, and those attributes are then incorporated into the output HTML. The most common are CSS classes and IDs, which are given a shorthand syntax.

For example, the following attaches an ID and a CSS class to the corresponding <p> element:

Input:

This is a paragraph.
{ #an_id .a_class }

Output:

<p id="an_id" class="a_class">This is a paragraph.</p>

This syntax can also be applied to inline elements. The following sets some CSS classes on a link in addition to applying the value "_blank" to the target attribute. The target attribute demonstrates the generic key="value" syntax for specifying any arbitrary HTML attribute.

Input:

[link](http://example.com){ .foo .bar target="_blank" }

Output:

<a class="foo bar" href="http://example.com" target="_blank">link</a>
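
To make the shorthand concrete, here is a minimal sketch in Python of how such an attribute list could be mapped to HTML attributes. The function name `parse_attr_list` and the regular expression are my own illustrative assumptions; this is not code from Otterwiki, Pandoc, or Python Markdown, and it ignores edge cases that those implementations handle.

```python
import re

# One token of an attribute list: "#id", ".class", or key="value" / key=value.
# (Hypothetical grammar for illustration; real parsers accept more forms.)
_TOKEN = re.compile(
    r'''
    \#(?P<id>[\w-]+)       # shorthand for the id attribute
    | \.(?P<cls>[\w-]+)    # shorthand for one CSS class
    | (?P<key>[\w-]+)=(?:"(?P<quoted>[^"]*)"|(?P<bare>[^\s"]+))
    ''',
    re.VERBOSE,
)


def parse_attr_list(text):
    """Turn '{ #an_id .a_class target="_blank" }' into a dict of HTML attributes."""
    body = text.strip()
    if body.startswith("{") and body.endswith("}"):
        body = body[1:-1]
    attrs = {}
    classes = []
    for match in _TOKEN.finditer(body):
        if match.group("id"):
            attrs["id"] = match.group("id")
        elif match.group("cls"):
            classes.append(match.group("cls"))
        else:
            # Generic key=value attribute, quoted or bare.
            value = match.group("quoted")
            if value is None:
                value = match.group("bare")
            attrs[match.group("key")] = value
    if classes:
        # Multiple class shorthands are merged into one class attribute.
        attrs["class"] = " ".join(classes)
    return attrs
```

Calling `parse_attr_list('{ .foo .bar target="_blank" }')` yields a dict with a `class` entry of `foo bar` and a `target` entry of `_blank`, which a renderer could then write onto the element it is attached to.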

Pandoc seems to have inherited this syntax from PHP Markdown Extra.

As far as Python implementations go, this syntax is supported in the Python Markdown package as part of a larger set of extensions that imitate PHP Markdown Extra specifically. Python Markdown itself attributes the genesis of this syntax to this proposal here.

Whether Otterwiki should migrate to that package is its own conversation, as I haven't confirmed whether it supports all existing Markdown extensions currently used by Otterwiki. But I would be happy to dig into a larger exploration regarding this feature's implementation and the future of Otterwiki's Markdown support if you're interested.

Why should support for this syntax be considered for Otterwiki?

Speaking as a heavy Pandoc user, I would offer that authors of epubs have some needs in common with users of wikis. In particular, I've noticed a phenomenon like the following:

  • The vast majority of content is perfectly well captured with simple plain text.

  • Most of what remains in terms of formatting and layout is captured by standard Markdown syntax: _italics_, **bold**, # Headings, [links](https:...) and so on. This covers the common denominator of usage.

  • That said, I find that a substantial number of users will still have various formatting and layout needs that are

    1. unhandled by the common Markdown syntax,
    2. but occur frequently enough in their own documents to make it frustrating to resort to HTML each time,
    3. and yet are idiosyncratic requirements: while some specific minority of users is actively frustrated by needing to resort to inline HTML for some favored bit of formatting, the majority of users will likely never even notice its absence.

    Good examples of this might include underlining text, changing the font, changing the text color, in-lining images, floating images to the left or right, and changing the text alignment for a particular block (such as when a user wishes to include a paragraph in some RTL language in their larger LTR-language page).

Hopefully this illustrates the point I'm making, but to put it in other words: you're likely to keep getting requests for supporting dedicated syntax for increasingly niche formatting use cases—requests you can legitimately decline because each feature is irrelevant to most users, but at the same time a daily necessity for a different vocal minority. After all, people don't typically request special syntax for formatting they don't consider core to their writing experience.

I would propose that the chief benefit of HTML attribute lists is that they give users a generic, universal syntax for supporting an open-ended variety of layouts and formatting.

With this proposed syntax, users who have some niche formatting requirements can add some custom CSS styling to their wiki and from then on just append a class to the right elements as needed. For example, a user who regularly wants to embed RTL-language text in their wiki can add some custom CSS and from then on just append { .rtl } at the end of the relevant paragraphs. At the same time, users who want a particular link to open in a new tab can just append { target="_blank" } and go on with their day. Both of these cases are a bit clunky in and of themselves, but they are still an improvement over the alternative of embedding raw HTML, and neither requires bespoke syntax uniquely tailored to that particular scenario.

In effect, this represents a single syntax extension that allows users with divergent and niche formatting needs to just go implement their own solutions with a little CSS, and without having to e.g. fork the Otterwiki source.

Images present an especially beneficial use case

So far as I know, Otterwiki does not have first-class support for doing anything special with the presentation or layout of images, with the exception of the thumbnail query param (which is admittedly quite useful as far as it goes).

I propose that this HTML attributes list syntax could offer a pretty simple way of giving users some control over the presentation and positioning of images without getting too crazy with regards to syntax. For example, if I want to float an image to the right such that it renders in-line and the text flows around the image, I might do the following:

Input:

![alt text](./path-to-my-image){ .float-right }

Output:

<img alt="alt text" class="float-right" src="./path-to-my-image" />

If instead I wanted to center the image horizontally:

Input:

![alt text](./path-to-my-image){ .center }

Output:

<img alt="alt text" class="center" src="./path-to-my-image" />

Obviously float-right and center are just arbitrary CSS classes, but we could offer some default styling rules and basic classes for images for this purpose. Or not—we could just leave the option for users to do themselves. As said before, I think the great benefit of supporting this syntax is that it potentially addresses a lot of divergent formatting needs with a single pretty straightforward syntax.
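
As a rough illustration of the image case, the following Python sketch rewrites an image followed by an attribute list into an `<img>` tag. The function name `render_images` and both regular expressions are hypothetical, only the class shorthand is handled, and this is not how Otterwiki or mistune currently work; a real implementation would hook into the parser rather than use a regex pass.

```python
import html
import re

# Matches ![alt](src) with an optional trailing { ... } attribute list.
_IMAGE = re.compile(
    r'!\[(?P<alt>[^\]]*)\]\((?P<src>[^)\s]+)\)(?:\{(?P<attrs>[^}]*)\})?'
)
# Matches the ".class" shorthand at the start of a whitespace-separated token.
_CLASS = re.compile(r'(?<!\S)\.([\w-]+)')


def render_images(markdown_text):
    """Replace Markdown images (with optional class shorthands) by <img> tags."""
    def replace(match):
        classes = _CLASS.findall(match.group("attrs") or "")
        parts = ['alt="%s"' % html.escape(match.group("alt"), quote=True)]
        if classes:
            parts.append('class="%s"' % html.escape(" ".join(classes), quote=True))
        parts.append('src="%s"' % html.escape(match.group("src"), quote=True))
        return "<img %s />" % " ".join(parts)

    return _IMAGE.sub(replace, markdown_text)
```

With a rule such as `.float-right { float: right; }` in the wiki's custom CSS, the class emitted by this sketch would be enough to make text flow around the image.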

But I do think image presentation and position is likely to be the most common daily use case that would benefit most immediately from this. Images at present are the biggest pain point for me with Otterwiki: they render as clunky block elements, and do not flow neatly inline with surrounding text as average users expect from e.g. Wikipedia. Obviously today I can just input an <img /> tag and do the styling myself, but as someone who hopes to eventually have his nontechnical family members able to contribute to the shared wiki, I place a premium on staying as close to Markdown as possible and avoiding inline HTML as much as feasible. I believe this syntax accomplishes that goal.

Anyway, that's just my perspective—thanks for taking the time to read and consider it. Happy to take any questions regarding this and, if you're interested, I would also be happy to help with implementing it.

dwgill, Jun 23 '25 15:06

Hey @dwgill, thanks for the effort of explaining the kramdown features and demoing the examples. I understand how this could be very useful as an alternative to writing HTML directly for special formatting.

Unfortunately, I'm afraid that massively changing or extending the parser doesn't fit into my schedule at the moment. Please don't let that stop you from developing a proof of concept. I would love to see a fork with it; please ping me in that case, so that I don't miss anything you invest time in.

[..] Otterwiki does not have first-class support for doing anything special with the presentation or layout images [..]

That's correct and a feature gap that I would like to fill.

[..] Images at present are the biggest pain point for me with Otterwiki [..]

I can relate. Currently I'm looking into developing an extension of the syntax with the working title "embeddings". These should, for example, add an image frame with a caption and an automatic thumbnail. They could also be used for embedding video attachments or tables (rendered from e.g. a .csv attachment). Please see this as an additional feature, not as a deflection of your request.

redimp, Jun 23 '25 17:06

@redimp appreciate the response! I definitely understand having a heavy schedule. I'll try to take a look at some implementations of this over the next few weeks. Are you committed to mistune, or would you be open to considering alternative markdown engines? For example, I suspect an actual AST parser would be helpful for tackling issues like https://github.com/redimp/otterwiki/issues/228

dwgill, Jun 24 '25 13:06