blackfriday

Specifying different characters for opening and closing smart quotes

Open saadatm opened this issue 7 years ago • 4 comments

I was wondering if there could be a way to specify desired characters (or strings) for opening and closing smart quotes when HTML_USE_SMARTYPANTS is enabled.

This would come in handy for right-to-left (RTL) languages where ” should be the opening double quote and “ should be the closing one. (Similarly, ’ and ‘ will be the opening and closing single quotes.)

Besides RTL languages, this could also be useful for a variety of other languages or scripts.

saadatm avatar Feb 20 '18 09:02 saadatm

From the API point of view, it makes perfect sense.

I have little confidence in my understanding of how RTL languages are processed. Am I right that the caller would swap the quotes around? Meaning that a sequence of bytes foo "bar" baz, with angle quotes applied, would be rendered like this, respectively:

LTR              RTL
foo <bar> baz    zab <rab> oof

This leads to a concern regarding practicality. This request implies the calling code knows the language of the source document. This means that the code (or at least some sort of config for that code) will be custom-tailored to the content. Wouldn't this be better left to styling?

rtfb avatar Mar 04 '18 11:03 rtfb

I am suggesting that some sort of configuration options be introduced, which let the user override the default opening and closing smart quotes (something like what is being done in Python-Markdown). This way, Blackfriday doesn't need to know the language of the content and can just use the characters (or strings) that may be supplied by the user for smart quotes substitutions.

(Sidenote: The example of angle quotes in an RTL context is actually interesting, because they fall under the category of "mirrored characters" --- which means that « and » can be used as opening and closing quotes respectively in both LTR and RTL contexts, and the rendering engine takes care of using the glyphs appropriate for the text direction. The "regular" smart quotes, “...”, cannot be mirrored and need to be swapped explicitly in an RTL context, hence this feature request. As I said previously, however, this will be useful for other LTR quote combinations too, such as „...” or „...“.)
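
To make the suggestion concrete, here is a minimal sketch of what such an override could look like. It is only an illustration: the types `QuotePair` and `SmartQuotes` and the method `Entity` are hypothetical names and do not exist in Blackfriday.

```go
package main

import "fmt"

// QuotePair and SmartQuotes are hypothetical option types for illustration;
// Blackfriday does not expose anything like them.
type QuotePair struct {
	Open, Close string
}

type SmartQuotes struct {
	Double, Single QuotePair
}

var (
	// The usual English-style defaults.
	English = SmartQuotes{
		Double: QuotePair{"&ldquo;", "&rdquo;"},
		Single: QuotePair{"&lsquo;", "&rsquo;"},
	}
	// The same characters with open and close swapped, as requested for RTL text.
	RTLSwapped = SmartQuotes{
		Double: QuotePair{"&rdquo;", "&ldquo;"},
		Single: QuotePair{"&rsquo;", "&lsquo;"},
	}
)

// Entity returns the caller-supplied replacement for a straight quote.
// quote is either '"' or '\''; isOpen says whether the quote opens a span.
func (q SmartQuotes) Entity(quote byte, isOpen bool) string {
	pair := q.Double
	if quote == '\'' {
		pair = q.Single
	}
	if isOpen {
		return pair.Open
	}
	return pair.Close
}

func main() {
	fmt.Println(English.Entity('"', true), English.Entity('"', false))
	fmt.Println(RTLSwapped.Entity('"', true), RTLSwapped.Entity('"', false))
}
```

With something along these lines, the renderer only looks up the strings it was given and never has to know which language or text direction they belong to.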

saadatm avatar Mar 05 '18 17:03 saadatm

Disclaimer: This is the first time I have worked with Go, so this is mostly based on trial and error.

The code in its current state seems to assume the processed text is English, even though there is an option to activate French guillemets (i.e. « xyz »).

Each quote entity is composed of 'l' for an opening quote or 'r' for a closing quote, followed by a language-specific character (or 's' for single quotes); finally, 'quo;' is appended to the output string.

https://github.com/russross/blackfriday/blob/11635eb403ff09dbc3a6b5a007ab5ab09151c229/smartypants.go#L106-L112
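
Going by that description, the entity name could be assembled roughly as below. This is a sketch of the scheme, not a copy of the linked source: the function name `quoteEntity` is made up, and the kind characters 'd' (double) and 'a' (angle) are my assumption about what the "language-specific character" would be.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteEntity builds an HTML entity following the scheme described above:
// '&', then 'l' (opening) or 'r' (closing), then a kind character, then "quo;".
// The kind values 's', 'd' and 'a' are assumptions for illustration.
func quoteEntity(kind byte, isOpen bool) string {
	var out strings.Builder
	out.WriteByte('&')
	if isOpen {
		out.WriteByte('l')
	} else {
		out.WriteByte('r')
	}
	out.WriteByte(kind)
	out.WriteString("quo;")
	return out.String()
}

func main() {
	fmt.Println(quoteEntity('d', true), quoteEntity('d', false)) // &ldquo; &rdquo;
	fmt.Println(quoteEntity('s', true), quoteEntity('a', false)) // &lsquo; &raquo;
}
```

Because the first letter is always 'l' or 'r', a scheme like this cannot produce entities such as `&bdquo;` or `&sbquo;`.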

This works fine for languages whose quotes fit that pattern, but needs several changes for languages that use different quotes. German, for example (and the reason I noticed this in the first place), uses "&bdquo; xyz &ldquo;". Also, single quotes in German look like this: "&sbquo; xyz &lsquo;".

There are also some assumptions about the text when single quotes are used: https://github.com/russross/blackfriday/blob/11635eb403ff09dbc3a6b5a007ab5ab09151c229/smartypants.go#L135-L148

This always outputs "&rsquo;" for contractions like "we'll". This is probably a good idea for many languages since this mostly occurs when quoting English text (also, having a flag to disable this kind of processing would probably be trivial).
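
As a rough idea of what such a flag might look like, here is a simplified stand-in. It is not Blackfriday's actual logic: the function `replaceContractions` and its `enabled` parameter are hypothetical, and the rule used (an apostrophe with a letter on both sides) is a deliberately naive approximation.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// replaceContractions is a hypothetical helper, not part of Blackfriday.
// When enabled, an apostrophe with a letter on both sides becomes "&rsquo;";
// when disabled, the text is returned unchanged.
func replaceContractions(text string, enabled bool) string {
	if !enabled {
		return text
	}
	runes := []rune(text)
	var out strings.Builder
	for i, r := range runes {
		inWord := i > 0 && i < len(runes)-1 &&
			unicode.IsLetter(runes[i-1]) && unicode.IsLetter(runes[i+1])
		if r == '\'' && inWord {
			out.WriteString("&rsquo;")
			continue
		}
		out.WriteRune(r)
	}
	return out.String()
}

func main() {
	fmt.Println(replaceContractions("we'll see", true))  // we&rsquo;ll see
	fmt.Println(replaceContractions("we'll see", false)) // we'll see
}
```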

Can this be implemented without breaking compatibility? The way I read the code, adding new flags should be relatively easy and could be used in other software like Hugo with few changes. However, passing things like extra characters seems to be a bigger task. Any suggestions? :)

bocki avatar Jun 11 '18 21:06 bocki

Any updates on this?

exploids avatar Sep 11 '19 12:09 exploids