smartquotes.js
Dashes and ellipses
Would you consider adding support for converting -- into an en dash (if between numbers, days, or months) or an em dash (otherwise), and ... into a proper ellipsis? Or is that out of scope? (It certainly goes beyond the name of the project, but is in the same spirit.)
I personally think that's a great idea. It realistically should be able to handle all those sorts of terrible typing choices.
Is that a hint that I should pull request it? :-) I can give it a shot...
If you've got time, go for it :) I think the ellipsis addition should be fairly straightforward, but the en/em dash one could be a little tricky to account for dates and such, although it'd be really classy looking. I should have some time to look at it in the next few days, too.
I'd be willing to take this one on.
- 3-3 & 3--3 -> 3–3
- something --- not sure what & something---not sure what -> something—not sure what
Only problem is that I have seen people write -- when they want an em dash. What then? "one -- three" seems valid (although weird), but it's difficult to know whether they want an em or en dash. I would suggest just blindly converting -- into an en dash and --- into an em dash, but maybe we want something smarter.
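For illustration, here is a minimal sketch of what "something smarter" might look like, combining the rule from the original request (en dash between numbers, em dash otherwise) with the --- convention. The function name convertDashes is hypothetical and not part of smartquotes.js, and days and months are not handled.

```javascript
// Hypothetical helper, not part of smartquotes.js.
const EN_DASH = '\u2013';
const EM_DASH = '\u2014';

function convertDashes(text) {
  return text
    // Three hyphens are always treated as an em dash. This must run first,
    // otherwise the two-hyphen rules below would consume part of the run.
    .replace(/---/g, EM_DASH)
    // Two hyphens between digits (optionally with one space on either side)
    // are treated as a range and become an en dash.
    .replace(/(?<=\d\s?)--(?=\s?\d)/g, EN_DASH)
    // Any remaining pair of hyphens is assumed to be an em dash.
    .replace(/--/g, EM_DASH);
}

console.log(convertDashes('3--3'));
console.log(convertDashes('one -- three'));
```

The ambiguity raised above remains: "one -- three" ends up with an em dash here, which may or may not be what the writer meant.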
Also, what about another PR to turn ```this crap'' (minus the `; can't work out how to escape it) into proper curly quotes?
EDIT: Didn't see you'd already attempted this in #5, sorry!
See comments on PR #5.
I'm reopening this because nothing else has been implemented and the topic has re-emerged here: https://github.com/kellym/smartquotesjs/issues/28.
@kellym First, thanks for caring about typography.
Fascinating (and exhausting) rabbit hole — user experience being convoluted.
Curious: the macOS and Medium implementations of smart quotes are both proprietary, right?
Have you guys stumbled upon standardized regular expressions to support en dash, em dash and ellipsis (among others)?
It would be nice to have parity with these “standardized” implementations.
I've added a few replacements that cover this, as well as some other common symbols. This uses the replacements hook mentioned in this comment.
smartquotes.replacements.push([/\-\-\-/g, '\u2014']); // em dash
smartquotes.replacements.push([/\-\-/g, '\u2013']); // en dash
smartquotes.replacements.push([/\.\.\./g, '\u2026']); // ellipsis
smartquotes.replacements.push([/\(c\)/gi, '\u00A9']); // copyright
smartquotes.replacements.push([/\(r\)/gi, '\u00AE']); // registered trademark
smartquotes.replacements.push([/\?!/g, '\u2048']); // ?!
smartquotes.replacements.push([/!!/g, '\u203C']); // !!
smartquotes.replacements.push([/\?\?/g, '\u2047']); // ??
smartquotes.replacements.push([/([0-9]\s?)\-(\s?[0-9])/g, '$1\u2013$2']); // number ranges use en dash
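A note on ordering, sketched without the library itself: assuming the pairs are applied in array order, the --- rule has to come before the -- rule, as it does above. The helper applyReplacements below is a hypothetical stand-in for whatever smartquotes does internally.

```javascript
// Hypothetical helper: applies [pattern, replacement] pairs in array order.
function applyReplacements(text, replacements) {
  return replacements.reduce(
    (result, [pattern, replacement]) => result.replace(pattern, replacement),
    text
  );
}

const emDash = [/---/g, '\u2014'];
const enDash = [/--/g, '\u2013'];
const numberRange = [/([0-9]\s?)-(\s?[0-9])/g, '$1\u2013$2'];

// Em dash rule first: the whole run of three hyphens is replaced.
const emFirst = applyReplacements('wait---what', [emDash, enDash]);
// En dash rule first: the first two hyphens are consumed and one is left over.
const enFirst = applyReplacements('wait---what', [enDash, emDash]);
// The number-range rule also matches hyphens inside ISO-style dates.
const isoDate = applyReplacements('2020-01-15', [numberRange]);

console.log(JSON.stringify([emFirst, enFirst, isoDate]));
```

With the en dash rule first, the result is an en dash followed by a stray hyphen. The last case shows a side effect of the number-range rule worth knowing about: hyphens in dates and similar digit groups get converted too.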
I was going to create a new issue, but maybe it's better to piggyback on this issue since it's so closely related.
I realize symbols are out of scope for this library, but it's very nice to be able to extend it and smartquotes does a great job of ignoring things like <code> blocks.
One thing I wish I could do more easily is remove custom replacements. For example, if I'm running smartquotes over a number of files and need the config to vary for each file, I currently need to keep track of each replacement I add and attempt to remove them from the array afterwards.
Without knowing how smartquotes is architected, I can think of two possible solutions for this. One is an object-oriented approach:
const sq1 = new SmartQuotes({ /* first config */ });
sq1.element(el1);
const sq2 = new SmartQuotes({ /* second config */ });
sq2.element(el2);
If this requires a lot of refactoring, perhaps another solution is to expose a second replacements array called customReplacements that we can safely destroy.
smartquotes.customReplacements = [ /* custom replacements */ ];
smartquotes.element(el);
smartquotes.customReplacements = []; // remove all custom replacements without borking defaults
My thinking is that it's probably easier to concatenate replacements and customReplacements than to refactor to support an OOP approach.
This isn't a show stopper at all; I can work around it, just not elegantly. I also appreciate that this may be way out of scope, but it's definitely a "nice to have" for custom replacements.
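For reference, the workaround described above could be sketched roughly as follows. It assumes only that the target object exposes a mutable replacements array, as in the push calls earlier in this thread. The helper name withTemporaryReplacements is hypothetical, and the stand-in object is used so the snippet runs without the library.

```javascript
// Hypothetical helper; `target` is any object with a `replacements` array,
// such as the smartquotes object used earlier in this thread.
function withTemporaryReplacements(target, extras, run) {
  const originalLength = target.replacements.length;
  target.replacements.push(...extras);
  try {
    return run();
  } finally {
    // Truncate back to the original length, which drops the appended
    // entries and leaves the defaults in place.
    target.replacements.length = originalLength;
  }
}

// Stand-in object so the sketch runs on its own.
const standIn = { replacements: [[/'/g, '\u2019']] };
const countDuringRun = withTemporaryReplacements(
  standIn,
  [[/\.\.\./g, '\u2026']],
  () => standIn.replacements.length
);
console.log(countDuringRun, standIn.replacements.length);
```

With the real library, the callback would presumably be something like () => smartquotes.element(el). This only works if nothing else appends to the array while the callback runs.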
@claviska Just a quick tip that dashes don't need to be escaped in regular expressions unless they appear in [] blocks, like [\-a-z].
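A quick check of that tip in plain JavaScript, with no library involved (the variable names are only for illustration):

```javascript
// Outside a character class, the escaped and unescaped forms behave the same.
const withEscapes = 'a---b'.replace(/\-\-\-/g, '\u2014');
const withoutEscapes = 'a---b'.replace(/---/g, '\u2014');
const sameResult = withEscapes === withoutEscapes;

// Inside a character class, a hyphen between two characters defines a range,
// so [a-z] does not match a literal hyphen.
const rangeMatchesHyphen = /[a-z]/.test('-');
// Escaping it makes it a literal hyphen again.
const escapedMatchesHyphen = /[\-a-z]/.test('-');
// A hyphen placed first in the class is also treated as a literal.
const leadingMatchesHyphen = /[-a-z]/.test('-');

console.log(sameResult, rangeMatchesHyphen, escapedMatchesHyphen, leadingMatchesHyphen);
```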