cmark-gfm

Recognize trailing non-ASCII punctuation in extended www autolinks

Open tats-u opened this issue 4 months ago • 0 comments

The autolink extension recognizes only ASCII punctuation as a terminator of the link range. This is incompatible with Chinese and Japanese text, where a URL is often followed directly by fullwidth punctuation.

ASCII parenthesis (https://example.com)

<!-- Japanese -->

全角カッコ（https://example.com）

See https://example.com.

<!-- Chinese -->

参见https://example.com。
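To make the expected behavior concrete, here is a minimal Python sketch of one possible rule. It is not cmark-gfm's actual code: the helper names (`is_non_ascii_punct`, `autolink_end`, `extract`) are made up for illustration, and treating every non-ASCII character with a punctuation general category (P*) as a terminator is only an assumption about what the fix could look like. The ASCII handling is a simplified version of the trailing-punctuation and parenthesis rules in the GFM spec.

```python
import unicodedata

# Trailing ASCII characters that the GFM spec does not count as part of an autolink.
ASCII_TRAILING = set("?!.,:*_~")


def is_non_ascii_punct(ch: str) -> bool:
    """True for a non-ASCII character whose general category is punctuation (P*)."""
    return ord(ch) > 0x7F and unicodedata.category(ch).startswith("P")


def autolink_end(text: str, start: int) -> int:
    """Return the exclusive end index of a URL autolink beginning at `start`."""
    end = start
    # Assumed rule: whitespace, '<', or any non-ASCII punctuation ends the candidate.
    while end < len(text):
        ch = text[end]
        if ch.isspace() or ch == "<" or is_non_ascii_punct(ch):
            break
        end += 1
    # Simplified ASCII rule: drop trailing punctuation and unbalanced ')'.
    while end > start:
        ch = text[end - 1]
        if ch in ASCII_TRAILING:
            end -= 1
        elif ch == ")" and text.count(")", start, end) > text.count("(", start, end):
            end -= 1
        else:
            break
    return end


def extract(text: str):
    """Return the first https:// autolink in `text`, or None if there is none."""
    start = text.find("https://")
    if start == -1:
        return None
    return text[start:autolink_end(text, start)]


samples = [
    "ASCII parenthesis (https://example.com)",
    "全角カッコ\uff08https://example.com\uff09",  # fullwidth parentheses U+FF08 / U+FF09
    "See https://example.com.",
    "参见https://example.com\u3002",  # ideographic full stop U+3002
]
for sample in samples:
    print(extract(sample))  # each line prints https://example.com
```

With a rule like this, all four samples would produce the same link target. One trade-off of the assumption is that a URL containing unescaped non-ASCII punctuation would be cut short, so a real fix may want a narrower set of characters.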

Unicode provides some data that may help:

  • https://www.unicode.org/Public/17.0.0/ucd/auxiliary/SentenceBreakProperty.txt
  • https://www.unicode.org/Public/17.0.0/ucd/BidiBrackets.txt
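As a rough illustration of how the bracket data could be used, the Python sketch below builds a closing-to-opening bracket map from text in the BidiBrackets.txt layout. It assumes the three semicolon-separated fields (code point, paired bracket, type `o` or `c`) with `#` comments; the embedded `SAMPLE` is a short hand-written excerpt in that layout rather than the full file, and `parse_bidi_brackets` is a hypothetical helper name.

```python
# Hand-written excerpt in the BidiBrackets.txt layout (assumed):
# code point; paired bracket; type (o = open, c = close) # name
SAMPLE = """\
# Illustrative excerpt, not the full file
0028; 0029; o # LEFT PARENTHESIS
0029; 0028; c # RIGHT PARENTHESIS
300C; 300D; o # LEFT CORNER BRACKET
300D; 300C; c # RIGHT CORNER BRACKET
FF08; FF09; o # FULLWIDTH LEFT PARENTHESIS
FF09; FF08; c # FULLWIDTH RIGHT PARENTHESIS
"""


def parse_bidi_brackets(data: str) -> dict[str, str]:
    """Map each closing bracket character to its opening partner."""
    close_to_open = {}
    for line in data.splitlines():
        # Strip the trailing comment and skip blank or comment-only lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        code, paired, kind = (field.strip() for field in line.split(";"))
        if kind == "c":
            close_to_open[chr(int(code, 16))] = chr(int(paired, 16))
    return close_to_open


CLOSE_TO_OPEN = parse_bidi_brackets(SAMPLE)
print(sorted(f"U+{ord(c):04X}" for c in CLOSE_TO_OPEN))
```

A table like this would let the existing balanced-parenthesis check apply to fullwidth and CJK brackets as well as to ASCII `(` and `)`. SentenceBreakProperty.txt could be read in a similar way to collect sentence-final punctuation.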

tats-u, Nov 06 '25 14:11