commonmark.js

Sourcepos not defined for inline elements

Open okt17 opened this issue 6 years ago • 3 comments

Trying to parse this file with commonmark as follows:

    const parser = new commonmark.Parser({ sourcepos: true });
    const result = parser.parse(source);

Where source is:

    # API name

    # Group Users

    ## User [/users/{id}]

    + Parameters
        + id: 23 (enum[number], optional) - Database ID

            Additional description

            + Default: 1

            + Members
                + 37 - Testing value
                + 1
                + 23

    ### Retrieve User [GET]

    + Response 200 (application/json)

            {}

The text node whose literal is "Additional description" ends up with an undefined sourcepos. According to the description, the sourcepos property is listed under "The following public properties are defined", so I expected it to be set.

The resulting node is:

Node {
  _type: 'text',
  _parent:
   Node {
     _type: 'paragraph',
     _parent:
      Node {
        _type: 'item',
        _parent: [Object],
        _firstChild: [Object],
        _lastChild: [Object],
        _prev: null,
        _next: null,
        _sourcepos: [Array],
        _lastLineBlank: true,
        _open: false,
        _string_content: '',
        _literal: null,
        _listData: [Object],
        _info: null,
        _destination: null,
        _title: null,
        _isFenced: false,
        _fenceChar: null,
        _fenceLength: 0,
        _fenceOffset: null,
        _level: null,
        _onEnter: null,
        _onExit: null },
     _firstChild: [Circular],
     _lastChild: [Circular],
     _prev:
      Node {
        _type: 'paragraph',
        _parent: [Object],
        _firstChild: [Object],
        _lastChild: [Object],
        _prev: null,
        _next: [Circular],
        _sourcepos: [Array],
        _lastLineBlank: true,
        _open: false,
        _string_content: null,
        _literal: null,
        _listData: {},
        _info: null,
        _destination: null,
        _title: null,
        _isFenced: false,
        _fenceChar: null,
        _fenceLength: 0,
        _fenceOffset: null,
        _level: null,
        _onEnter: null,
        _onExit: null },
     _next:
      Node {
        _type: 'list',
        _parent: [Object],
        _firstChild: [Object],
        _lastChild: [Object],
        _prev: [Circular],
        _next: null,
        _sourcepos: [Array],
        _lastLineBlank: true,
        _open: false,
        _string_content: '',
        _literal: null,
        _listData: [Object],
        _info: null,
        _destination: null,
        _title: null,
        _isFenced: false,
        _fenceChar: null,
        _fenceLength: 0,
        _fenceOffset: null,
        _level: null,
        _onEnter: null,
        _onExit: null },
     _sourcepos: [ [Array], [Array] ],
     _lastLineBlank: true,
     _open: false,
     _string_content: null,
     _literal: null,
     _listData: {},
     _info: null,
     _destination: null,
     _title: null,
     _isFenced: false,
     _fenceChar: null,
     _fenceLength: 0,
     _fenceOffset: null,
     _level: null,
     _onEnter: null,
     _onExit: null },
  _firstChild: null,
  _lastChild: null,
  _prev: null,
  _next: null,

  _sourcepos: undefined,

  _lastLineBlank: false,
  _open: true,
  _string_content: null,

  _literal: 'Additional description',

  _listData: {},
  _info: null,
  _destination: null,
  _title: null,
  _isFenced: false,
  _fenceChar: null,
  _fenceLength: 0,
  _fenceOffset: null,
  _level: null,
  _onEnter: null,
  _onExit: null }

Expected: node.sourcepos contains an array

Actual: node.sourcepos is undefined
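To see which nodes are affected, the tree can be walked and every node without a position collected. This is a minimal sketch: `nodesWithoutSourcepos` is a hypothetical helper name, not part of commonmark.js. It relies only on the library's tree walker (`root.walker()`, whose `next()` returns `{ entering, node }` events and `null` at the end) and the public `sourcepos` property.

```javascript
// Hypothetical helper (not part of commonmark.js): collect every node
// in the tree whose sourcepos is missing.
function nodesWithoutSourcepos(root) {
  const missing = [];
  const walker = root.walker();
  let event;
  // walker.next() returns null once the whole tree has been visited.
  while ((event = walker.next())) {
    // Container nodes are visited twice (enter and exit); count them once.
    if (event.entering && !event.node.sourcepos) {
      missing.push(event.node);
    }
  }
  return missing;
}

// Usage with commonmark.js (assumes the package is installed):
// const commonmark = require("commonmark");
// const parser = new commonmark.Parser({ sourcepos: true });
// const root = parser.parse(source);
// console.log(nodesWithoutSourcepos(root).map((n) => n.type));
```

If only block-level nodes receive positions, the result should contain inline node types only.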

okt17, Jan 29 '19 09:01

Currently only block-level nodes (paragraphs, code blocks, etc.) get sourcepos set; inline nodes do not. It would be good to add sourcepos to inlines; this has now been done in cmark, which has largely parallel code.
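Until inlines carry their own positions, one possible workaround is to fall back to the position of the nearest enclosing block. The sketch below assumes only the public `parent` and `sourcepos` node properties; `nearestSourcepos` is a hypothetical helper, not a commonmark.js API, and the result is the block's range, not the exact range of the inline.

```javascript
// Hypothetical helper (not part of commonmark.js): return the sourcepos
// of the node itself or, failing that, of its closest ancestor that has one.
function nearestSourcepos(node) {
  let current = node;
  while (current) {
    if (current.sourcepos) {
      return current.sourcepos;
    }
    // Inline nodes have no position of their own, so climb towards the root.
    current = current.parent;
  }
  // No ancestor has a position (e.g. parsed without { sourcepos: true }).
  return undefined;
}
```

For the text node in this issue, the helper would return the range of the paragraph that contains it, which is coarse but enough to point a user at the right lines.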

jgm, Jan 10 '20 17:01

I've just run into this limitation in my own project. I had a look to see whether I could add this, but I'm not familiar enough with the codebase. It looks like the test suite only tests parsing the Markdown and generating JavaScript; I can't see any coverage of the intermediate products.

jameswilddev, Nov 21 '21 16:11

In case you want to work on it, here is what needs to be done. (I'd suggest looking at the parallel change in cmark too.)

Each block node that will eventually contain inlines has a _string_content field that holds its unparsed string content. This string gets passed to parseInlines (in lib/inlines.js), but because it carries no source position information, the inline parser has nothing to record. The required change is this: instead of appending new lines to _string_content, the block parser would have to maintain an array of (sourcepos, string) pairs, one for each line, where the sourcepos indicates where the string starts on the line. The inline parser would then need to be taught how to consume this array (instead of a plain string) and track the source position as it goes.
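The idea can be sketched as a small standalone data structure. This is only an illustration of the (sourcepos, string) pairs described above, not the actual change to commonmark.js: `addLine`, `joinSegments`, and `positionAt` are hypothetical names, and lines and columns are assumed to be 1-based.

```javascript
// Hypothetical sketch; none of these functions exist in commonmark.js.

// Block-parser side: instead of appending to one string, record each
// content line together with the position where it starts in the source.
function addLine(segments, line, column, text) {
  segments.push({ line, column, text });
}

// Inline-parser side: the text that would be scanned for inlines.
function joinSegments(segments) {
  return segments.map((seg) => seg.text).join("\n");
}

// Map an offset into the joined text back to a [line, column] pair.
function positionAt(segments, offset) {
  let remaining = offset;
  for (const seg of segments) {
    if (remaining <= seg.text.length) {
      return [seg.line, seg.column + remaining];
    }
    // Skip this line plus the "\n" that joined it to the next one.
    remaining -= seg.text.length + 1;
  }
  throw new RangeError("offset is past the end of the content");
}

// A two-line paragraph whose content starts at column 5 on lines 3 and 4.
const segments = [];
addLine(segments, 3, 5, "foo *bar");
addLine(segments, 4, 5, "baz* qux");
const openingStar = joinSegments(segments).indexOf("*");
console.log(positionAt(segments, openingStar));
```

An inline parser built on this would call something like `positionAt` when it opens and closes each inline node, so that an emphasis spanning both lines gets a start on the first line and an end on the second.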

jgm, Nov 21 '21 17:11