
Token Relational Properties

Open calculuschild opened this issue 3 years ago • 4 comments

Just a basic attempt at adding some sibling properties to tokens, based on discussion #2097. This adds previousSibling and nextSibling properties to each token so you can more easily traverse the token tree (e.g., when building an extension or using walkTokens). Next steps might include parent and children properties as well.
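As a rough illustration of what the proposed properties would allow, the sketch below links a hand-written, flat token array and then checks each token's neighbour. The `linkSiblings` helper, the sample tokens, and the use of `null` at the ends of the list are assumptions made for the example; only the `previousSibling` and `nextSibling` names come from this PR.

```javascript
// Hypothetical helper: attach the sibling links proposed in this PR
// to a flat array of token-like objects. Using null at both ends is
// an assumption of this sketch.
function linkSiblings(tokens) {
  tokens.forEach((token, i) => {
    token.previousSibling = tokens[i - 1] ?? null;
    token.nextSibling = tokens[i + 1] ?? null;
  });
  return tokens;
}

// Hand-written stand-ins for lexer output (not produced by marked here).
const sampleTokens = linkSiblings([
  { type: 'heading', text: 'Install' },
  { type: 'code', text: 'npm install marked' },
  { type: 'paragraph', text: 'Done.' }
]);

// A walkTokens-style callback can now inspect a neighbour directly,
// e.g. to find headings that are immediately followed by a code block.
const headingsBeforeCode = sampleTokens
  .filter((t) => t.type === 'heading' && t.nextSibling?.type === 'code')
  .map((t) => t.text);
```

Without the links, the same check needs the containing array and the token's index, which a per-token callback does not receive.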

This does break a lot of unit tests that expect a token to have no extra properties.

A couple benchmark runs show that this might be a bit slower but it's hard to tell by how much. Just want rough feedback if this is even a good way to do it.

Contributor

  • [ ] Test(s) exist to ensure functionality and minimize regression (if no tests added, list tests covering this PR); or,
  • [ ] no tests required for this PR.
  • [ ] If submitting new feature, it has been documented in the appropriate places.

Committer

In most cases, this should be a different person than the contributor.

calculuschild • Jun 13 '21 02:06

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/markedjs/markedjs/ifoSCuZ9R1JqWDfVPbDEsPgzMA3T
✅ Preview: https://markedjs-git-fork-calculuschild-tokenrelational-092c6a-markedjs.vercel.app

vercel[bot] • Jun 13 '21 02:06

This still seems to me like it can be solved better by providing all of the tokens in a hook. I feel like the amount of work needed to make this work would not be worth it since it is such a rare thing in markdown to make rules based on sibling or parent tokens.

UziTech • Jun 22 '21 23:06

> it can be solved better by providing all of the tokens in a hook.

That would be solving a different problem though. It is related, but not directly the purpose of this PR.

  1. Providing tokens in a hook SOLVES: Users do not have a way to access the token tree without splitting apart the render steps.

  2. Providing sibling tokens SOLVES: Users do not have a straightforward way to access sibling tokens.

Hooks would allow users to code up their own solution to 2), but they do not provide the solution to 2) directly.

> it is such a rare thing in markdown to make rules based on sibling or parent tokens.

But it does happen, and coding around those cases even in Marked.js is obnoxious. If you have a code block, you have to check the previous sibling to make sure it's not interrupting a paragraph. If you have a text token, you need to check if the previous sibling is also text so you can merge them. In a blockquote token, you need to know if the parent was at the top level or not to decide how to parse the internal tokens. A list needs to know if its children items have blank lines after being parsed before it knows if it is "loose" or not.
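The text-merging case, for example, comes down to a check against the previous sibling. A minimal sketch, assuming tokens are plain objects with `type`, `raw`, and `text` fields; the `mergeAdjacentText` name is made up for illustration and is not part of marked, and the real lexer's merging rules may differ:

```javascript
// Hypothetical helper: collapse runs of adjacent text tokens into one.
function mergeAdjacentText(tokens) {
  const merged = [];
  for (const token of tokens) {
    // The "previous sibling" is whatever was emitted last.
    const previous = merged[merged.length - 1];
    if (token.type === 'text' && previous?.type === 'text') {
      // Fold this token into its previous sibling instead of appending it.
      previous.raw += token.raw;
      previous.text += token.text;
    } else {
      // Copy so the input tokens are left untouched.
      merged.push({ ...token });
    }
  }
  return merged;
}

// Hand-written sample tokens, not lexer output.
const mergedTokens = mergeAdjacentText([
  { type: 'text', raw: 'foo', text: 'foo' },
  { type: 'text', raw: ' bar', text: ' bar' },
  { type: 'strong', raw: '**baz**', text: 'baz' },
  { type: 'text', raw: '!', text: '!' }
]);
```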

> the amount of work needed to make this work would not be worth it

But the work is already in the PR? At least for sibling tokens, we already check for lastToken = tokens[tokens.length - 1] multiple times, and this just captures that work we are already doing and saves it.
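Put differently, the link can be recorded at the moment a token is appended, which is where the last token is already being looked up. A hedged sketch of that idea, using a hypothetical `pushWithSiblings` helper rather than the actual code in this PR:

```javascript
// Hypothetical helper: append a token and record sibling links using the
// same `tokens[tokens.length - 1]` lookup mentioned above.
function pushWithSiblings(tokens, token) {
  const lastToken = tokens[tokens.length - 1];
  token.previousSibling = lastToken ?? null; // null for the first token (assumption)
  token.nextSibling = null;                  // filled in when the next token arrives
  if (lastToken) {
    lastToken.nextSibling = token;
  }
  tokens.push(token);
  return token;
}

// Hand-written sample tokens, not lexer output.
const blockTokens = [];
pushWithSiblings(blockTokens, { type: 'paragraph', raw: 'one' });
pushWithSiblings(blockTokens, { type: 'space', raw: '\n\n' });
pushWithSiblings(blockTokens, { type: 'code', raw: '    two' });
```

The extra cost per token is two property writes, although it does add properties to every token, which is what the affected unit tests would notice.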

I would also be happy to stop with just sibling tokens without adding children and parents because we already have the child tokens properties that are user-accessible.

calculuschild • Jun 23 '21 00:06

But I agree that at a minimum we should expose tokens via hooks so users have at least some way to modify the token tree.

calculuschild • Jun 23 '21 01:06

I'm going to close this as it can be done with the processAllTokens hook in an extension.

Something like:

import { Marked } from 'marked';

function tokenRelationalProperties() {
  // Recurse into every place a token can hold child tokens.
  function setChildren(parentToken) {
    switch (parentToken.type) {
      case 'table': {
        for (const header of parentToken.header) {
          setProperties(parentToken, header.tokens);
          header.tokens.forEach(setChildren);
        }
        for (const row of parentToken.rows) {
          for (const cell of row) {
            setProperties(parentToken, cell.tokens);
            cell.tokens.forEach(setChildren);
          }
        }
        break;
      }
      case 'list': {
        setProperties(parentToken, parentToken.items);
        parentToken.items.forEach(setChildren);
        break;
      }
      default: {
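        // `marked` is the Marked instance created below; extensions may declare extra child-token properties in `childTokens`.
        if (marked.defaults.extensions?.childTokens?.[parentToken.type]) {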
          marked.defaults.extensions.childTokens[parentToken.type].forEach((childTokens) => {
            setProperties(parentToken, parentToken[childTokens]);
            parentToken[childTokens].forEach(setChildren);
          });
        } else if (parentToken.tokens) {
          setProperties(parentToken, parentToken.tokens);
          parentToken.tokens.forEach(setChildren);
        }
      }
    }
  }

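  // Link each token to its parent and to its immediate neighbours (undefined at either end of the list).
  function setProperties(parent, tokens) {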
    for (let i = 0; i < tokens.length; i++) {
      const token = tokens[i];
      token.parent = parent;
      token.nextSibling = tokens[i + 1];
      token.previousSibling = tokens[i - 1];
    }
  }

  return {
    hooks: {
      processAllTokens(tokens) {
        setProperties(null, tokens);
        tokens.forEach(setChildren);
        return tokens;
      }
    }
  };
}

const marked = new Marked(
  {
    hooks: {
      processAllTokens(tokens) {
        console.log(tokens);
        return tokens;
      }
    }
  },
  tokenRelationalProperties()
);

const html = marked.parse(`
# test **markdown**

with a [link](https://github.com/markedjs/marked)

---

- and
- a
- list

| and | with  |
| --- | ----- |
|  a  | table |
`);

console.log(html);
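One thing to keep in mind with this approach: `parent`, `nextSibling`, and `previousSibling` point back into the tree, so the tokens become circular and `JSON.stringify` throws on them. A small sketch of one way around that, using a replacer that skips the three relational keys (the `stringifyTokens` helper and the sample tokens are hypothetical, not part of marked):

```javascript
// Keys added by the extension above that create cycles in the token tree.
const RELATIONAL_KEYS = new Set(['parent', 'nextSibling', 'previousSibling']);

// Hypothetical helper: serialize tokens while omitting the relational keys.
function stringifyTokens(tokens) {
  return JSON.stringify(tokens, (key, value) =>
    RELATIONAL_KEYS.has(key) ? undefined : value
  );
}

// Two hand-written tokens that reference each other.
const first = { type: 'text', raw: 'a' };
const second = { type: 'text', raw: 'b' };
first.nextSibling = second;
second.previousSibling = first;

// Plain JSON.stringify rejects the cycle with a TypeError.
let plainStringifyFailed = false;
try {
  JSON.stringify([first, second]);
} catch (error) {
  plainStringifyFailed = error instanceof TypeError;
}

// The replacer version succeeds because the cyclic keys are dropped.
const json = stringifyTokens([first, second]);
```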

UziTech • Apr 05 '24 06:04