Token Relational Properties
Just a base attempt at adding some sibling properties to tokens, based on discussion #2097. Builds a `previousSibling` and a `nextSibling` property into each token so you can more easily traverse the token tree (e.g., when building an extension or using `walkTokens`). Next steps might include `parent` and `children` properties as well.
This does break a lot of unit tests that expect a token to have no extra properties.
A couple of benchmark runs suggest this might be a bit slower, but it's hard to tell by how much. I just want rough feedback on whether this is even a good way to do it.
Contributor
- [ ] Test(s) exist to ensure functionality and minimize regression (if no tests added, list tests covering this PR); or,
- [ ] no tests required for this PR.
- [ ] If submitting new feature, it has been documented in the appropriate places.
Committer
In most cases, this should be a different person than the contributor.
- [ ] CI is green (no forced merge required).
- [ ] Squash and Merge PR following conventional commit guidelines.
This still seems to me like it can be solved better by providing all of the tokens in a hook. I feel like the amount of work needed to make this work would not be worth it since it is such a rare thing in markdown to make rules based on sibling or parent tokens.
> it can be solved better by providing all of the tokens in a hook.
That would be solving a different problem though. It is related, but not directly the purpose of this PR.
1. Providing tokens in a hook SOLVES: Users do not have a way to access the token tree without splitting apart the render steps.
2. Providing sibling tokens SOLVES: Users do not have a straightforward way to access sibling tokens.

Hooks would allow users to code up their own solution to 2), but they do not provide the solution to 2) directly.
> it is such a rare thing in markdown to make rules based on sibling or parent tokens.
But it does happen, and coding around those cases even in Marked.js is obnoxious. If you have a code block, you have to check the previous sibling to make sure it's not interrupting a paragraph. If you have a text token, you need to check if the previous sibling is also text so you can merge them. In a blockquote token, you need to know if the parent was at the top level or not to decide how to parse the internal tokens. A list needs to know if its children items have blank lines after being parsed before it knows if it is "loose" or not.
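To make the text-merging case concrete, here is a minimal sketch. It works on hypothetical plain objects shaped like Marked tokens and is not Marked's actual lexer code; `mergeAdjacentText` is an illustrative name, and the merge rule is simplified (it just concatenates `raw` and `text`).

```javascript
// Hypothetical helper: merge runs of adjacent `text` tokens in a flat token array.
// Nothing here calls Marked; the token shape ({ type, raw, text }) is an assumption.
function mergeAdjacentText(tokens) {
  const merged = [];
  for (const token of tokens) {
    const previousSibling = merged[merged.length - 1];
    if (token.type === 'text' && previousSibling?.type === 'text') {
      // Fold this token into its previous sibling instead of appending it.
      previousSibling.raw += token.raw;
      previousSibling.text += token.text;
    } else {
      // Shallow-copy so the input tokens are left untouched.
      merged.push({ ...token });
    }
  }
  return merged;
}

const sample = [
  { type: 'text', raw: 'foo', text: 'foo' },
  { type: 'text', raw: 'bar', text: 'bar' },
  { type: 'space', raw: '\n\n' },
  { type: 'text', raw: 'baz', text: 'baz' }
];

console.log(mergeAdjacentText(sample).map((t) => t.type));
```

Every rule of this kind has to repeat the "look at the last pushed token" step, which is the lookup a `previousSibling` property would make directly available.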
> the amount of work needed to make this work would not be worth it
But the work is already in the PR? At least for sibling tokens, we already check for `lastToken = tokens[tokens.length - 1]` multiple times, and this just captures that work we are already doing and saves it.
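As a rough illustration of "capturing that work", the sketch below links siblings at the moment a token is appended. `pushWithSiblings` is a hypothetical helper, not a function in Marked or in this PR's diff, and using `null` for a missing sibling is an assumption of this sketch.

```javascript
// Hypothetical helper: append a token and link it to the current last token.
function pushWithSiblings(tokens, token) {
  // Same lookup the lexer already performs before deciding how to handle a token.
  const lastToken = tokens[tokens.length - 1];
  token.previousSibling = lastToken ?? null;
  token.nextSibling = null;
  if (lastToken) {
    lastToken.nextSibling = token;
  }
  tokens.push(token);
  return token;
}

const tokens = [];
pushWithSiblings(tokens, { type: 'heading', raw: '# a\n' });
pushWithSiblings(tokens, { type: 'paragraph', raw: 'b\n' });
pushWithSiblings(tokens, { type: 'code', raw: '    c' });

console.log(tokens[1].previousSibling.type, tokens[1].nextSibling.type);
```

The extra cost per token is two property writes, which fits the observation that the benchmark difference is small and hard to measure.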
I would also be happy to stop with just sibling tokens without adding `children` and `parents`, because we already have the child tokens properties that are user-accessible.
But I agree that at a minimum we should expose tokens via hooks so users have at least some way to modify the token tree.
I'm going to close this as it can be done with the `processAllTokens` hook in an extension.
Something like:
import { Marked } from 'marked';

function tokenRelationalProperties() {
  // Recursively link the children of `parentToken`, wherever that token type stores them.
  function setChildren(parentToken) {
    switch (parentToken.type) {
      case 'table': {
        // Table cells keep their inline tokens under header[].tokens and rows[][].tokens.
        for (const header of parentToken.header) {
          setProperties(parentToken, header.tokens);
          header.tokens.forEach(setChildren);
        }
        for (const row of parentToken.rows) {
          for (const cell of row) {
            setProperties(parentToken, cell.tokens);
            cell.tokens.forEach(setChildren);
          }
        }
        break;
      }
      case 'list': {
        // Lists keep their children under `items` rather than `tokens`.
        setProperties(parentToken, parentToken.items);
        parentToken.items.forEach(setChildren);
        break;
      }
      default: {
        // Extension tokens may declare their child arrays via `childTokens`.
        // Note: `marked` is the instance created further down.
        if (marked.defaults.extensions?.childTokens?.[parentToken.type]) {
          marked.defaults.extensions.childTokens[parentToken.type].forEach((childTokens) => {
            setProperties(parentToken, parentToken[childTokens]);
            parentToken[childTokens].forEach(setChildren);
          });
        } else if (parentToken.tokens) {
          setProperties(parentToken, parentToken.tokens);
          parentToken.tokens.forEach(setChildren);
        }
      }
    }
  }

  // Set parent and sibling links on every token in one array.
  // The first and last tokens get `undefined` for the missing sibling.
  function setProperties(parent, tokens) {
    for (let i = 0; i < tokens.length; i++) {
      const token = tokens[i];
      token.parent = parent;
      token.nextSibling = tokens[i + 1];
      token.previousSibling = tokens[i - 1];
    }
  }

  return {
    hooks: {
      processAllTokens(tokens) {
        setProperties(null, tokens);
        tokens.forEach(setChildren);
        return tokens;
      }
    }
  };
}

const marked = new Marked(
  {
    hooks: {
      processAllTokens(tokens) {
        console.log(tokens);
        return tokens;
      }
    }
  },
  tokenRelationalProperties()
);

const html = marked.parse(`
# test **markdown**

with a [link](https://github.com/markedjs/marked)

---

- and
- a
- list

| and | with  |
| --- | ----- |
| a   | table |
`);

console.log(html);
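
Once tokens carry these links, a consumer can answer questions like "is this blockquote at the top level?" without re-walking the tree. The sketch below is self-contained and does not call Marked: `link` is a simplified stand-in for the hook above that only follows `token.tokens`, and `depth` is a hypothetical consumer.

```javascript
// Simplified stand-in for the hook above: only follows `token.tokens`,
// so tables and lists (which nest under other keys) are not handled here.
function link(tokens, parent = null) {
  tokens.forEach((token, i) => {
    token.parent = parent;
    token.previousSibling = tokens[i - 1];
    token.nextSibling = tokens[i + 1];
    if (Array.isArray(token.tokens)) {
      link(token.tokens, token);
    }
  });
  return tokens;
}

// Hypothetical consumer: count the ancestors between a token and the root.
function depth(token) {
  let count = 0;
  for (let current = token.parent; current; current = current.parent) {
    count++;
  }
  return count;
}

// Hand-built tree shaped like Marked tokens (an assumption for illustration).
const tree = link([
  {
    type: 'blockquote',
    tokens: [
      { type: 'paragraph', tokens: [{ type: 'text', raw: 'quoted' }] },
      { type: 'blockquote', tokens: [{ type: 'paragraph', tokens: [] }] }
    ]
  },
  { type: 'hr' }
]);

const outer = tree[0];
const inner = outer.tokens[1];

console.log(depth(outer), depth(inner));
console.log(inner.previousSibling.type, outer.nextSibling.type);
```

Because the links are circular references, a linked tree like this can no longer be passed to `JSON.stringify` directly.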