hexo
hexo copied to clipboard
Async rendering of tags within a post
What does it do?
Fix https://github.com/hexojs/hexo/issues/4923.
Currently, if we use async rendering in custom tag plugins, rendering of tags can run concurrently between posts. However, tags within the same post are still rendered sequentially.
This is because when rendering a post, hexo makes a single render call to nunjucks to render the entire post. Inside nunjucks, it renders all the tags sequentially.
Instead of making one render call per-post, this PR does the following:
- split the post into chunks
- render each chunk separately, with async call
This way, tags in different chunks will be rendered concurrently. This has accelerated my whole-site generation by 10x because I have some expensive tags.
The following changes in this PR are also necessary:
- Add a HTML parser,
parse5
as a dependency.parse5
is pretty popular and is also the backend ofcheerio
. Let me know if a different parser is preferred. - Modify a test because the output is changed. Note that the new output is equivalent to the old output.
Screenshots
N/A
Pull request tasks
- [ ] Add test cases for the changes.
- [ ] Passed the CI test.
Things I'm not sure of:
- Should this be made a per-site option, per-post option, or no option at all?
- Will this affect rendering results? I found that in nunjucks, users can define variables in one tag and use it in another. If each tag is rendered independently, this might stop working. However, I don't know if this matters at all because this "feature" doesn't seem useful to hexo.
How to test
git clone -b master https://github.com/ppwwyyxx/hexo.git
cd hexo
npm install
npm test
- Will this affect rendering results? I found that in nunjucks, users can define variables in one tag and use it in another. If each tag is rendered independently, this might stop working. However, I don't know if this matters at all because this "feature" doesn't seem useful to hexo.
Although Hexo's documentation doesn't mention this behavior, I believe some sites are depending on this (each post has its own context of nunjucks) to work. If the changes have to be made, it will become a breaking change.
Thanks @SukkaW for comments. I noticed another issue also related to nunjucks context, noted at https://github.com/hexojs/hexo/issues/4923#issuecomment-1082601798. So I changed this to draft as more discussions are needed on the proper solution.
I managed to fix the abovementioned problem with a different approach (starting from top-level HTML nodes). Now all tests can pass. @SukkaW can you give it a review? Thanks.
Can maintainers take a look? Thanks!
Can maintainers take a look? Thanks!
The changes themselves look great, but I am still afraid of introducing the change would break existing sites.
Also, introducing an HTML parser to the post rendering might have created too much overhead as well.
coverage: 99.531% (+0.002%) from 99.529% when pulling 01ca1abde92a051ded96a698bfd1bc277bd20452 on ppwwyyxx:master into 7b588e78aae57e756e4d18bcd78e63d9dc7d34cd on hexojs:v7.0.0.
The changes themselves look great, but I am still afraid of introducing the change would break existing sites.
The new approach in this PR seems less likely to break existing behaviors. I'm not familiar with advanced usage of tags, do you have an example of what usage might be broken?
If rendering speed or backward-compatibility is a concern, can we make this a per-post option? This would speed up rendering of posts with a lot of slow tags.
@ppwwyyxx
I pull this PR to my local machine and run hexo server
with test repository. But, it has the following error. Do you know what a problem is?
TypeError: parse5.serializeOuter is not a function
at C:\Users\hexo\hexo\lib\hexo\post.js:435:29
at Array.forEach (<anonymous>)
at C:\Users\hexo\hexo\lib\hexo\post.js:433:22
at tryCatcher (C:\Users\hexo\hexo\node_modules\bluebird\js\release\util.js:16:23)
at Promise._settlePromiseFromHandler (C:\Users\hexo\hexo\node_modules\bluebird\js\release\promise.js:547:31)
at Promise._settlePromise (C:\Users\hexo\hexo\node_modules\bluebird\js\release\promise.js:604:18)
at Promise._settlePromise0 (C:\Users\hexo\hexo\node_modules\bluebird\js\release\promise.js:649:10)
at Promise._settlePromises (C:\Users\hexo\hexo\node_modules\bluebird\js\release\promise.js:729:18)
at _drainQueueStep (C:\Users\hexo\hexo\node_modules\bluebird\js\release\async.js:93:12)
at _drainQueue (C:\Users\hexo\hexo\node_modules\bluebird\js\release\async.js:86:9)
at Async._drainQueues (C:\Users\hexo\hexo\node_modules\bluebird\js\release\async.js:102:5)
at Immediate.Async.drainQueues [as _onImmediate] (C:\Users\hexo\hexo\node_modules\bluebird\js\release\async.js:15:14)
at processImmediate (node:internal/timers:466:21)
P.S I can provide a test repository which the error occurs.
@yoshinorin did you install the correct version of parse5 indicated in package.json
? The function does exist: https://parse5.js.org/functions/parse5.serializeOuter.html, but not in older versions.
@ppwwyyxx
did you install the correct version of parse5 indicated in package.json?
Sorry, you're right.
The changes themselves look great, but I am still afraid of introducing the change would break existing sites. Also, introducing an HTML parser to the post rendering might have created too much overhead as well.
@SukkaW
How about merge this PR once & ship with v7.0.0
?
Just now I tried hexo server
& hexo generate
with 1400 posts (3829 files) locally, generation speeds are the same with v6.3.0
. Both versions generate content with in 25 ~ 26s
. (I tried 3 times cold generation)
If we will face a problem after release we just revert it and release a patch version.
We need to benchmark on some real-world Hexo websites to evaluate the impact of generation speed @SukkaW @yoshinorin
@stevenjoezhang
We need to benchmark on some real-world Hexo websites to evaluate the impact of generation speed
As I already wrote in https://github.com/hexojs/hexo/pull/4926#issuecomment-1295191717 I tried it locally with customized https://github.com/LouisBarranqueiro/hexo-theme-tranquilpeak (hexo-theme-tranquilpeak was a theme used in my blog.)
I think performance is enough, but I don't know what @SukkaW is concerned about. As you know @SukkaW is more knowledgeable about hexo than me. @SukkaW may notice some performance issue.
I think performance is enough, but I don't know what @SukkaW is concerned about. As you know @SukkaW is more knowledgeable about hexo than me. @SukkaW may notice some performance issue.
What I am really worried about is the breaking change.
{{ let a = '1' }}
A lazy dog jump over a fox
{{ a }}
This should render `1`
Hexo currently supports this behavior. But after the PR, this will break (as nunjucks blocks would be rendered separately).
Unlike the other breaking changes we decided to introduce, there is no way to migrate this easily. Users would probably have to migrate hundreds of posts and end up deciding not to upgrade at all.
@yoshinorin @stevenjoezhang
Perhaps we could add a new per-site option to the _config.yml
to turn this on or off @ppwwyyxx
Happy to make this optional. Shall it be a per-post option instead of per-site?
Happy to make this optional. Shall it be a per-post option instead of per-site?
I'd prefer to have both: an option defined in _config.yml
, and can opt-in/opt-out per post.
I added a per-post option async_tags
. I found it a bit confusing to add a site-wise option:
- If I call the site-wise option
async_tags
then it's confusing because tags across posts can always be rendered async regardless of this option. This PR is only about tags in a post. - If I call the site-wise option something like
async_tags_within_post
, then it's too long and doesn't match the name of the per-post option.
I'd appreciate any suggestions on naming.
Also, I guess a site-wise option is not that important? In most cases, this option doesn't matter because I think (1) slow tags are rare (2) tags that depend on other tags are rare. So a per-post option might be already sufficient for users.
I wonder if this can be merged!
In the current PR the option is opt-in per-post, which means users would have to enable this option 1000 times to experience that kind of overhead for 1000 posts :) There is no overhead if users don't enable the option.
Since we have to parse HTML to know where tags are, I feel there is not much room to improve. Unless nunjucks itself can support such feature, then we don't have to parsing on hexo's side.
How to test
git clone -b master https://github.com/ppwwyyxx/hexo.git
cd hexo
npm install
npm test