hexo icon indicating copy to clipboard operation
hexo copied to clipboard

Async rendering of tags within a post

Open ppwwyyxx opened this issue 2 years ago • 21 comments

What does it do?

Fix https://github.com/hexojs/hexo/issues/4923.

Currently, if we use async rendering in custom tag plugins, rendering of tags can run concurrently between posts. However, tags within the same post are still rendered sequentially.

This is because when rendering a post, hexo makes a single render call to nunjucks to render the entire post. Inside nunjucks, it renders all the tags sequentially.

Instead of making one render call per-post, this PR does the following:

  1. split the post into chunks
  2. render each chunk separately, with async call

This way, tags in different chunks will be rendered concurrently. This has accelerated my whole-site generation by 10x because I have some expensive tags.

The following changes in this PR are also necessary:

  • Add a HTML parser, parse5 as a dependency. parse5 is pretty popular and is also the backend of cheerio. Let me know if a different parser is preferred.
  • Modify a test because the output is changed. Note that the new output is equivalent to the old output.

Screenshots

N/A

Pull request tasks

  • [ ] Add test cases for the changes.
  • [ ] Passed the CI test.

Things I'm not sure of:

  1. Should this be made a per-site option, per-post option, or no option at all?
  2. Will this affect rendering results? I found that in nunjucks, users can define variables in one tag and use it in another. If each tag is rendered independently, this might stop working. However, I don't know if this matters at all because this "feature" doesn't seem useful to hexo.

ppwwyyxx avatar Mar 30 '22 03:03 ppwwyyxx

How to test

git clone -b master https://github.com/ppwwyyxx/hexo.git
cd hexo
npm install
npm test

github-actions[bot] avatar Mar 30 '22 03:03 github-actions[bot]

  1. Will this affect rendering results? I found that in nunjucks, users can define variables in one tag and use it in another. If each tag is rendered independently, this might stop working. However, I don't know if this matters at all because this "feature" doesn't seem useful to hexo.

Although Hexo's documentation doesn't mention this behavior, I believe some sites are depending on this (each post has its own context of nunjucks) to work. If the changes have to be made, it will become a breaking change.

SukkaW avatar Mar 30 '22 07:03 SukkaW

Thanks @SukkaW for comments. I noticed another issue also related to nunjucks context, noted at https://github.com/hexojs/hexo/issues/4923#issuecomment-1082601798. So I changed this to draft as more discussions are needed on the proper solution.

ppwwyyxx avatar Mar 30 '22 09:03 ppwwyyxx

I managed to fix the abovementioned problem with a different approach (starting from top-level HTML nodes). Now all tests can pass. @SukkaW can you give it a review? Thanks.

ppwwyyxx avatar May 29 '22 04:05 ppwwyyxx

Can maintainers take a look? Thanks!

ppwwyyxx avatar Aug 26 '22 22:08 ppwwyyxx

Can maintainers take a look? Thanks!

The changes themselves look great, but I am still afraid of introducing the change would break existing sites.

Also, introducing an HTML parser to the post rendering might have created too much overhead as well.

SukkaW avatar Aug 27 '22 16:08 SukkaW

Coverage Status

coverage: 99.531% (+0.002%) from 99.529% when pulling 01ca1abde92a051ded96a698bfd1bc277bd20452 on ppwwyyxx:master into 7b588e78aae57e756e4d18bcd78e63d9dc7d34cd on hexojs:v7.0.0.

coveralls avatar Aug 27 '22 16:08 coveralls

The changes themselves look great, but I am still afraid of introducing the change would break existing sites.

The new approach in this PR seems less likely to break existing behaviors. I'm not familiar with advanced usage of tags, do you have an example of what usage might be broken?

If rendering speed or backward-compatibility is a concern, can we make this a per-post option? This would speed up rendering of posts with a lot of slow tags.

ppwwyyxx avatar Aug 28 '22 00:08 ppwwyyxx

@ppwwyyxx

I pull this PR to my local machine and run hexo server with test repository. But, it has the following error. Do you know what a problem is?

TypeError: parse5.serializeOuter is not a function
    at C:\Users\hexo\hexo\lib\hexo\post.js:435:29
    at Array.forEach (<anonymous>)
    at C:\Users\hexo\hexo\lib\hexo\post.js:433:22
    at tryCatcher (C:\Users\hexo\hexo\node_modules\bluebird\js\release\util.js:16:23)
    at Promise._settlePromiseFromHandler (C:\Users\hexo\hexo\node_modules\bluebird\js\release\promise.js:547:31)
    at Promise._settlePromise (C:\Users\hexo\hexo\node_modules\bluebird\js\release\promise.js:604:18)
    at Promise._settlePromise0 (C:\Users\hexo\hexo\node_modules\bluebird\js\release\promise.js:649:10)
    at Promise._settlePromises (C:\Users\hexo\hexo\node_modules\bluebird\js\release\promise.js:729:18)
    at _drainQueueStep (C:\Users\hexo\hexo\node_modules\bluebird\js\release\async.js:93:12)
    at _drainQueue (C:\Users\hexo\hexo\node_modules\bluebird\js\release\async.js:86:9)
    at Async._drainQueues (C:\Users\hexo\hexo\node_modules\bluebird\js\release\async.js:102:5)
    at Immediate.Async.drainQueues [as _onImmediate] (C:\Users\hexo\hexo\node_modules\bluebird\js\release\async.js:15:14)
    at processImmediate (node:internal/timers:466:21)

P.S I can provide a test repository which the error occurs.

yoshinorin avatar Oct 28 '22 15:10 yoshinorin

@yoshinorin did you install the correct version of parse5 indicated in package.json? The function does exist: https://parse5.js.org/functions/parse5.serializeOuter.html, but not in older versions.

ppwwyyxx avatar Oct 28 '22 15:10 ppwwyyxx

@ppwwyyxx

did you install the correct version of parse5 indicated in package.json?

Sorry, you're right.

The changes themselves look great, but I am still afraid of introducing the change would break existing sites. Also, introducing an HTML parser to the post rendering might have created too much overhead as well.

@SukkaW How about merge this PR once & ship with v7.0.0?

Just now I tried hexo server & hexo generate with 1400 posts (3829 files) locally, generation speeds are the same with v6.3.0. Both versions generate content with in 25 ~ 26s. (I tried 3 times cold generation)

If we will face a problem after release we just revert it and release a patch version.

yoshinorin avatar Oct 28 '22 16:10 yoshinorin

We need to benchmark on some real-world Hexo websites to evaluate the impact of generation speed @SukkaW @yoshinorin

stevenjoezhang avatar Nov 25 '22 04:11 stevenjoezhang

@stevenjoezhang

We need to benchmark on some real-world Hexo websites to evaluate the impact of generation speed

As I already wrote in https://github.com/hexojs/hexo/pull/4926#issuecomment-1295191717 I tried it locally with customized https://github.com/LouisBarranqueiro/hexo-theme-tranquilpeak (hexo-theme-tranquilpeak was a theme used in my blog.)

I think performance is enough, but I don't know what @SukkaW is concerned about. As you know @SukkaW is more knowledgeable about hexo than me. @SukkaW may notice some performance issue.

yoshinorin avatar Nov 25 '22 09:11 yoshinorin

I think performance is enough, but I don't know what @SukkaW is concerned about. As you know @SukkaW is more knowledgeable about hexo than me. @SukkaW may notice some performance issue.

What I am really worried about is the breaking change.

{{ let a = '1' }}

A lazy dog jump over a fox

{{ a }}

This should render `1`

Hexo currently supports this behavior. But after the PR, this will break (as nunjucks blocks would be rendered separately).

Unlike the other breaking changes we decided to introduce, there is no way to migrate this easily. Users would probably have to migrate hundreds of posts and end up deciding not to upgrade at all.

@yoshinorin @stevenjoezhang

SukkaW avatar Nov 25 '22 09:11 SukkaW

Perhaps we could add a new per-site option to the _config.yml to turn this on or off @ppwwyyxx

stevenjoezhang avatar Nov 25 '22 09:11 stevenjoezhang

Happy to make this optional. Shall it be a per-post option instead of per-site?

ppwwyyxx avatar Nov 25 '22 11:11 ppwwyyxx

Happy to make this optional. Shall it be a per-post option instead of per-site?

I'd prefer to have both: an option defined in _config.yml, and can opt-in/opt-out per post.

SukkaW avatar Nov 25 '22 11:11 SukkaW

I added a per-post option async_tags. I found it a bit confusing to add a site-wise option:

  • If I call the site-wise option async_tags then it's confusing because tags across posts can always be rendered async regardless of this option. This PR is only about tags in a post.
  • If I call the site-wise option something like async_tags_within_post, then it's too long and doesn't match the name of the per-post option.

I'd appreciate any suggestions on naming.

Also, I guess a site-wise option is not that important? In most cases, this option doesn't matter because I think (1) slow tags are rare (2) tags that depend on other tags are rare. So a per-post option might be already sufficient for users.

ppwwyyxx avatar Nov 29 '22 07:11 ppwwyyxx

I wonder if this can be merged!

ppwwyyxx avatar Apr 06 '23 09:04 ppwwyyxx

In the current PR the option is opt-in per-post, which means users would have to enable this option 1000 times to experience that kind of overhead for 1000 posts :) There is no overhead if users don't enable the option.

Since we have to parse HTML to know where tags are, I feel there is not much room to improve. Unless nunjucks itself can support such feature, then we don't have to parsing on hexo's side.

ppwwyyxx avatar Apr 06 '23 10:04 ppwwyyxx

How to test

git clone -b master https://github.com/ppwwyyxx/hexo.git
cd hexo
npm install
npm test

github-actions[bot] avatar Jan 12 '24 07:01 github-actions[bot]