Add segments config + --renderSegments flag

Open bep opened this issue 2 years ago • 5 comments

I'm doing some ground work now making Hugo much more effective for big data/page sets. One of the nice effects I see is that it takes Hugo much less time getting ready to render. A Hugo build basically looks like this:

  1. Read and index data.
  2. Render pages.

The flexibility of the Hugo API means that the second step always needs a full data set, even in a "partial render" situation. But "render everything" comes at a cost (writing lots of files to disk etc.), which is especially true now that we may be talking about "a million pages" and more.

There are a couple of situations where rendering only one or more segments of a site might come in handy:

  1. When rendering everything in one go is too heavy for the server (but note that the upcoming Hugo should be much more efficient in this department).
  2. When building at intervals and you have a segmented site whose segments change at very different frequencies (e.g. "render the home page and the news section only").

You could argue that Hugo should be smart about "what's changed" and render just that (which is what we're doing in server mode), and you're right, but even then it's much simpler to determine which segments have changed.

I thought we could introduce a new concept of named segments in the Site config (which we may also consider using for other things):

[segments]
[segments.hot]
[[segments.hot.matchers]]
# Need to also render the home page etc.
kind = '{home,term,taxonomy}'
[[segments.hot.matchers]]
path = '/news/**'
[segments.docs]
[[segments.docs.matchers]]
path = '/docs/**'
[segments.old]
[[segments.old.matchers]]
path = '/blog/**'

And then when building:

hugo --renderSegments hot,docs
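
To make the proposed semantics concrete, here is a minimal Python sketch of how such matchers might be evaluated. It is not Hugo's implementation. It assumes that a page belongs to a segment when any one of the segment's matchers matches, that a matcher matches when all of its fields match, and that `*` stays within one path element while `**` crosses `/`. The names `glob_to_regex`, `matcher_matches` and `page_in_segments` are made up for illustration, and only `*`, `**`, `?` and non-nested `{a,b}` are handled.

```python
import re

def glob_to_regex(pattern):
    """Translate a small glob subset (*, **, ?, non-nested {a,b}) to a regex."""
    out, i, in_braces = [], 0, False
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")          # ** crosses path separators
            i += 2
            continue
        c = pattern[i]
        if c == "*":
            out.append("[^/]*")       # * stays within one path element
        elif c == "?":
            out.append("[^/]")
        elif c == "{":
            out.append("(?:")
            in_braces = True
        elif c == "}" and in_braces:
            out.append(")")
            in_braces = False
        elif c == "," and in_braces:
            out.append("|")           # alternatives inside {a,b}
        else:
            out.append(re.escape(c))
        i += 1
    return re.compile("".join(out))

# The example config from above, written as plain Python data.
SEGMENTS = {
    "hot": [{"kind": "{home,term,taxonomy}"}, {"path": "/news/**"}],
    "docs": [{"path": "/docs/**"}],
    "old": [{"path": "/blog/**"}],
}

def matcher_matches(matcher, page):
    # Assumption: every field set on a matcher must match the page.
    return all(
        glob_to_regex(glob).fullmatch(page.get(field, "")) is not None
        for field, glob in matcher.items()
    )

def page_in_segments(page, names):
    # Assumption: a page is rendered if any matcher of any requested segment matches.
    return any(matcher_matches(m, page) for name in names for m in SEGMENTS[name])

requested = ["hot", "docs"]  # as in: hugo --renderSegments hot,docs
for page in (
    {"kind": "home", "path": "/"},
    {"kind": "page", "path": "/news/2022/launch"},
    {"kind": "page", "path": "/blog/old-post"},
):
    print(page["path"], page_in_segments(page, requested))
```

Under these assumptions the home page and everything below /news are rendered, while /blog is skipped unless the old segment is also requested.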

bep avatar Jul 19 '22 07:07 bep

@bep I love this idea! I currently have to wait about 1.5 minutes for hugo server to start (and it crashes often, so I have to keep re-running it), and would love to specify a segment so that the build is a lot quicker.

gbmhunter avatar Jul 29 '22 00:07 gbmhunter

I like the general idea of building only parts of a site!

But in the most extreme case, if I only want to rebuild one specific arbitrary page of a project, it means I'd have to define a segment for each page in the config.

Would be great to have something more "on the fly" like:

hugo --renderSegments path=/news/**,path=/docs/**

McShelby avatar Aug 12 '22 13:08 McShelby

@McShelby yea, we can certainly do that ... but if we get "named segments" working, it should be straightforward to also do "ad hoc rendering".

bep avatar Aug 12 '22 13:08 bep

In addition to this I'd find it useful to have

  • The new renderSegments work with languages too. We would love to skip rendering a whole site in another language in controlled scenarios. HUGO_DISABLELANGUAGES partially works in that regard but isn't quite its purpose. You end up losing the context/knowledge of the disabled languages, which can change the render output depending on your logic.
  • The ability to give an explicit list of files to render from. The use case would be a CI build that passes the git diff list of files that actually changed, so that Hugo renders, and we later deploy, only those output files.

Something like hugo --renderGitDiff 9b8eda3...0e87bfd could work by emulating git diff. However, we quite often clone a repo with almost no history to speed up the build time, so passing an explicit list or a file we generate elsewhere would still be valuable.
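
Until something like that exists, a CI job could approximate it with the named segments proposed above. The sketch below is only an illustration: it takes the file names printed by `git diff --name-only 9b8eda3...0e87bfd`, maps each changed content file to a segment, and builds the matching command line. The `SECTION_TO_SEGMENT` table and both function names are hypothetical, and the mapping has to be kept in sync with the site config by hand.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from top-level content sections to segment names,
# mirroring the example config in the first comment.
SECTION_TO_SEGMENT = {"news": "hot", "docs": "docs", "blog": "old"}

def segments_for_changes(changed_files, content_dir="content"):
    """Return the sorted segment names touched by a list of changed file paths."""
    segments = set()
    for name in changed_files:
        parts = PurePosixPath(name).parts
        # Only files below content/<section>/ are mapped to a segment.
        if len(parts) >= 3 and parts[0] == content_dir:
            segment = SECTION_TO_SEGMENT.get(parts[1])
            if segment:
                segments.add(segment)
    return sorted(segments)

def build_command(segments):
    """Build the hugo argument list, or None when no segment is affected."""
    if not segments:
        return None
    return ["hugo", "--renderSegments", ",".join(segments)]

# Example input, as it might come from `git diff --name-only`.
changed = ["content/news/2022/launch.md", "content/docs/setup.md", "README.md"]
print(build_command(segments_for_changes(changed)))
```

This only narrows the build to whole segments. It does not detect pages outside a changed segment that depend on the changed files.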

davidejones avatar Sep 14 '22 10:09 davidejones

> The new renderSegments work with languages too.

I agree. We kind of already have that implicit in the first comment, I think, as it uses the same "path matching" used elsewhere in Hugo, e.g.:

[segments]
[segments.exotic_languages]
[[segments.exotic_languages.matchers]]
language = '{jp,nb}'

Etc.

bep avatar Sep 14 '22 14:09 bep

Hi @bep,

I noticed that the milestone for the --renderSegments flag has been shifted multiple times. Is there a new estimated release date for this feature? It would be really beneficial for my projects and I am keen to know when it might be available.

Thanks!

tanshuai avatar Sep 28 '23 03:09 tanshuai

How would data persist to the incremental build?

Take a complete build, then an incremental build of a named segment.

Given a file is added to a segment, how do the hot pages get to know that they need to be re-rendered? Is that just something to be abstracted above named segments?

Always rebuild hot pages. Meaning all data will have to be traversed for a changed segment only when a file has been added or removed. Or link changes.. idk

Does a file have a dependency graph?

naturedamends avatar Oct 06 '23 08:10 naturedamends

2 years later and I still think this is a great idea:

To recap with a slightly adjusted config setup:

[segments]
[segments.hot]
[[segments.hot.includes]]
kind = '{home,term,taxonomy}'
[[segments.hot.includes]]
path = '{/news,/news/**}'
[[segments.hot.excludes]]
lang = '{jp,no,sv}'
[[segments.hot.excludes]]
output = '{rss,xml}'

The filter attributes, all Glob patterns, are:

  • path: This value: https://gohugo.io/methods/page/path/
  • kind
  • lang
  • output (output format, e.g. html).
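
A rough Python sketch of how these filters might combine, not Hugo's actual code. It assumes excludes win over includes, that an empty includes list means "include everything", and that all fields set on one matcher must match together. The unit being filtered is one page in one output format. `glob_to_regex`, `matches_any` and `should_render` are illustrative names, and the glob translation covers only `*`, `**`, `?` and non-nested `{a,b}`.

```python
import re

def glob_to_regex(pattern):
    """Translate a small glob subset (*, **, ?, non-nested {a,b}) to a regex."""
    out, i, in_braces = [], 0, False
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")          # ** crosses path separators
            i += 2
            continue
        c = pattern[i]
        if c == "*":
            out.append("[^/]*")       # * stays within one path element
        elif c == "?":
            out.append("[^/]")
        elif c == "{":
            out.append("(?:")
            in_braces = True
        elif c == "}" and in_braces:
            out.append(")")
            in_braces = False
        elif c == "," and in_braces:
            out.append("|")
        else:
            out.append(re.escape(c))
        i += 1
    return re.compile("".join(out))

# The "hot" segment from the config above, as plain Python data.
HOT = {
    "includes": [{"kind": "{home,term,taxonomy}"}, {"path": "{/news,/news/**}"}],
    "excludes": [{"lang": "{jp,no,sv}"}, {"output": "{rss,xml}"}],
}

def matches_any(matchers, item):
    # One matcher matches when all of its fields match; any matcher is enough.
    return any(
        all(glob_to_regex(g).fullmatch(item.get(f, "")) for f, g in m.items())
        for m in matchers
    )

def should_render(segment, item):
    # Assumption: excludes take precedence; no includes means everything is included.
    if matches_any(segment.get("excludes", []), item):
        return False
    includes = segment.get("includes", [])
    return not includes or matches_any(includes, item)

item = {"path": "/news/2024/post", "kind": "page", "lang": "en", "output": "html"}
print(should_render(HOT, item))                       # English HTML news page
print(should_render(HOT, {**item, "lang": "no"}))     # same page in an excluded language
print(should_render(HOT, {**item, "output": "rss"}))  # same page as an excluded output format
```

With these assumptions, the `{/news,/news/**}` pattern also covers the /news section page itself, which a bare `/news/**` would miss.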

bep avatar Mar 04 '24 09:03 bep

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

github-actions[bot] avatar Apr 07 '24 01:04 github-actions[bot]