mdBook
mdBook copied to clipboard
Opt-in Interface for Preprocessors to Preprocess SUMMARY.md
Problem
The current interface for mdbook preprocessors is that the preprocessor accepts a serialized (PreprocessorContext, Book) pair over stdin, and it returns a new Book by writing to stdout. In order to produce the context/book pair, the main mdbook process needs to read and process SUMMARY.md, which means that preprocessors can't easily be applied to SUMMARY.md, which limits some potentially interesting use-cases for preprocessors.
As a concrete example, in mdbook-minijinja I've written a preprocessor that evaluates book contents as minijinja templates. One of the use-cases I'm interested in for this preprocessor is the ability to conditionally enable or disable certain chapters based on runtime information (e.g., to render different versions of my book for different audiences).
I'm currently able to support a limited amount of SUMMARY.md preprocessing by loading the summary myself, evaluating it as a template, and then re-parsing it using parse_summary and doing a crude diff between the resulting summary and what I get over stdin. This works surprisingly well for simple use-cases like conditionally-enabling and disabling chapters, but it's brittle and doesn't match the mental model the user would likely want to have (for example, this method doesn't work if I want template evaluation to add new chapters that aren't statically in the SUMMARY.md source).
Proposed Solution
I think the ideal solution for my use-case would be for there to be an interface by which my preprocessor could opt in to being invoked on SUMMARY.md prior to the summary being used to construct the rest of the book content. This would essentially add a new "phase" to the mdbook loading scheme. Currently, my rough mental model of how mdbook works is:
- mdbook starts
- mdbook loads configuration from
book.toml, including information about configured preprocessors and renderers. - mdbook loads
SUMMARY.md - mdbook parses
SUMMARY.md - mdbook loads book content based on parsed summary.md.
- mdbook feeds book content through all preprocessors.
- mdbook passes fully preprocessed book content to renderers.
What I would propose would be to introduce a new step between (3) and (4), in which mdbook would feed the loaded-but-not-yet-parsed content of SUMMARY.md to a new subcommand preprocess-summary command of the preprocessor (similar to the existing supports subcommand today).
Notes
I suggested above that we add a new preprocess-summary subcommand. For backwards-compatibility reasons, this command would probably want to be opt-in: the existing templates for preprocessors seem like they generally check if they're being invoked via the supports subcommand, and otherwise assume they're being invoked for the main preprocessing command. Invoking a new preprocess-summary subcommand would likely break existing preprocessors. I think a straightforward workaround for this would be to make participation in the new method opt-in via a flag in the preprocessor's book.toml section, e.g.:
[preprocessor.minijinja]
preprocess-summary = true
mdbook would only invoke the preprocess-summary command for preprocessors that opt-in by instructing their users to set this flag. You could also spell this as something like a top-level summary-preprocessors list or something like that.
I'd be happy to work on a more concrete proposal/interface for this feature if there's interest in this general direction.
Sorry, I'm not quite following the request here. Preprocessors can add/remove chapters. The structure of SUMMARY.md is essentially the structure of the Book. Is it possible for your preprocessor to use that information to add/remove chapters?
The essential problem is the same as what's outlined by @Michael-F-Bryan in this comment: https://github.com/rust-lang/mdBook/issues/760#issuecomment-765065453
This probably isn't possible. Preprocessors are only run on the chapters of a book once they've been loaded into memory, by which time the SUMMARY.md has already been parsed (it's how we find the chapters to give the preprocessor, after all).
My goal with this proposal is to allow SUMMARY.md itself to be an arbitrary minijinja template. For example, I'd like to be able to write something like:
# My Title
{% if enable_section_1 %}
{% include "section1_summary.md" %}
{% endif %}
...
where section1_summary.md might be a file in a templates directory like:
## Section 1
- [Chapter 1](./section1_chapter1.md).
- [Chapter 2](./section2_chapter2.md).
I don't think there's a good way to do this with the current preprocessor API, because by the time my preprocessor receives any input from mdbook, mdbook has already loaded and parsed the summary, which will likely fail (or at least produce unhelpful results).
The proposal here is to add a step that allows preprocessors to receive the unparsed content of SUMMARY.md, apply transformations to it, and return the result over stdout. mdbook would only attempt to parse SUMMARY.md and load chapter contents after this initial "summary preprocessing" step.
I've made some updates to mdbook-minijinja that solves some of the problems here. I've made it so that my preprocessor reloads SUMMARY.md, evaluates it as a template, and then re-parses the summary and reloads the book using MDBook::load_with_config_and_summary.
The main downside of this approach is that reloading the book contents discards the output of any preprocessors that came before mine, so I'm currently making summary preprocessing optional and documenting that people should configure mdbook-minijinja to run as the first preprocessor if they enable summary preprocessing. I expect that this should be good enough for the majority of use-cases I'm aiming to support, but it does seem a bit unfortunate that the workaround breaks composability with other preprocessors.
would be nice to see something like this (or a different approach) that allows globbing directories in SUMMARY.md