panflute
Pre and Post Filters on Autofilter functionality
Hi guys, currently I'm running several filters, and I needed to create my own metadata value named panflute-filters-pre. I use that variable to receive a list of filters and prepend it to the panflute-filters variable. Right now a separate filter does this, but I was thinking that we could build it into panflute's own functionality, with both panflute-filters-pre and panflute-filters-post.
If you give me a green light on this, I can write it and submit the PR.
I might have missed something, but isn't it already doable as is?
- filters are run in order, i.e. the 1st filter in the list runs first, effectively the "pre"-filter
- a filter can mutate the doc object, which effectively allows you to modify the panflute-filters metadata
Did it not work when you tried to solve the problem like this? If so, maybe the autofilter function is "copying" the panflute-filters list and doesn't respect the update? Maybe the solution is to make it do so, rather than to create 2 more keys?
The way I'm working with this right now is with a pandoc command like pandoc -F pre-panflute.py -F panflute, where pre-panflute.py runs the following function.
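def finalize(doc):
    # Read the three lists; each defaults to empty when the key is missing.
    filters = doc.get_metadata("panflute-filters", [])
    prefilters = doc.get_metadata("panflute-filters-pre", [])
    postfilters = doc.get_metadata("panflute-filters-post", [])
    # Final order: pre filters, then the document's own, then post filters.
    filters = [*prefilters, *filters, *postfilters]
    doc.metadata["panflute-filters"] = filters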
This gives panflute (which runs as the second filter) its usual list of filters to autorun, while letting me fix the order of the filters beforehand.
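To make the resulting order concrete, here is a minimal sketch of the same merge using plain Python lists instead of a panflute doc. The filter names are hypothetical, and the as_list helper is an assumption added so that a single string or a missing key is handled as well.

```python
def as_list(value):
    """Normalize a metadata value that may be missing, a string, or a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def merge_filters(pre, local, post):
    """Return the run order: pre filters, then local ones, then post filters."""
    return [*as_list(pre), *as_list(local), *as_list(post)]


# Hypothetical filter names, standing in for values read from metadata.
order = merge_filters(
    pre=["include_files.py"],
    local="toc.py",  # a single string is accepted too
    post=["normalize_links.py"],
)
print(order)
```

The order inside each list is preserved, so the only decision the merge makes is that the pre list goes first and the post list goes last.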
Why am I doing this?
I'm using pandoc and panflute as my static site builder (with a couple of makefiles). I have some filters that apply to all pages (and run at the end) and some that apply only to specific pages. So the approach I'm taking right now is to configure the global filters in a metadata.yaml file, putting what I need in panflute-filters-post or panflute-filters-pre, and leaving the panflute-filters definition in the document to include whatever filters I need for that particular document.
But even in a different case, like a filter that finds and includes local file references (lists of links and such), you would not have to worry about those and could leave the link-normalizing step to a post filter. For instance, you could insert local references to markdown files and, when the output is html, have a filter that runs post and converts the links to .html instead of .md.
The main reason I'm proposing this is that I don't see it hurting, and it's just an additive change.
My first question would be: does it work to put pre-panflute.py as the first (and only) entry in panflute-filters?
And my reaction is that it seems this shouldn't be in panflute by default. It is just like pandoc, whose default metadata/args options aren't the most general but are designed with single-document generation in mind.
But funny you should mention static site generators, as I have been thinking over the last 2 hours about whether I should build a static site generator centered around pandoc and panflute. I have been using a few site generators, including a custom makefile for simple cases, but I find them lacking and not "native-pandoc" enough for me. If I write one, it would be a Python solution with some other dependencies, including some way to get make-like dependency generation and automatic parallelization (basically the advantages of my make workflow, but addressing some of its limitations and borrowing some concepts from other site generators).
No, because once the autofilter starts running, the queue of filters is already defined and in place. Changing the value of the panflute-filters variable would have no effect on the filter queue that is currently running.
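The behavior described here can be illustrated with a toy model. This is not panflute's actual code, only a simplified sketch that assumes the filter list is copied once before any filter runs; the metadata is a plain dict and the filter names are hypothetical.

```python
def run_snapshot(metadata, registry):
    """Toy model: the filter list is read once, before any filter runs."""
    queue = list(metadata.get("panflute-filters", []))  # copy taken up front
    executed = []
    for name in queue:
        registry[name](metadata)  # a filter may edit the metadata...
        executed.append(name)     # ...but the queue was already fixed
    return executed


def first(metadata):
    # Tries to schedule another filter by editing the metadata mid-run.
    metadata["panflute-filters"] = [*metadata["panflute-filters"], "late.py"]


def late(metadata):
    metadata["late-ran"] = True


metadata = {"panflute-filters": ["first.py"]}
registry = {"first.py": first, "late.py": late}
executed = run_snapshot(metadata, registry)
print(executed)                      # only the filters from the snapshot
print(metadata["panflute-filters"])  # the metadata itself did change
```

In this model the metadata ends up listing late.py, yet late.py never runs, which is the effect described above.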
I'm currently polishing my project, I might upload it to GitHub during the weekend.
Right, that was what I meant in the beginning: shouldn't we change this behavior so that panflute-filters can be mutated, rather than adding 2 more keys?
OK, it sounds doable, but we would need to change the way autofilter does the filter lookup and filter execution. Also, someone could write a filter that calls itself infinitely, like
filters = doc.get_metadata("panflute-filters", [])
filters = [*filters, "this_filter.py"]
doc.metadata["panflute-filters"] = filters
And I guess this is not desirable. I've looked at how docutils works, and they do something like what you mentioned.
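One possible guard against that loop, sketched as a toy model: re-read the list after every filter, but run each filter name at most once. This is an assumption about how a dynamic lookup could be made safe, not how panflute or docutils actually do it; run_dynamic and max_runs are hypothetical names.

```python
def run_dynamic(metadata, registry, max_runs=50):
    """Toy model: re-read the filter list after each filter, run each name once."""
    seen = []
    while len(seen) < max_runs:  # hard cap as a second safety net
        current = metadata.get("panflute-filters", [])
        pending = [name for name in current if name not in seen]
        if not pending:
            break
        name = pending[0]
        seen.append(name)
        registry[name](metadata)
    return seen


def this_filter(metadata):
    # Same pattern as the snippet above: the filter appends itself.
    metadata["panflute-filters"] = [*metadata["panflute-filters"], "this_filter.py"]


metadata = {"panflute-filters": ["this_filter.py"]}
executed = run_dynamic(metadata, {"this_filter.py": this_filter})
print(executed)
```

The trade-off is that a filter can no longer be scheduled twice on purpose, which may or may not be acceptable.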
I thought more about it, and I now think that mutating panflute-filters may not be a good direction. My summary of the situation:
- the problem is about how to specify panflute-filters in different ways, and how their order should be resolved. In your case you have a "local" panflute-filters list and a "global" panflute-filters list.
- your current solution (pandoc -F pre-panflute.py -F panflute) requires an additional filter. So any solution that requires you to write another filter doesn't exactly solve your problem, i.e. even if panflute-filters is dynamically resolved, it doesn't fully solve your problem.
  - By the way, in your pre-panflute.py, if you call autofilter.stdio directly, you can avoid calling 2 filters in pandoc and so avoid converting the AST twice, i.e. you don't need to modify the metadata at all if you're writing a filter to perform this.
About resolving the local and global panflute-filters lists:
- What worries me most about adding a solution to panflute directly is its generality. The most general case is merging the 2 lists in an arbitrary insertion order, of which yours is a special case (i.e. each list only specifies the order of execution within itself, but there is no information about which local filter has to run first, even before the global ones, etc.)
- But maybe that is not so important. Maybe in practice the global ones should run either before those in the YAML or after. In that case, we are asking what a reasonable approach is for passing a global option to a pandoc filter (where pandoc doesn't allow passing command-line args), and the most straightforward solution would be env vars. Would adding two env vars with names similar to panflute-filters-pre and panflute-filters-post satisfy you?
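The env var idea could look roughly like the sketch below. The variable names PANFLUTE_FILTERS_PRE and PANFLUTE_FILTERS_POST and the comma-separated format are assumptions made for illustration; nothing in this thread fixes them, and panflute does not read them today as far as this discussion shows.

```python
import os


def filters_from_env(name, environ=os.environ):
    """Parse a comma-separated list of filter names from an env var."""
    raw = environ.get(name, "")
    return [item.strip() for item in raw.split(",") if item.strip()]


def resolve_filters(local, environ=os.environ):
    """Wrap the document's own filters with the global pre/post lists."""
    pre = filters_from_env("PANFLUTE_FILTERS_PRE", environ)    # hypothetical name
    post = filters_from_env("PANFLUTE_FILTERS_POST", environ)  # hypothetical name
    return [*pre, *local, *post]


# A plain dict stands in for os.environ so the example is reproducible.
fake_env = {
    "PANFLUTE_FILTERS_PRE": "include_files.py",
    "PANFLUTE_FILTERS_POST": "normalize_links.py, minify.py",
}
resolved = resolve_filters(["toc.py"], fake_env)
print(resolved)
```

With this shape, a Makefile could export the two variables once and every pandoc call would pick up the same global filters, while each document keeps its own panflute-filters list.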
By the way, just so I am not misunderstanding your setup: your metadata.yaml is prepended to your markdown file so that pandoc resolves the 2 metadata blocks (one from the YAML, one from the block in the markdown). Correct? i.e. pandoc metadata.yaml some-file.md ...?
The extra metadata files are added in the following way:
pandoc \
-F panflute-pre.py \
-F panflute \
--metadata-file=metadata_1.yaml \
--metadata-file=metadata_2.yaml \
--metadata input_file="${FILE}" \
-i "${FILE}"
In order to make my static site work, I inject some values calculated by my Makefile, but in general terms, the idea is that we have several metadata files, one per directory.
This lets you have one global file holding metadata like your Google Analytics ID, with the placeholder for it in your template file, and that's it. It also lets you include CSS at a global scale.
These metadata files get overwritten from left to right, meaning that if
- ${FILE} has a value for css
- metadata_1.yaml has a value for css
- metadata_2.yaml has a value for css
then the value that wins is the one in metadata_2.yaml. And if you pass the value via the CLI, then the CLI one takes precedence.
In reality, I'm using more than 2 metadata files plus the metadata block on the document itself.
By the way, in your pre-panflute.py, if you call autofilter.stdio directly, you can avoid calling 2 filters in pandoc to avoid converting the ASTs twice. i.e. you don't need to modify the metadata at all if you're writing a filter to perform this.
Good point, didn't think about that.