panflute
replace_keyword skips over cites
I want to replace keywords everywhere they occur. See my example:
# This is the heading {#sec:alpha}
This is ref 1 = @sec:alpha
This is ref 2 = {@sec:alpha}
This is ref 3 = [@sec:alpha]
This is ref 4 = [{@sec:alpha}]
Then I run 'pandoc test.md -o test.pdf --filter ../filters/headxref.py'. In brief, the filter finds all those tags, associates each with its header title, and runs replace_keyword on the document, replacing, e.g., "@sec:alpha" with "This is the heading".
This yields the following in the output:

So why is it skipping Cite blocks?
With .replace_keyword(), Panflute walks over all elements and replaces the Str() elements where .text exactly matches your input. This means that:
- If you have a text "abcde", then replacing "bcd" will not change anything.
- Other attributes of the element are not replaced (such as the url attribute of Link() objects or, in your case, the .id attribute of Citation()).
For instance, in ref 1, @sec:alpha is interpreted by pandoc as a citation object:
[Cite
[Citation
{citationId = "sec:alpha",
citationPrefix = [],
citationSuffix = [],
citationMode = AuthorInText,
citationNoteNum = 1,
citationHash = 0}]
[Str "@sec:alpha"]
]
And the filter modifies the contents of the Str object (but not the citationId!).
Now, in ref4:
Cite
[Citation
{citationId = "sec:alpha",
citationPrefix = [Str "{"], citationSuffix = [Str "}"],
citationMode = NormalCitation,
citationNoteNum = 1,
citationHash = 0}]
[Str "[{@sec:alpha}]"]
]
You see that the Str object is actually equal to "[{@sec:alpha}]", so nothing changes.
Extending the replace_keyword() function to match substrings is not that difficult though, and it would involve changing just two lines:
https://github.com/sergiocorreia/panflute/blob/43582ccbf53bb2fc370ffd471080c5c34f28fd22/panflute/tools.py#L465 https://github.com/sergiocorreia/panflute/blob/43582ccbf53bb2fc370ffd471080c5c34f28fd22/panflute/tools.py#L473
If there is demand, we can allow partial matches or, even better, regexes (though those will of course be slow on large documents).
Follow up question: It seems that replacing the Str text would still leave the citation object, which would be seen by a crossref or citation filter, e.g., citeproc or crossref. Is there a trivial way to replace the whole citation with a Str?
Not with .replace_keyword(), but you can set up a filter that looks for Cite elements and then replaces the element as needed if its contents match the keyword. A bit more cumbersome of course.
Longer term, it might be useful to have a more powerful replace_keyword, if there is enough demand for it.