Allow configuring Pandoc

Open asankah opened this issue 5 years ago • 18 comments

Add Pandoc configuration options for those situations where --mathjax alone doesn't cut it.

asankah avatar Aug 01 '20 17:08 asankah

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Aug 01 '20 17:08 CLAassistant

Needs some work, but I'd like to get an early review before I spend more time on this.

asankah avatar Aug 01 '20 17:08 asankah

Since pandoc now has citeproc built in, this is the only thing standing in the way of having citations in Hugo, which would be huge for many academic users.

AustenLamacraft avatar Sep 17 '21 08:09 AustenLamacraft

Has this been abandoned by now, or will it be merged any time soon? Should I hold off on doing my own patch of Hugo to enable MathML?

igor-krawczuk avatar Oct 06 '21 13:10 igor-krawczuk

I'm looking forward to this feature. What is going on in this pull request? Will it be merged, or is there some problem? If something is blocking it, I would be willing to contribute.

MilkClouds avatar Apr 27 '22 15:04 MilkClouds

@MilkClouds I didn't get any feedback on this PR one way or another. I can clean up my local patch and update the PR, but it would be nice to get some signal on whether we are going down the right path.

asankah avatar Apr 27 '22 17:04 asankah

@asankah yea, sorry about this, I'm not using Pandoc so it has not been at the top of my priority list. Earlier, I was more concerned about the security aspects of adding more config options to these external programs (e.g. pandoc, asciidoc). Now that we have added a security config that blocks pandoc by default, I'm more open to these additions.

Can I ask a general question before I look through these changes in detail: Do any of these options allow passing arbitrary flags directly to the pandoc binary?

bep avatar Apr 27 '22 18:04 bep

@bep Currently it does allow passing arbitrary flags, but in practice I've found it to not be necessary. So I'll remove those.

pandoc users tend to go heavy on custom filters, which obviously run under the user's identity and can pretty much do anything. So allowing custom filters tends to make the security properties fuzzy.

Would it be possible to point me at the threat model we are looking at for Hugo? A rough sketch would be enough.

asankah avatar Apr 27 '22 18:04 asankah

@asankah I have written some prose about this here: https://gohugo.io/about/security-model/

The big problem with shelling out to pandoc, asciidoc, etc. is that, from a security standpoint, they become black boxes where it is not practical to have any opinion on the security aspects beyond using the known features as described by the flags that we pass to them.

My main concern about this would be phishing attacks via malicious sites (I sometimes help people by downloading sites and building them on my computer) or modules. The pandoc config would only be available to the site config, which is a good thing.

Since pandoc is now blocked by default, I'm sleeping easier at night, but I'm still a little wary.

And my main concern is these unknown flags (all flags of pandoc are unknown to me), e.g. pandoc --compile-and-run-c-program.

If the "arbitrary flag" thing is not vital, I suggest removing it, else we can discuss it.

bep avatar Apr 27 '22 19:04 bep

@bep I don't know much about security, so I can't speak with confidence, but I don't think there are any concerns about merging this PR. Even now I can allow the use of pandoc by adding pandoc to security.exec.allow, like this:

security:
  enableInlineShortcodes: false
  exec:
    allow: ["^dart-sass-embedded$", "^go$", "^npx$", "^postcss$", "^pandoc"]
    osEnv: ["(?i)^(PATH|PATHEXT|APPDATA|TMP|TEMP|TERM)$"]
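
To make the allow list concrete, here is a rough sketch of how such patterns could be checked. It assumes the patterns are applied as unanchored regular-expression searches against the executable name; that is an assumption for illustration, not a description of Hugo's actual implementation, and the function name `is_allowed` is hypothetical.

```python
import re

# Patterns copied from the config above; note the last one has no trailing "$".
ALLOW = ["^dart-sass-embedded$", "^go$", "^npx$", "^postcss$", "^pandoc"]


def is_allowed(executable, patterns=ALLOW):
    """Return True if any pattern matches the executable name.

    Assumption: patterns are used as unanchored searches, so only the
    "^" and "$" written in each pattern pin it to the start or end.
    """
    return any(re.search(pattern, executable) for pattern in patterns)
```

Under that assumption, "^pandoc" would also admit any executable whose name merely starts with pandoc, so "^pandoc$" would be the stricter form.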

I think it is enough to let anyone who wants to use pandoc with arbitrary flags configure the security settings manually, and to alert them to the risk. Is there any specific scenario in which an arbitrary argument could threaten a site? Is it related to malicious theme authors?

MilkClouds avatar Apr 28 '22 06:04 MilkClouds

> Is there any specific scenario in which an arbitrary argument could threaten a site?

This:

pandoc --compile-and-run-c-program

As I said above, I assume pandoc does not have such a flag, but then again, I assumed that about log4j as well.

bep avatar Apr 28 '22 06:04 bep

May I suggest making a whitelist for additional flags under security, like the one currently necessary for pandoc anyway? Or is the concern that a user can be tricked into cloning a malicious git repo with a malicious config (including whitelists), running Hugo, and so getting hacked?
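
A minimal sketch of what such a flag whitelist could look like. The allow list and the function name `split_flags` are hypothetical; Hugo has no such setting, and the three flags listed are only examples of existing pandoc options.

```python
# Hypothetical allow list for this sketch; not an existing Hugo setting.
ALLOWED_FLAGS = {"--mathjax", "--mathml", "--citeproc"}


def split_flags(requested, allowed=ALLOWED_FLAGS):
    """Partition requested pandoc flags into (accepted, rejected).

    Only the flag name before any "=" is compared, so a flag such as
    "--filter=evil.py" is judged by "--filter".
    """
    accepted, rejected = [], []
    for flag in requested:
        name = flag.split("=", 1)[0]
        if name in allowed:
            accepted.append(flag)
        else:
            rejected.append(flag)
    return accepted, rejected
```

This only helps if the whitelist itself lives somewhere a downloaded site or theme cannot change, which is exactly the second question above.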

igor-krawczuk avatar Apr 28 '22 07:04 igor-krawczuk

Then would it be okay if the pandoc parameters were set through environment variables? As far as I know, a malicious theme or config can't modify our environment variables.
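
Roughly what I have in mind, as a sketch only. The variable name `HUGO_PANDOC_ARGS` and the function name are made up for this example; nothing like this exists in Hugo today.

```python
import os
import shlex

# Hypothetical variable name, chosen only for this sketch.
ENV_VAR = "HUGO_PANDOC_ARGS"


def pandoc_args_from_env(environ=None):
    """Return extra pandoc arguments taken from the environment, if any.

    The value is split with shell-like quoting rules, so an argument
    containing spaces can be wrapped in quotes.
    """
    environ = os.environ if environ is None else environ
    return shlex.split(environ.get(ENV_VAR, ""))
```

Because the value comes from the shell of the person running the build, a cloned site or theme would not be able to set it by itself.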

MilkClouds avatar May 04 '22 08:05 MilkClouds

@bep, things are fairly quiet on these Pandoc related PRs (see also #8911). What is needed still to push these across the finish line?

We rely on the flexibility of Pandoc for much of our content generation, but it doesn't do us much good if we can't configure it the way we would like to.

I believe the biggest security concern would be using Pandoc filters, which call an external script. Maybe Hugo can warn a user that a configuration is using filters and they should only proceed if they trust the source of the scripts?
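
As a rough sketch of the warning idea: the check below looks for pandoc's filter options (`-F`/`--filter` and `-L`/`--lua-filter`) in a configured argument list. The function names and the warning text are hypothetical, and the check does not catch a short option with its value attached (for example `-Ffoo`).

```python
# Pandoc options that load filter code: -F/--filter and -L/--lua-filter.
FILTER_FLAGS = ("-F", "--filter", "-L", "--lua-filter")


def find_filters(args):
    """Return the arguments that select a pandoc filter."""
    found = []
    for arg in args:
        name = arg.split("=", 1)[0]
        if name in FILTER_FLAGS:
            found.append(arg)
    return found


def filter_warning(args):
    """Build a warning message, or return None when no filter is configured."""
    found = find_filters(args)
    if not found:
        return None
    return (
        "pandoc configuration uses filters (" + ", ".join(found) + "); "
        "only proceed if you trust the source of these scripts"
    )
```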

nealkruis avatar Apr 26 '23 16:04 nealkruis

There may be a "voting with the feet" issue, @nealkruis. I imagine many of the people who wished to use pandoc as the markup format in hugo wished to use various academic features, such as citations, mathematical markup, etc., which are not well supported natively by hugo markup. Some of the effort that previously went into advocating hugo support for these "academic" pandoc features now goes into implementing missing hugo features in quarto. Quarto extends pandoc and thus includes all the typical academic features as core features, plus others such as rendering code output inline, including plots, producing PDFs and so on. We may have lost critical mass for pandoc support in hugo (which has been pandoc-resistant) in favour of an alternative which is pro-pandoc (although, of course, missing many hugo features). In particular, I believe there has been migration from the hugo-backed blogdown platform to quarto.

danmackinlay avatar Apr 27 '23 00:04 danmackinlay

> @bep, things are fairly quiet on these Pandoc related PRs (see also #8911). What is needed still to push these across the finish line?
>
> We rely on the flexibility of Pandoc for much of our content generation, but it doesn't do us much good if we can't configure it the way we would like to.
>
> I believe the biggest security concern would be using Pandoc filters, which call an external script. Maybe Hugo can warn a user that a configuration is using filters and they should only proceed if they trust the source of the scripts?

This feels like a good solution because it works on an informed consent model.

Unfortunately I am tied to hugo for institutional reasons, and "if you want pandoc, move to quarto" does not work for me. ^^'

But the way pandoc is currently implemented in hugo seems to me to be unusable; I've spent a few hours searching the documentation + forums and I have yet to find a way to officially specify even setting the -s flag on html documents, so that maths display correctly. (If I'm being blind and stupid and that is already possible in normal hugo, please let me know.)

tytyvillus avatar May 01 '24 09:05 tytyvillus

@tytyvillus

Pandoc's -s flag is not related to math rendering in any way. We pass the --mathjax flag when rendering Pandoc files, so all you need to do is load MathJax.js on the page. Please open a topic in the forum if you need help.

jmooring avatar May 01 '24 10:05 jmooring

This PR has been automatically marked as stale because it has not had recent activity. The resources of the Hugo team are limited, and so we are asking for your help. Please check https://github.com/gohugoio/hugo/blob/master/CONTRIBUTING.md#code-contribution and verify that this code contribution fits with the description. If yes, tell us in a comment. This PR will automatically be closed in the near future if no further activity occurs. Thank you for all your contributions.

github-actions[bot] avatar May 02 '25 02:05 github-actions[bot]