Add a custom syntax highlight file to support new R 4.0 pipe
This PR closes #2196 and is a better way to solve rstudio/bookdown#1157. It reverts 99df4e7 and #2228 by providing our own syntax definition file. Also this will solve the current issue described in #2282 as it removes the faulty post processor.
With the custom syntax definition file |> and => are now identified as Special operators.
It will be supported for Pandoc 2.0+ only but I figured out that it was not worth keeping the other trick for earlier Pandoc version to work.
I added that in the "Tweak Pandoc args" part in render() so that it will be added to all formats , even those for which highlighting is not supported. It seems Pandoc does not warn or error but just will not use the value.
We have two other option to be more conservative:
* Add the argument in each formats as some others. This would a mechanism like for the Lua filter but the syntax definition file seemed more generic
* Add a if clause to test output_format$pandoc$to against supported format in pandoc
I don't know what is the safest. This change impacts possibly all formats so we should be careful in our testing but probably also the implementation.
Obviously next step is to update upstream KDE repo with this file so that it ends up in a future Pandoc version where we'll be able to remove this addition.
What is your take on this ?
BTW I added comments in r.xml file to identify the added lines.
Ok so only relying in Pandoc for this may not be a good idea at least to cover all versions. This is because Pandoc is buggy - We can see that in the CI results passing depending of the Pandoc versions.
I need check which version works. EDIT: maybe an issue in my file. We'll see next CI run
Following that, I tested on Windows with earlier version too. And... there is a bug on Windows before Pandoc 2.15 (https://github.com/jgm/pandoc/issues/6374 fixed by https://github.com/jgm/pandoc/commit/5a1bd526776a9c75a1f348f42415d15d78969e8b). This is unfortunate.
So not sure this is at the end a good idea to do that - maybe it is better to update upstream and wait for Pandoc to get the file 🤔
Older version of Pandoc requires a specific file it seems https://github.com/jgm/pandoc/issues/5328 It is not needed anymore in Pandoc 2.7 https://pandoc.org/releases.html#pandoc-2.7-2019-03-03 but required before. I am trying to apply the same fix to see if we can support more version
Previous fix seems ok for most versions but it seems like 2.0.0.1 that we test will be buggy anyway because of missing xml files.
So for this to work we need to decide what to do regarding Pandoc versions:
- Either not do this at all and abandon the idea of this PR. We could then keep the currently
- Either do this for Pandoc 2.15 and above so that it works on Windows.
We could keep the current trick in both case for all or older version - but #2282 will need fixing.
Support several Pandoc's version is really not easy - this limits us a lot in what to do 😓
Thought on this @yihui ?
One thing for sure is that we have to submit a merge request to KDE at some point. This has to be fixed upstream anyway. The sooner the better.
yes I am onto that. Still trying to get acquainted on how to PR into KDE repo following there guideline.
I think it's okay to ignore lower versions of Pandoc since this syntax highlighting issue is mostly a cosmetic issue
I think to. I was going to only support for Pandoc 2.15 and later - 2.17 will possibly be the next version bundle in the IDE.
However, current version on the IDE is 2.14 so if we do the above, should we still keep the hack and we have fixed the issue for HTML (the hack / fix for PDF is not currently working due to #2282).
I'll rework the PR based on this, and we'll see how complicated it will still be.
Perhaps we just wait for the IDE to upgrade Pandoc (to be >= 2.15) before we rework this PR?
Yes we can. I think next release is scheduled in February. We could sync the release. We probably have #2289 to deal with also for next IDE version if they ship 2.17 indeed.
I've just cleaned and prepare to support 2.15+. Let's leave this as is for now.
@yihui this becomes more useful now as Quarto has come up with a modified version of markdown to support chunk syntax https://github.com/quarto-dev/quarto-cli/commit/83e587ef546b591106fdaf90a11ad5c248a951d0
I'll try to make this into Pandoc, but not sure if / when. We could add this xml file into rmarkdown also.
---
title:
output:
html_document:
highlight: zenburn
pandoc_args: ["--syntax-definition", "markdown.xml"]
---
```{r setup, include=FALSE}
xfun::download_file("https://raw.githubusercontent.com/quarto-dev/quarto-cli/83e587ef546b591106fdaf90a11ad5c248a951d0/src/resources/pandoc/syntax-definitions/markdown.xml")
```
This is an example highlighted correctly
````{verbatim, language = "markdown"}
# Header
This chunk will be **visible**
```{r, echo = TRUE}
mean <- function(x, y) {
(x + y) / 2
}
```
```python
while a < 10:
print(a)
a, b = b, a+b
```
````
Without the file

With the file

So support syntax file addition could be a good idea as of now.
Yes, that looks very nice indeed!
I haven't figure yet what is going on with Windows build but this definitely creates an issue on windows... An indication that this is not a good idea to make this change, or make it simpler ? 🤔
I am looking into that
Seems like tmate debugging windows is no more working, and I can't reproduce reliably on my Windows. So I wonder what could be the issue.
Anyway, I wanted to push the idea of adding bundled syntax file in the package. I wrote things here to be generic and apply to any XML find we would add. At the end, it became kind completed to make it generic (to avoid adding that per individual output format). Not sure if that worth it, as updating upstream could always be the better solution. 🤔
The error "no file found" should be from system.file(mustWork = TRUE). I can take a closer look and see if I can figure it out.
😲 Thanks for 6ad179976555c99df3ab7ae4efbce41fe84eb5e3. I spent time on this without even trying this ....🤦
It is really helpful to have another look sometimes! 😅
PR is ok but I'll still run a few rendering tests of different format before merging.