
Org writer: Add new org langs

Open hwiorn opened this issue 3 years ago • 3 comments

Pandoc's Org writer can't handle recent languages such as Rust and Go, so I added some of them.

  • Ref: #5440
  • Ref: https://github.com/jgm/pandoc/commit/1487ee01fd1209217de4d21e6c459ca8a1bcea36
  • Ref: https://orgmode.org/worg/org-contrib/babel/languages/index.html
    • Other programming languages that are not listed on that page can be enabled by installing ob-* modules (e.g. ob-racket, ob-rust, ob-kotlin).

hwiorn, Sep 05 '22 02:09

The intent here is to only produce language identifiers that are actually recognized by Org-mode. There is a list here https://orgmode.org/worg/org-contrib/babel/languages/index.html but you have added languages not on that list. What was the principle you used to construct your list? [Do they all come from ob-* modules?]

jgm, Sep 05 '22 02:09

@jgm org-contrib + ob-* modules + possible new languages (e.g. swift, terra)

In principle, the list of org-babel languages can be extended with code, and users can do this easily. This means that languages not on the list can still be used in org-babel source code blocks. Pandoc, on the other hand, has to be recompiled to add a language. That is why I tried to add possible new languages.

(org-babel-do-load-languages
 'org-babel-load-languages
 '((R . t)
   (kotlin . t)
   (go . t)
   (rust . t)))

I thought I needed this PR to convert notes between Markdown and Org, but the feature in #8279 seems to be the right solution.

hwiorn, Sep 05 '22 04:09

Actually I'd rather not add a command-line option. So maybe this is better than the linked issue.

Ideally pandoc would just do the right thing automatically. That's what we sought to do with the existing code (though the list of languages may be outdated). Note that, in addition to specifying a subset of allowed languages, we convert from the canonical pandoc language specifiers to the ones used by org (pandocLangToOrg); e.g. pandoc allows c and org wants C, org wants lisp instead of commonlisp, etc.
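The two steps described here (renaming pandoc's identifier to Org's, then checking it against the supported list) can be illustrated with a minimal Python sketch. This is not pandoc's actual Haskell code: the tables `PANDOC_TO_ORG` and `ORG_LANGUAGES` and the function `org_language` are hypothetical names, and their entries are only a small illustrative subset.

```python
from typing import Optional, Sequence

# Hypothetical, deliberately tiny tables; the real lists are longer.
PANDOC_TO_ORG = {"c": "C", "commonlisp": "lisp"}  # pandoc name -> Org name
ORG_LANGUAGES = {"C", "lisp", "python", "R"}      # names Org is assumed to accept


def org_language(classes: Sequence[str]) -> Optional[str]:
    """Return the Org name of the first class that Org recognizes, else None."""
    for cls in classes:
        # Rename first (e.g. "c" -> "C"), then check the allow-list.
        name = PANDOC_TO_ORG.get(cls, cls)
        if name in ORG_LANGUAGES:
            return name
    return None
```

With these tables, `org_language(["c"])` gives `"C"`, while `org_language(["rust"])` gives `None`, which corresponds to the case where the language is dropped from the output.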

I'm open to broadening the list of languages in the way this PR does, but maybe not to "possible new languages." I'd rather stick with what is supported.

jgm, Sep 05 '22 06:09

> Ideally pandoc would just do the right thing automatically. That's what we sought to do with the existing code (though the list of languages may be outdated). Note that, in addition to specifying a subset of allowed languages, we convert from the canonical pandoc language specifiers to the ones used by org (pandocLangToOrg); e.g. pandoc allows c and org wants C, org wants lisp instead of commonlisp, etc.

I like it :)

> I'm open to broadening the list of languages in the way this PR does, but maybe not to "possible new languages." I'd rather stick with what is supported.

I agree with this. I realized that adding "possible new languages" is the wrong approach.

I noticed that Emacs doesn't consult org-babel-load-languages when fontifying an Org source block. https://orgmode.org/manual/Languages.html

> By default, only Emacs Lisp is enabled for evaluation. To enable or disable other languages, customize the org-babel-load-languages variable either through the Emacs customization interface, or by adding code to the init file as shown next.
>
> In this example, evaluation is disabled for Emacs Lisp, and enabled for R.
>
> (org-babel-do-load-languages 'org-babel-load-languages '((emacs-lisp . nil) (R . t)))

org-babel-load-languages controls evaluation, not fontification. Emacs always tries to use the language named in the source block header, even if a mode for that language is not installed. For example, there is no org-babel support for markup and configuration languages such as Markdown, HTML, YAML, TOML, Dockerfile, and CMake, yet Emacs can still edit and fontify them. Pandoc, however, ignores such languages and emits an example block instead when converting Markdown to Org.

https://orgmode.org/manual/Editing-Source-Code.html

> org-src-lang-modes: If an Emacs major-mode named <LANG>-mode exists, where <LANG> is the language identifier from code block’s header line, then the edit buffer uses that major mode. Use this variable to arbitrarily map language identifiers to major modes.
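The lookup rule quoted above can be sketched in Python, purely as an illustration of the logic and not of how Emacs implements it. The function `edit_buffer_mode`, the `lang_modes` mapping (standing in for `org-src-lang-modes`), and the `available_modes` set are all assumptions made for this example.

```python
from typing import Mapping, Optional, Set


def edit_buffer_mode(
    lang: str,
    lang_modes: Mapping[str, str],
    available_modes: Set[str],
) -> Optional[str]:
    """Pick a major mode for a source block language, or None if none exists."""
    # An explicit mapping wins; otherwise the identifier itself is used.
    base = lang_modes.get(lang, lang)
    mode = f"{base}-mode"
    return mode if mode in available_modes else None


# Sample data for illustration only.
sample_mapping = {"cpp": "c++"}
sample_installed = {"c++-mode", "yaml-mode"}
```

Here `edit_buffer_mode("yaml", sample_mapping, sample_installed)` resolves to `"yaml-mode"` without any Babel support for YAML, which is the point being made: the block's language identifier is useful to Emacs even when the language cannot be evaluated.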

The problem is that if the original document contains code examples in a language pandoc does not support, those blocks lose their original source-code form in the converted document.

So I thought pandoc should have an option for this, since pandoc is not a document evaluator. Or is there any chance of changing pandoc's behavior so that it does not ignore the language?

Markdown to Org

| Markdown | Org |
| --- | --- |
| `` ``` ... ``` `` | `#+BEGIN_EXAMPLE ... #+END_EXAMPLE` |
| `` ```c ... ``` `` | `#+BEGIN_SRC C ... #+END_SRC` |
| `` ```commonlisp ... ``` `` | `#+BEGIN_SRC lisp ... #+END_SRC` |
| `` ```any-lang ... ``` `` | `#+BEGIN_SRC any-lang ... #+END_SRC` |

hwiorn, Sep 23 '22 02:09

We could change things so that the first class is always treated as a language name, even if it's not in the table.
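A minimal Python sketch of this proposal, assuming the renaming step (`c` to `C`, `commonlisp` to `lisp`) is kept and only the allow-list check is dropped. The function `org_code_block` and the table `PANDOC_TO_ORG` are hypothetical names; pandoc itself is written in Haskell, and this is not its code.

```python
from typing import Sequence

# Hypothetical subset of the pandoc-to-Org renaming table.
PANDOC_TO_ORG = {"c": "C", "commonlisp": "lisp"}


def org_code_block(code: str, classes: Sequence[str]) -> str:
    """Render a code block, always treating the first class as the language."""
    if not classes:
        # No class at all: there is no language to keep.
        return f"#+BEGIN_EXAMPLE\n{code}\n#+END_EXAMPLE"
    # Rename when a mapping exists; otherwise pass the name through unchanged.
    lang = PANDOC_TO_ORG.get(classes[0], classes[0])
    return f"#+BEGIN_SRC {lang}\n{code}\n#+END_SRC"
```

Under this rule a block tagged `any-lang` becomes `#+BEGIN_SRC any-lang`, and only a block with no class at all is written as an example block.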

jgm, Sep 23 '22 02:09

Thank you for the response and your work :)

hwiorn, Sep 23 '22 03:09