Add list of "supported languages to syntax-highlight raw text" in documentation

Open Andrew15-5 opened this issue 1 year ago • 14 comments

In the documentation of text(), there is only a note about the Typst language. The raw element mentions Rust and Typst. In read(), HTML is used as an example. But I couldn't find a list of all supported languages that can be syntax-highlighted. It would be a very good idea to add one to the documentation. It should look like this:

Table of supported languages that can be syntax-highlighted in the raw text

| Language | Argument values | Notes |
| --- | --- | --- |
| HTML | "html" | |
| LaTeX/TeX | "latex", "tex" | |
| Rust | "rust" | |
| TypeScript | "typescript", "ts" | |
| TypeScript React | "typescriptreact", "tsx" | It is an extended version of TypeScript and therefore can be used for non-React TypeScript code |
| Typst code | "typc" | |
| Typst markup | "typ" | |

Notes:

  • I don't understand why there are 2 different types of "Typst" language (it almost certainly must be addressed in the documentation where they are mentioned);
  • if languages have the exact same rules and the exact same theme style, then they should go in the same row (LaTeX/TeX, although I'm not 100% sure whether they are aliases for the same thing);
  • put them in alphabetical order (of course);
  • add syntax highlighting to the strings which are put under the "Argument values" column;
  • in the "Argument values" column, sort acceptable values from longest to shortest and then in alphabetical order;
  • the "Notes" column should contain helpful notes if needed;
  • the "Notes" column is very good for readability and accessibility, but it could instead be presented as "hover over" information;
  • I want the documentation to also mention a nifty thing like file_path.split(".").at(-1) which can be used (I hope in all cases) as an argument for the lang parameter. It simply extracts a file extension, which can be used as a language identifier.
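
The extension trick from the last bullet can be sketched as a small helper. `raw-file` is a hypothetical name, not part of Typst, and the approach only works when the extension happens to be one of the recognized tags; a file name without a dot is passed through whole.

```typ
// Hypothetical helper: show a file as a raw block and guess the
// language tag from the file extension.
#let raw-file(path) = raw(
  read(path),                    // file contents as a string
  lang: path.split(".").at(-1),  // "src/main.rs" -> "rs"
  block: true,
)

#raw-file("main.rs")  // assumes main.rs sits next to this document
```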

Was mentioned in https://forum.typst.app/t/is-there-a-prism-language-definition-for-typst/914/2.

Andrew15-5 avatar Jun 16 '23 16:06 Andrew15-5

> I don't understand why there are 2 different types of "Typst" language (it almost certainly must be addressed in the documentation where they are mentioned);

As the name suggests, it's to distinguish between markup and code in Typst. For example, consider the following code snippet: Hi this is _a_ test. This is in *bold*.

[Screenshot: the snippet highlighted twice, first as Typst code, then as Typst markup]

In the first case, it is interpreted as Typst code, which is why the `in` is highlighted so weirdly (it's a keyword), while in the second case it is interpreted as markup.
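
The screenshot itself is not reproduced here. A minimal sketch of the comparison, assuming the same sentence is placed in two raw blocks that differ only in their language tag:

````typ
// Interpreted as code: `in` is treated as a keyword.
```typc
Hi this is _a_ test. This is in *bold*.
```

// Interpreted as markup.
```typ
Hi this is _a_ test. This is in *bold*.
```
````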

LaurenzV avatar Jun 16 '23 21:06 LaurenzV

But a Typst file always contains both! (I'm just a beginner, but I'm pretty sure about this; at least it applies to document files, if not to libraries and such.) How do you propose to deal with that? Which lang should I use?

It will probably be more difficult to combine the two, but it should still be doable.

Andrew15-5 avatar Jun 16 '23 21:06 Andrew15-5

> Which lang should I use?

When editing a Typst file, there are three different modes you can be in:

  • Markup mode is the default,
  • Code mode lets you define variables, functions and call them,
  • Math mode lets you write math equations.

You can change mode at any point using the following table:

| From | To | How? |
| --- | --- | --- |
| Markup / Math | Code | Prefix with # |
| Markup | Math | Surround with $ |
| Code | Markup | Brackets ([markup]) |
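
The rows of the table can be sketched in a short document. The variable names `side` and `note` are made up for illustration.

```typ
// Markup mode is the default.
This is *markup*.

// Markup -> Code: prefix with `#`.
#let side = 3

// Markup -> Math: surround with `$`.
// Math -> Code: prefix with `#` inside the equation.
The area is $#side^2$.

// Code -> Markup: wrap in brackets.
#let note = [A _content_ block with side #side.]
#note
```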

The two Typst "languages" simply correspond to different modes: "typ" is the default mode (i.e., markup), while "typc" puts you in code mode by default. Of course, you can always switch mode within those "languages" using the table above.

Basically, "typ" means "this is Typst markup (which may contain code within it, but we will be able to tell, and highlight it accordingly, because it will be prefixed with #)," and "typc" means "this is Typst code (which may contain markup within it, but we will be able to tell, and highlight it accordingly, because it will be bracketed)."

Most of the time, you likely want to use "typ". The only time you want to use "typc" is if you want to be in code mode by default.

TL;DR. Use "typ".

MDLC01 avatar Jun 26 '23 11:06 MDLC01

Great answer, thanks!

But this was enough for me:

> Most of the time, you likely want to use "typ". The only time you want to use "typc" is if you want to be in code mode by default.

Andrew15-5 avatar Jun 26 '23 11:06 Andrew15-5

The original question has not been answered, namely which languages are supported by raw(), which is something I have been wondering about for a while too.

ondohotola avatar Nov 22 '23 07:11 ondohotola

Here's a list extracted from the web app's autocompletion. We could probably autogenerate this list in the docs via the compiler's Reflect mechanism. The main reason lang currently is a plain string instead of a more specific type is that arbitrary languages are allowed, just not recognized unless you handle them yourself in a show rule. But we could also list these literals and string in the Reflect impl.
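
As a rough sketch of what "handle them yourself in a show rule" could look like: `mylang` is a made-up tag, and the two keywords are hypothetical. This is only one possible approach, not a description of how the built-in highlighting works.

````typ
// `mylang` is a made-up tag with no built-in syntax definition.
#show raw.where(lang: "mylang"): it => {
  // Color two hypothetical keywords inside matching raw blocks.
  show regex("begin|end"): set text(fill: blue)
  it
}

```mylang
begin
  do something
end
```
````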

List of languages
ActionScript: as or actionscript
Ada: adb, ads, gpr,  or  ada
Apache Conf: envvars, htaccess, HTACCESS, htgroups, HTGROUPS, htpasswd,  or  HTPASSWD
AppleScript: applescript
AsciiDoc (Asciidoctor): adoc, ad,  or  asciidoc
ASP: asa  or  asp
Assembly (x86_64): yasm, nasm, asm, inc,  or  mac
Authorized Keys: authorized_keys, pub,  or  authorized_keys2
AWK: awk
Batch File: bat  or  cmd
BibTeX: bib  or  bibtex
Bourne Again Shell (bash): sh, bash, zsh,  or  fish
C: c  or  h
C#: cs  or  csx
C++: cpp, cc, cp, cxx, C, h, hh, hpp, hxx, inl,  or  ipp
Cabal: cabal
camlp4: camlp4
Clojure: clj  or  clojure
CMake: cmake
CMakeCache: cmakecache
CMakeCommands: cmakecommands
CoffeeScript: coffee, Cakefile, cson,  or  coffeescript
Comma Separated Values: csv  or  tsv
commands-builtin-shell-bash: commands-builtin-shell-bash
CpuInfo: cpuinfo
Crontab: tab  or  crontab
Crystal: cr  or  crystal
CSS: css
D: d  or  di
Dart: dart
Diff: diff  or  patch
Dockerfile: Dockerfile  or  dockerfile
DotENV: env  or  dotenv
Elixir: ex, exs,  or  elixir
Elm: elm
Email: eml, msg, mbx, mboxz,  or  email
Erlang: erl, hrl, Emakefile, emakefile,  or  erlang
F#: fs, fsi,  or  fsx
Fish: fish
Fortran (Fixed Form): f, F, f77, F77, for, FOR, fpp,  or  FPP
Fortran (Modern): f90, F90, f95, F95, f03, F03, f08,  or  F08
Fortran Namelist: namelist
fstab: fstab, crypttab,  or  mtab
GLSL: vs, fs, gs, vsh, fsh, gsh, vshader, fshader, gshader, vert, frag, geom, tesc, tese, comp, glsl, mesh, task, rgen, rint, rahit, rchit, rmiss,  or  rcall
gnuplot: gp, gpl, gnuplot, gnu, plot,  or  plt
Go: go
GraphQL: graphql, graphqls,  or  gql
Graphviz (DOT): dot, DOT,  or  gv
Groovy: groovy, gvy,  or  gradle
group: group
Haskell: hs  or  haskell
Highlight non-printables: show-nonprintable
hosts: hosts
HTML: html, htm, shtml, xhtml, inc, tmpl,  or  tpl
HTML (ASP): asp
HTML (Erlang): yaws
HTML (Rails): rails, rhtml,  or  erb
HTML (Tcl): adp
HTML (Twig): twig
HTTP Request and Response: http
INI: ini, INI, inf, INF, reg, REG, lng, cfg, CFG, desktop, url, URL,  or  hgrc
Java: java  or  bsh
Java Properties: properties
Java Server Page (JSP): jsp
JavaDoc: javadoc
JavaScript: js, htc,  or  javascript
Jinja2: j2, jinja2,  or  jinja
JQ: jq
JSON: json, sublime-settings, sublime-menu, sublime-keymap, sublime-mousemap, sublime-theme, sublime-build, sublime-project, sublime-completions, sublime-commands, sublime-macro,  or  sublime-color-scheme
JSON (Terraform): tfstate
jsonnet: jsonnet, libsonnet,  or  libjsonnet
Julia: jl  or  julia
Known Hosts: known_hosts
Kotlin: kt, kts,  or  kotlin
LaTeX: tex, ltx,  or  latex
Lean: lean
Less: less
Lisp: lisp, cl, clisp, l, mud, el, scm, ss, lsp,  or  fasl
Literate Haskell: lhs
LLVM: ll  or  llvm
log: log
Lua: lua
Makefile: make, GNUmakefile, makefile, Makefile, OCamlMakefile, mak,  or  mk
Man Page (groff/troff): man, groff,  or  troff
Manpage: man  or  manpage
Markdown: md, mdown, markdown,  or  markdn
MATLAB: matlab
MediawikerPanel: mediawikerpanel
Mediawiki NG: mediawiki, wikipedia,  or  wiki
MemInfo: meminfo
MultiMarkdown: multimarkdown
NAnt Build File: build
nginx: conf, fastcgi_params, scgi_params, uwsgi_params,  or  nginx
Nim: nim, nims,  or  nimble
Ninja: ninja
Nix: nix
Objective-C: m, h,  or  objective-c
Objective-C++: mm, M,  or  h
OCaml: ml, mli,  or  ocaml
OCamllex: mll  or  ocamllex
OCamlyacc: mly  or  ocamlyacc
orgmode: org  or  orgmode
Pascal: pas, p, dpr,  or  pascal
passwd: passwd
Perl: pl, pm, pod, t, PL,  or  perl
PHP: php, php3, php4, php5, php7, phps, phpt,  or  phtml
Plain Text: txt
Protocol Buffer: proto  or  protodevel
Protocol Buffer (TEXT): textpb, pbtxt,  or  prototxt
Puppet: pp, epp,  or  puppet
PureScript: purs  or  purescript
Python: py, py3, pyw, pyi, pyx, pxd, pxi, rpy, cpy, SConstruct, Sconstruct, sconstruct, SConscript, gyp, gypi, Snakefile, wscript,  or  python
QML: qml  or  qmlproject
R: R, r, s, S,  or  Rprofile
Racket: rkt  or  racket
Rd (R Documentation): rd
Rego: rego
Regular Expression: re
Requirements.txt: pip
resolv: resolv
reStructuredText: rst, rest,  or  restructuredtext
Robot Framework: robot  or  resource
Ruby: rb, Appfile, Appraisals, Berksfile, Brewfile, capfile, cgi, Cheffile, Deliverfile, Fastfile, fcgi, Gemfile, gemspec, Guardfile, irbrc, jbuilder, podspec, prawn, rabl, rake, Rakefile, Rantfile, rbx, rjs, Scanfile, simplecov, Snapfile, thor, Thorfile, Vagrantfile,  or  ruby
Ruby Haml: haml  or  sass
Ruby on Rails: rxml  or  builder
Ruby Slim: slim  or  skim
Rust: rs  or  rust
Scala: scala  or  sbt
SCSS: scss
Shell-Unix-Generic: shell-unix-generic
SML: sml, cm,  or  sig
Solidity: sol  or  solidity
SQL: sql, ddl,  or  dml
SQL (Rails): erbsql
SSH Config: ssh_config
SSHD Config: sshd_config
Strace: strace
Stylus: styl  or  stylus
Svelte: svlt  or  svelte
Swift: swift
syslog: syslog
SystemVerilog: sv, v, svh, vh,  or  systemverilog
Tcl: tcl
Terraform: tf, tfvars, hcl,  or  terraform
TeX: sty, cls,  or  tex
Textile: textile
TOML: toml, tml,  or  Pipfile
TypeScript: ts, mts, cts,  or  typescript
TypeScriptReact: tsx  or  typescriptreact
Typst: typ  or  typst
Typst (code): typc
varlink: varlink
Verilog: v, V,  or  verilog
VimL: vim, vimrc, gvimrc, _vimrc, _gvimrc,  or  viml
Vue Component: vue
Vyper: vy  or  vyper
XML: xml, xsd, xslt, tld, dtml, rss, opml,  or  svg
YAML: yaml, yml,  or  sublime-syntax
Zig: zig

laurmaedje avatar Nov 22 '23 10:11 laurmaedje

@laurmaedje, I think you should wrap such a long list in

<details><summary>Details</summary>
```
```
</details> 

This keeps the comment as short as possible; otherwise the readability of the thread suffers greatly when you want to follow up or just scroll through the timeline.

You can also write /d (https://docs.github.com/en/issues/tracking-your-work-with-issues/about-slash-commands).

Andrew15-5 avatar Nov 22 '23 10:11 Andrew15-5

thank you

ondohotola avatar Nov 22 '23 11:11 ondohotola

I assume the list of supported languages is custom (handwritten)? Or perhaps it is mixed with some existing set of language definition files. I heard that some apps rely on an external tool or set of files to enable syntax highlighting, like https://highlightjs.org/ and similar projects.

Since Typst is still new, you probably won't see syntax highlighting support for it anywhere else. And the minted package in LaTeX doesn't fully support React syntax in JS/TS, but Typst does (IIRC).

Andrew15-5 avatar Nov 22 '23 11:11 Andrew15-5

We use syntect with the language data sourced from bat.

laurmaedje avatar Nov 22 '23 13:11 laurmaedje

bat has support for Assembly (ARM). How can I find out which language tag I need to set in Typst to get this highlighting?

Gaweringo avatar Apr 12 '24 21:04 Gaweringo

The easiest way is currently still to use the autocompletion. But I can't find ARM assembly in there. However, you can load the sublime-syntax file dynamically via #set raw(syntaxes: ..).
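
A minimal sketch of loading a syntax file this way. The file name is a hypothetical local path, and the `arm` tag is an assumption: the tag has to match a name or file extension declared inside the syntax file itself.

````typ
// "arm.sublime-syntax" is a hypothetical local copy of a syntax file.
#set raw(syntaxes: "arm.sublime-syntax")

// The tag must match a name or file extension declared inside that
// file; `arm` here is an assumption.
```arm
mov r0, #1
```
````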

laurmaedje avatar Apr 12 '24 21:04 laurmaedje

https://typst.app/docs/reference/text/raw/#parameters-syntaxes

Andrew15-5 avatar Apr 12 '24 21:04 Andrew15-5

Thank you. I just found this https://github.com/typst/typst/issues/1565#issuecomment-1605511340 comment on a PR which states that ARM Assembly was excluded because of compatibility issues:

Syntaxes excluded due to compatibility issue:

  • syntaxes/02_Extra/Assembly (ARM).sublime-syntax
  • ...

The 02_Extra/Assembly (ARM).sublime-syntax file cannot be used directly because there is a problem with the regex for (I think) preprocessor macros. Specifically, \g<id> is what Typst reports as the problem.

 - match: |-
        (?x)
        ^\s*\#\s*(define)\s+             # define
        ((?<id>[a-zA-Z_][a-zA-Z0-9_]*))  # macro name
        (?:                              # and optionally:
            (\()                         # an open parenthesis
                (
                    \s* \g<id> \s*       # first argument
                    ((,) \s* \g<id> \s*)*  # additional arguments
                    (?:\.\.\.)?          # varargs ellipsis?
                )
            (\))                         # a close parenthesis
        )?

Since I don't need that part, I can just delete it and then it works. Thank you for the quick help.

Gaweringo avatar Apr 13 '24 10:04 Gaweringo