Enable LuaMetaTeX in ConTeXt tests and examples

Open Witiko opened this issue 9 months ago • 2 comments

This PR enables LuaMetaTeX in ConTeXt tests and examples after #551.

Witiko avatar Feb 20 '25 23:02 Witiko

I wanted to test the new support for ConTeXt LMTX, but I am still seeing some issues, namely the lack of support for the KPathSea library in LMTX, even in TeX Live. See the CI failure from this PR:

make FAIL_FAST=true test
make -C tests
make[1]: Entering directory '/__w/markdown/markdown/tests'
find testfiles/ -type f -name '*.test' -exec ./test.sh  {} +

Creating a Python virtual environment in /__w/markdown/markdown/tests/test-virtualenv.
2025-02-21 10:09:14,197 Running tests for 836 testfiles.
2025-02-21 10:09:14,197 Will fail at first error.
Testfile testfiles/regression/github/issue-508-fancy-lists.test:

  Some commands produced non-zero exit codes:
  - Command context [...] --luatex [...] test.tex [...] exited successfully.
  - Command context [...]          [...] test.tex [...] produced exit code 1.
  
  Some commands produced unexpected outputs:
  - Command context [...] --luatex [...] test.tex [...] produced expected output.
  - Command context [...]          [...] test.tex [...] produced unexpected output with the following diff:
  
    [...]

make[1]: *** [Makefile:13: all] Error 1
make[1]: Leaving directory '/__w/markdown/markdown/tests'
make: *** [Makefile:213: test] Error 2

Looking into the file test.log, I see the following:

lua error       > lua error on line 5 in file ./test.tex:

token call, execute: ...live/2024/texmf-dist/tex/context/base/mkiv/l-sandbox.lua:180: module 'kpse' not found:
	no field package.preload['kpse']
	no file '/usr/local/share/lua/5.5/kpse.lua'
	no file '/usr/local/share/lua/5.5/kpse/init.lua'
	no file '/usr/local/lib/lua/5.5/kpse.lua'
	no file '/usr/local/lib/lua/5.5/kpse/init.lua'
	no file './kpse.lua'
	no file './kpse/init.lua'
stack traceback:
	[C]: in upvalue 'requiem'
	...live/2024/texmf-dist/tex/context/base/mkiv/l-sandbox.lua:180: in function <...live/2024/texmf-dist/tex/context/base/mkiv/l-sandbox.lua:165>
	(...tail calls...)
	[ctxlua]:2: in main chunk
 1     % Load the package.
 2     \startluacode
 3     local kpse = require("kpse")
 4     kpse.set_program_name("luatex")
 5 >>  \stopluacode
 6     \usemodule[t][markdown]
 7     
 8     % Load the support files.
 9     \setupmarkdown [
10       eagerCache = false,
11       import = {
12         witiko/markdown/test = snippet as testsnippet,
13       }
14     ]
15

This is perhaps to be expected: Even if KPathSea is available for LuaMetaTeX in TeX Live, we apparently need to use the predefined object optional.kpse instead of require("kpse") according to luametatex.pdf. Regardless, KPathSea is unlikely to be widely available in ConTeXt standalone and therefore, we shouldn't rely on it.
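To illustrate the two lookup paths mentioned above, here is a minimal sketch of how the library could be located without raising an error. The helper name resolve_kpse is hypothetical, and the only thing assumed about optional.kpse is that it exists as a field of a global table named optional, as the comment above reports from luametatex.pdf; its methods are not assumed to match those of the LuaTeX kpse library.

```lua
-- Hypothetical helper: try the LuaTeX-style module first, then the
-- LuaMetaTeX-style predefined object, and report which one was found.
local function resolve_kpse()
  -- pcall keeps a missing module from aborting the TeX run.
  local ok, lib = pcall(require, "kpse")
  if ok and type(lib) == "table" then
    return lib, "require"
  end
  -- Assumption: LuaMetaTeX exposes the library as `optional.kpse`.
  -- rawget avoids errors in environments that forbid undefined globals.
  local optional = rawget(_G, "optional")
  if type(optional) == "table" and optional.kpse ~= nil then
    return optional.kpse, "optional"
  end
  return nil, "unavailable"
end

local kpse_library, kpse_source = resolve_kpse()
print("KPathSea lookup result: " .. kpse_source)
```

Even when the lookup succeeds, the caller would still need to check which functions the returned object provides before using it.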

Witiko avatar Feb 21 '25 10:02 Witiko

@andreiborisov: Can you please test that you can compile the example file examples/context-lmtx.tex using LMTX and version 3.11.0 of the Markdown package for TeX? You will need to remove the following part that loads KPathSea:

\startluacode
local kpse = require("kpse")
kpse.set_program_name("luatex")
\stopluacode

Furthermore, you will also need to copy all dependencies to the directory examples/. As discussed in https://github.com/Witiko/markdown/issues/402#issuecomment-2010265500, this includes at least the following files and packages from CTAN:

  1. The expl3-generic.tex plain TeX file from l3kernel
  2. The lua-uni-algos Lua package
  3. The lt3luabridge plain TeX package
  4. The tinyyaml Lua package

Please, let me know how that goes!

If you manage to compile the example document, I will use your feedback to hopefully have the CI running the LMTX tests and compiling the LMTX examples in the next version, to be released by the end of March at the latest. :crossed_fingers:

Witiko avatar Feb 21 '25 10:02 Witiko

@andreiborisov seems unavailable with no GitHub activity since at least January. Therefore, I will jump on this in the meantime and try to get things moving before the end of April. This includes adding examples and tests for LuaMetaTeX and ConTeXt LMTX, writing instructions for installation with ConTeXt standalone, and fixing any issues I find along the way.

Adding a Dockerfile for ConTeXt standalone, continuously building a Docker image, running tests with it, and releasing it sounds great on paper, but I am not using ConTeXt nearly enough to feel comfortable as the only maintainer of this part of the Markdown package for TeX. Therefore, this is currently not a goal.

Witiko avatar Mar 31 '25 12:03 Witiko

> @andreiborisov: Can you please test that you can compile the example file examples/context-lmtx.tex using LMTX and version 3.11.0 of the Markdown package for TeX? You will need to remove the following part that loads KPathSea: [...] Furthermore, you will also need to copy all dependencies to the directory examples/. As discussed in https://github.com/Witiko/markdown/issues/402#issuecomment-2010265500, this includes at least the following files and packages from CTAN: [...] Please, let me know how that goes!

> @andreiborisov seems unavailable with no GitHub activity since at least January. Therefore, I will jump on this in the meantime and try to get things moving before the end of April.

I removed the part of file context-lmtx.tex that loaded KPathSea and symlinked all dependencies to examples/:

lrwxrwxrwx 1 root    root      68 Apr 15 12:43 expl3-code.tex -> /usr/local/texlive/2025/texmf-dist/tex/latex/l3kernel/expl3-code.tex
lrwxrwxrwx 1 root    root      71 Apr 15 12:42 expl3-generic.tex -> /usr/local/texlive/2025/texmf-dist/tex/latex/l3kernel/expl3-generic.tex
lrwxrwxrwx 1 root    root      63 Apr 15 12:47 expl3.lua -> /usr/local/texlive/2025/texmf-dist/tex/latex/l3kernel/expl3.lua
lrwxrwxrwx 1 root    root      72 Apr 15 12:45 lt3luabridge.sty -> /usr/local/texlive/texmf-local/tex/generic/lt3luabridge/lt3luabridge.sty
lrwxrwxrwx 1 root    root      72 Apr 15 12:45 lt3luabridge.tex -> /usr/local/texlive/texmf-local/tex/generic/lt3luabridge/lt3luabridge.tex
lrwxrwxrwx 1 root    root      77 Apr 15 12:44 lua-uni-algos.lua -> /usr/local/texlive/2025/texmf-dist/tex/luatex/lua-uni-algos/lua-uni-algos.lua
lrwxrwxrwx 1 root    root      76 Apr 15 12:44 lua-uni-case.lua -> /usr/local/texlive/2025/texmf-dist/tex/luatex/lua-uni-algos/lua-uni-case.lua
lrwxrwxrwx 1 root    root      81 Apr 15 12:44 lua-uni-graphemes.lua -> /usr/local/texlive/2025/texmf-dist/tex/luatex/lua-uni-algos/lua-uni-graphemes.lua
lrwxrwxrwx 1 root    root      81 Apr 15 12:44 lua-uni-normalize.lua -> /usr/local/texlive/2025/texmf-dist/tex/luatex/lua-uni-algos/lua-uni-normalize.lua
lrwxrwxrwx 1 root    root      77 Apr 15 12:44 lua-uni-parse.lua -> /usr/local/texlive/2025/texmf-dist/tex/luatex/lua-uni-algos/lua-uni-parse.lua
lrwxrwxrwx 1 root    root      74 Apr 15 12:45 t-lt3luabridge.tex -> /usr/local/texlive/texmf-local/tex/generic/lt3luabridge/t-lt3luabridge.tex
lrwxrwxrwx 1 root    root      64 Apr 15 12:46 tinyyaml.lua -> /usr/local/texlive/texmf-local/scripts/lua-tinyyaml/tinyyaml.lua

Then, I ran context context-lmtx.tex with the following result:

lua error       > lua error on line 155 in file expl3-code.tex:

token call, execute: expl3.lua:77: attempt to call a nil value (upvalue 'set_char')
stack traceback:
	expl3.lua:77: in local 'token_create_safe'
	expl3.lua:86: in main chunk
	[C]: in upvalue 'requiem'
	...live/2025/texmf-dist/tex/context/base/mkiv/l-sandbox.lua:180: in function 'SAVEDREQUIRE'
	(...tail calls...)
	[ctxlua]:1: in main chunk
145       \else
146         \begingroup\expandafter\expandafter\expandafter\endgroup
147         \expandafter\ifx\csname newcatcodetable\endcsname\relax
148           \input{ltluatex}%
149         \fi
150         \begingroup\expandafter\expandafter\expandafter\endgroup
151         \expandafter\ifx\csname newluabytecode\endcsname\relax
152         \else
153           \newluabytecode\@expl@luadata@bytecode
154         \fi
155 >>      \directlua{require("expl3")}%
156         \ifnum 0%
157           \directlua{
158             if status.ini_version then
159               tex.write("1")
160             end
161           }>0 %
162           \everyjob\expandafter{%
163             \the\expandafter\everyjob
164             \csname\detokenize{lua_now:n}\endcsname{require("expl3")}%
165           }%
mtx-context     | fatal error: return code: 1

Line 77 of file expl3.lua reads set_char(s, 0), where set_char is set to token.set_char on line 68 of the same file. Apparently, whereas the function token.set_char was available in LuaTeX, it is now gone in LuaMetaTeX. I did a little digging in the repository contextgarden/luametatex and found that token.set_char was commented out in commit https://github.com/contextgarden/luametatex/commit/c4ef1cb05567527cb74c792e539c50a2f8e6d86e from LuaMetaTeX 2.11.05 (2024-10-31) for no apparent reason.
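A defensive lookup can turn this kind of failure into a readable message instead of an "attempt to call a nil value" error deep inside a module. The following is a minimal sketch under that assumption, not the fix adopted upstream; the helper name lookup_engine_function is hypothetical.

```lua
-- Hypothetical helper: fetch a function from a global engine table
-- (such as `token`) without raising an error when either the table
-- or the function is missing in the current engine.
local function lookup_engine_function(table_name, function_name)
  local engine_table = rawget(_G, table_name)
  if type(engine_table) ~= "table" then
    return nil, "table " .. table_name .. " is unavailable"
  end
  local fn = engine_table[function_name]
  if type(fn) ~= "function" then
    return nil, table_name .. "." .. function_name .. " is unavailable"
  end
  return fn
end

local set_char, reason = lookup_engine_function("token", "set_char")
if set_char == nil then
  -- Report the gap here, where the cause is still obvious.
  print("cannot use token.set_char: " .. reason)
end
```

Outside a TeX engine, or in an engine without the function, the sketch prints the reason and leaves set_char as nil, so the caller can choose an alternative code path.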

I opened a ticket tracking this issue upstream: https://github.com/latex3/latex3/issues/1724. Any further work on this PR is blocked until the upstream ticket has been resolved.

Witiko avatar Apr 15 '25 13:04 Witiko

Continuing from https://github.com/Witiko/markdown/pull/557#issuecomment-2805084481: After patching the file expl3.lua, running context context-lmtx.tex produces the following error instead:

loading         > ConTeXt User Module / markdown
open source     > level 3, order 8, name 'markdown/markdown.tex'
open source     > level 4, order 9, name 'lt3luabridge.tex'

tex error       > tex error on line 114 in file lt3luabridge.tex: Missing number, case 7, treated as zero



<to be read again> 
    
    P
<line 7.114> 
      }

104           }
105       }
106       {
107         \cs_generate_variant:Nn
108           \msg_error:nnn
109           { nnV }
110         \msg_error:nnV
111           { luabridge }
112           { unknown-method }
113           \g_luabridge_method_int
114 >>    }
115     \int_compare:nNnT
116       { \g_luabridge_method_int }
117       =
118       { \c_luabridge_method_shell_int }
119       {
120         \sys_if_platform_unix:TF
121           {
122             \str_const:Nn
123               \c_luabridge_default_output_dirname_str
124               { $TEXMF_OUTPUT_DIRECTORY }
A number should have been here; I inserted '0'. (If you can't figure out why I
needed to see a number, look up 'weird error' in the index to The TeXbook.)
mtx-context     | fatal error: return code: 1

This still seems to be an issue with expl3, not with lt3luabridge, as I discuss in https://github.com/latex3/latex3/issues/1702#issuecomment-2807462460.

Witiko avatar Apr 15 '25 20:04 Witiko

After @josephwright's patch from https://github.com/latex3/latex3/commit/885643a0b85d9a0110b558cc45587567c2893b38, running context context-lmtx.tex produces the following error instead:

resolvers       | formats | executing runner 'run luametatex format': /usr/local/texlive/2025/bin/x86_64-linux/luametatex --jobname="./context-lmtx.tex" --socket --shell-escape --fmt=/usr/local/texlive/2025/texmf-var/luametatex-cache/context/a86c089b384a3076dc514ba966a1fac9/formats/luametatex/cont-en.fmt --lua=/usr/local/texlive/2025/texmf-var/luametatex-cache/context/a86c089b384a3076dc514ba966a1fac9/formats/luametatex/cont-en.lui  --c:currentrun=1 --c:fulljobname="./context-lmtx.tex" --c:input="./context-lmtx.tex" --c:kindofrun=1 --c:maxnofruns=9 --c:texmfbinpath="/usr/bin"
system          > 
system          > ConTeXt  ver: 2025.02.28 18:12 LMTX  fmt: 2025.3.30  int: english/english
system          > 
system          > 'cont-new.mkxl' loaded
open source     > level 1, order 1, name '/usr/local/texlive/2025/texmf-dist/tex/context/base/mkxl/cont-new.mkxl'
system          > beware: some patches loaded from cont-new.mkiv
close source    > level 1, order 1, name '/usr/local/texlive/2025/texmf-dist/tex/context/base/mkxl/cont-new.mkxl'
system          > 'cont-sys.mkxl' loaded
open source     > level 1, order 2, name '/usr/local/texlive/2025/texmf-dist/tex/context/texlive/cont-sys.mkxl'
close source    > level 1, order 2, name '/usr/local/texlive/2025/texmf-dist/tex/context/texlive/cont-sys.mkxl'
system          > files > jobname './context-lmtx', input './context-lmtx.tex', result './context-lmtx'
fonts           > latin modern fonts are not preloaded
languages       > language 'en' is active
open source     > level 1, order 3, name './context-lmtx.tex'
fonts           > preloading latin modern fonts (third stage)
fonts           > 'fallback modern rm 10pt' is loaded
modules         > 'markdown' is loaded
open source     > level 2, order 4, name '/usr/local/texlive/texmf-local/tex/context/third/markdown/t-markdown.tex'
open source     > level 3, order 5, name 'expl3-generic.tex'
Package: expl3 2025-04-14 L3 programming layer (loader)

open source     > level 4, order 6, name 'expl3-code.tex'
Package: expl3 2025-04-14 L3 programming layer (code)

close source    > level 4, order 6, name 'expl3-code.tex'
backend         > calling unavailable pdf.getcreationdate function
open source     > level 4, order 7, name 'l3backend-dvips.def'
File: l3backend-dvips.def 2025-04-14 v L3 backend support: dvips

close source    > level 4, order 7, name 'l3backend-dvips.def'
close source    > level 3, order 7, name 'expl3-generic.tex'
loading         > ConTeXt User Module / markdown
open source     > level 3, order 8, name 'markdown/markdown.tex'
open source     > level 4, order 9, name 'lt3luabridge.tex'

Package luabridge Info: Using direct Lua access as the bridging method


close source    > level 4, order 9, name 'lt3luabridge.tex'
tex error       > tex error on line 3164 in file markdown/markdown.tex: Missing number, case 7, treated as zero

\exp_not:n {j}\s__tl B106\s__tl 

<to be read again> 
    
    e
<line 6.3164> 
      }

<empty file>
A number should have been here; I inserted '0'. (If you can't figure out why I
needed to see a number, look up 'weird error' in the index to The TeXbook.)
mtx-context     | fatal error: return code: 1

This is different from calling context --luatex context-lmtx.tex, which produces the following result instead:

mtx-context     | redirect luametatex -> luatex: luatex --luaonly --socket "/usr/bin/mtxrun.lua" --script mtx-context --luatex context-lmtx.tex --redirected

resolvers       | formats | executing runner 'run luatex format': /usr/local/texlive/2025/bin/x86_64-linux/luatex --jobname="context-lmtx" --socket --shell-escape --fmt=/usr/local/texlive/2025/texmf-var/luatex-cache/context/a86c089b384a3076dc514ba966a1fac9/formats/luatex/cont-en.fmt --lua=/usr/local/texlive/2025/texmf-var/luatex-cache/context/a86c089b384a3076dc514ba966a1fac9/formats/luatex/cont-en.lui cont-yes.mkiv --c:currentrun=1 --c:engine="luatex" --c:fulljobname="./context-lmtx.tex" --c:input="./context-lmtx.tex" --c:kindofrun=1 --c:luatex --c:maxnofruns=9 --c:redirected --c:texmfbinpath="/usr/local/texlive/2025/bin/x86_64-linux"
This is LuaTeX, Version 1.22.0 (TeX Live 2025) 
 system commands enabled.
open source     > level 1, order 1, name '/usr/local/texlive/2025/texmf-dist/tex/context/base/mkiv/cont-yes.mkiv'
system          > 
system          > ConTeXt  ver: 2025.02.28 18:12 MKIV  fmt: 2025.3.30  int: english/english
system          > 
system          > 'cont-new.mkiv' loaded
open source     > level 2, order 2, name '/usr/local/texlive/2025/texmf-dist/tex/context/base/mkiv/cont-new.mkiv'
system          > beware: some patches loaded from cont-new.mkiv
close source    > level 2, order 2, name '/usr/local/texlive/2025/texmf-dist/tex/context/base/mkiv/cont-new.mkiv'
system          > 'cont-sys.mkiv' loaded
open source     > level 2, order 3, name '/usr/local/texlive/2025/texmf-dist/tex/context/texlive/cont-sys.mkiv'
close source    > level 2, order 3, name '/usr/local/texlive/2025/texmf-dist/tex/context/texlive/cont-sys.mkiv'
system          > files > jobname 'context-lmtx', input './context-lmtx', result 'context-lmtx'
fonts           > latin modern fonts are not preloaded
languages       > language 'en' is active
open source     > level 2, order 4, name '/workdir/examples/context-lmtx.tex'
fonts           > preloading latin modern fonts (third stage)
fonts           > 'fallback modern-designsize rm 10pt' is loaded
modules         > 'markdown' is loaded
open source     > level 3, order 5, name '/usr/local/texlive/texmf-local/tex/context/third/markdown/t-markdown.tex'
open source     > level 4, order 6, name 'expl3-generic.tex'
open source     > level 5, order 7, name 'expl3-code.tex'
close source    > level 5, order 7, name 'expl3-code.tex'
backend         > calling unavailable pdf.getcreationdate function
open source     > level 5, order 8, name '/root/texmf/tex/latex-dev/l3backend/l3backend-luatex.def'
close source    > level 5, order 8, name '/root/texmf/tex/latex-dev/l3backend/l3backend-luatex.def'
close source    > level 4, order 8, name 'expl3-generic.tex'
loading         > ConTeXt User Module / markdown
open source     > level 4, order 9, name '/usr/local/texlive/texmf-local/tex/generic/markdown/markdown.tex'
open source     > level 5, order 10, name 'lt3luabridge.tex'
close source    > level 5, order 10, name 'lt3luabridge.tex'
close source    > level 4, order 10, name '/usr/local/texlive/texmf-local/tex/generic/markdown/markdown.tex'
modules         > 'markdownthemewitiko_markdown_defaults' is loaded
open source     > level 4, order 11, name '/usr/local/texlive/texmf-local/tex/context/third/markdown/t-markdownthemewitiko_markdown_defaults.tex'
open source     > level 5, order 12, name '/usr/local/texlive/texmf-local/tex/generic/markdown/markdownthemewitiko_markdown_defaults.tex'
close source    > level 5, order 12, name '/usr/local/texlive/texmf-local/tex/generic/markdown/markdownthemewitiko_markdown_defaults.tex'
modules         > 'database' is loaded
open source     > level 5, order 13, name '/usr/local/texlive/2025/texmf-dist/tex/context/modules/mkiv/m-database.mkiv'
resolvers       > lua > loading file '/usr/local/texlive/2025/texmf-dist/tex/context/modules/mkiv/m-database.lua' succeeded
close source    > level 5, order 13, name '/usr/local/texlive/2025/texmf-dist/tex/context/modules/mkiv/m-database.mkiv'
close source    > level 4, order 13, name '/usr/local/texlive/texmf-local/tex/context/third/markdown/t-markdownthemewitiko_markdown_defaults.tex'
close source    > level 3, order 13, name '/usr/local/texlive/texmf-local/tex/context/third/markdown/t-markdown.tex'

lua error       > lua error on line 85 in file /workdir/examples/context-lmtx.tex:

lua-uni-parse.lua:50: Please call kpse.set_program_name() before using the library
stack traceback:
	[C]: in function 'kpse.find_file'
	lua-uni-parse.lua:50: in function 'lua-uni-parse.parse_file'
	lua-uni-case.lua:28: in main chunk
	[C]: in upvalue 'requiem'
	...live/2025/texmf-dist/tex/context/base/mkiv/l-sandbox.lua:180: in function <...live/2025/texmf-dist/tex/context/base/mkiv/l-sandbox.lua:165>
	(...tail calls...)
	lua-uni-algos.lua:17: in main chunk
	[C]: in upvalue 'requiem'
	...live/2025/texmf-dist/tex/context/base/mkiv/l-sandbox.lua:180: in function <...live/2025/texmf-dist/tex/context/base/mkiv/l-sandbox.lua:165>
	(...tail calls...)
	...live/texmf-local/tex/luatex/markdown/markdown-parser.lua:83: in main chunk
	[C]: in upvalue 'requiem'
	...live/2025/texmf-dist/tex/context/base/mkiv/l-sandbox.lua:180: in function <...live/2025/texmf-dist/tex/context/base/mkiv/l-sandbox.lua:165>
	(...tail calls...)
	...cal/texlive/texmf-local/tex/luatex/markdown/markdown.lua:193: in local 'transform'
	...cal/texlive/texmf-local/tex/luatex/markdown/markdown.lua:151: in field 'cache'
	...cal/texlive/texmf-local/tex/luatex/markdown/markdown.lua:205: in local 'convert'
	[ctxlua]:1: in main chunk

75           },
76         },
77       ]
78     
79     \startyaml
80     
81     title:  An Example *Markdown* Document
82     author: Vít Starý Novotný
83     date:   `\currentdate`{=tex}
84     
85 >>  \stopyaml
86     
87     % Typeset the document `example.md` by letting the Markdown package handle
88     % the conversion internally. Optionally, we can specify additional options
89     % between the square brackets similarly to the command `\setupmarkdown`.
90     % Unlike `\setupmarkdown`, the options will only apply for this document.
91     \inputmarkdown[smart_ellipses = yes]{./example.md}
92     
93     % Typeset the document `example.tex` that we prepared separately using the
94     % Lua command-line interface and that contains a plain TeX representation
95     % of the document `example.md`.

mtx-context     | fatal error: return code: 256

The latter seems reasonable: The library lua-uni-algos requires the library KPathSea, which is generally unavailable in ConTeXt LMTX, as discussed in https://github.com/Witiko/markdown/pull/557#issuecomment-2674213438 and elsewhere. We may need to get rid of our dependency on the library lua-uni-algos if we wish to support ConTeXt LMTX (and ConTeXt standalone more generally) properly.

The former does not seem reasonable and seems to indicate that there are still corner cases where token.set_char and tex.chardef produce different results.

Witiko avatar Apr 16 '25 18:04 Witiko

I applied the latest update from @josephwright in https://github.com/latex3/latex3/pull/1725 and removed all calls to the library KPathSea from the file markdown-parser.lua.

75,83d74
< ;(function()
<   local should_initialize = package.loaded.kpse == nil
<                        or tex.initialize ~= nil
<   kpse = require("kpse")
<   if should_initialize then
<     kpse.set_program_name("luatex")
<   end
< end)()
< local uni_algos = require("lua-uni-algos")
6933,6934c6924
<     for _, pathname in ipairs{kpse.lookup(language_map,
<                                           {all=true})} do
---
>     for _, pathname in ipairs{} do
8881,8883c8871
<       local pathname = assert(kpse.find_file(filename),
<         [[Could not locate user-defined syntax extension "]]
<         .. filename)
---
>       local pathname = filename

After these changes, running context context-lmtx.tex produces the following error:

resolvers       | formats | executing runner 'run luametatex format': /usr/local/texlive/2025/bin/x86_64-linux/luametatex --jobname="./context-lmtx.tex" --socket --shell-escape --fmt=/usr/local/texlive/2025/texmf-var/luametatex-cache/context/a86c089b384a3076dc514ba966a1fac9/formats/luametatex/cont-en.fmt --lua=/usr/local/texlive/2025/texmf-var/luametatex-cache/context/a86c089b384a3076dc514ba966a1fac9/formats/luametatex/cont-en.lui  --c:currentrun=1 --c:fulljobname="./context-lmtx.tex" --c:input="./context-lmtx.tex" --c:kindofrun=1 --c:maxnofruns=9 --c:texmfbinpath="/usr/bin"
system          > 
system          > ConTeXt  ver: 2025.02.28 18:12 LMTX  fmt: 2025.3.30  int: english/english
system          > 
system          > 'cont-new.mkxl' loaded
open source     > level 1, order 1, name '/usr/local/texlive/2025/texmf-dist/tex/context/base/mkxl/cont-new.mkxl'
system          > beware: some patches loaded from cont-new.mkiv
close source    > level 1, order 1, name '/usr/local/texlive/2025/texmf-dist/tex/context/base/mkxl/cont-new.mkxl'
system          > 'cont-sys.mkxl' loaded
open source     > level 1, order 2, name '/usr/local/texlive/2025/texmf-dist/tex/context/texlive/cont-sys.mkxl'
close source    > level 1, order 2, name '/usr/local/texlive/2025/texmf-dist/tex/context/texlive/cont-sys.mkxl'
system          > files > jobname './context-lmtx', input './context-lmtx.tex', result './context-lmtx'
fonts           > latin modern fonts are not preloaded
languages       > language 'en' is active
open source     > level 1, order 3, name './context-lmtx.tex'
fonts           > preloading latin modern fonts (third stage)
fonts           > 'fallback modern rm 10pt' is loaded
modules         > 'markdown' is loaded
open source     > level 2, order 4, name '/usr/local/texlive/texmf-local/tex/context/third/markdown/t-markdown.tex'
open source     > level 3, order 5, name 'expl3-generic.tex'
Package: expl3 2025-04-14 L3 programming layer (loader)

open source     > level 4, order 6, name 'expl3-code.tex'
Package: expl3 2025-04-14 L3 programming layer (code)

close source    > level 4, order 6, name 'expl3-code.tex'
backend         > calling unavailable pdf.getcreationdate function
open source     > level 4, order 7, name 'l3backend-dvips.def'
File: l3backend-dvips.def 2025-03-14 v L3 backend support: dvips

close source    > level 4, order 7, name 'l3backend-dvips.def'
close source    > level 3, order 7, name 'expl3-generic.tex'
loading         > ConTeXt User Module / markdown
open source     > level 3, order 8, name 'markdown/markdown.tex'
open source     > level 4, order 9, name 'lt3luabridge.tex'

Package luabridge Info: Using direct Lua access as the bridging method


close source    > level 4, order 9, name 'lt3luabridge.tex'
close source    > level 3, order 9, name 'markdown/markdown.tex'

Package markdown Info: Loading version latest of ConTeXt Markdown theme
(markdown)             witiko/markdown/defaults


modules         > 'markdownthemewitiko_markdown_defaults' is loaded
open source     > level 3, order 10, name '/usr/local/texlive/texmf-local/tex/context/third/markdown/t-markdownthemewitiko_markdown_defaults.tex'

Package markdown Info: Loading version latest of plain TeX Markdown theme
(markdown)             witiko/markdown/defaults


open source     > level 4, order 11, name 'markdownthemewitiko_markdown_defaults.tex'
close source    > level 4, order 11, name 'markdownthemewitiko_markdown_defaults.tex'
modules         > 'database' is loaded
open source     > level 4, order 12, name '/usr/local/texlive/2025/texmf-dist/tex/context/modules/mkiv/m-database.mkiv'
pages           > flushing realpage 1, userpage 1, subpage 1
pages           > flushing realpage 2, userpage 2, subpage 2
pages           > flushing realpage 3, userpage 3, subpage 3
resolvers       > lua > loading file '/usr/local/texlive/2025/texmf-dist/tex/context/modules/mkiv/m-database.lua' succeeded
close source    > level 4, order 12, name '/usr/local/texlive/2025/texmf-dist/tex/context/modules/mkiv/m-database.mkiv'
close source    > level 3, order 12, name '/usr/local/texlive/texmf-local/tex/context/third/markdown/t-markdownthemewitiko_markdown_defaults.tex'
close source    > level 2, order 12, name '/usr/local/texlive/texmf-local/tex/context/third/markdown/t-markdown.tex'

Package markdown Info: Buffering block-level markdown input into the temporary
(markdown)             input file "context-lmtx.markdown.in" and scanning for
(markdown)             the closing token sequence "\stopyaml"



Package markdown Info: The ending token sequence was found



Package markdown Info: Including markdown document
(markdown)             "./context-lmtx.markdown.in"


lua error       > lua error on line 85 in file ./context-lmtx.tex:

token call, execute: ...live/texmf-local/tex/luatex/markdown/markdown-parser.lua:6630: attempt to index a nil value (global 'uni_algos')
stack traceback:
	...live/texmf-local/tex/luatex/markdown/markdown-parser.lua:6630: in function <...live/texmf-local/tex/luatex/markdown/markdown-parser.lua:6626>
	(...tail calls...)
	...cal/texlive/texmf-local/tex/luatex/markdown/markdown.lua:151: in field 'cache'
	...cal/texlive/texmf-local/tex/luatex/markdown/markdown.lua:205: in local 'convert'
	[ctxlua]:1: in main chunk
75           },
76         },
77       ]
78     
79     \startyaml
80     
81     title:  An Example *Markdown* Document
82     author: Vít Starý Novotný
83     date:   `\currentdate`{=tex}
84     
85 >>  \stopyaml
86     
87     % Typeset the document `example.md` by letting the Markdown package handle
88     % the conversion internally. Optionally, we can specify additional options
89     % between the square brackets similarly to the command `\setupmarkdown`.
90     % Unlike `\setupmarkdown`, the options will only apply for this document.
91     \inputmarkdown[smart_ellipses = yes]{./example.md}
92     
93     % Typeset the document `example.tex` that we prepared separately using the
94     % Lua command-line interface and that contains a plain TeX representation
95     % of the document `example.md`.
mtx-context     | fatal error: return code: 1

This seems comparable to the output of the command context --luatex context-lmtx.tex from https://github.com/Witiko/markdown/pull/557#issuecomment-2810449560, i.e. no errors besides those related to the library lua-uni-algos.

Then, I removed all uses of the library lua-uni-algos from the file markdown-parser.lua.

3025,3028d3024
<     if not options.unicodeNormalization
<        or options.unicodeNormalizationForm ~= "nfc" then
<       normalized_s = uni_algos.normalize.NFC(normalized_s)
<     end
3072,3073d3067
<         -- Case-fold all alphabetic characters.
<         char = uni_algos.case.casefold(char)
3095,3098d3088
<     if not options.unicodeNormalization
<        or options.unicodeNormalizationForm ~= "nfc" then
<       normalized_s = uni_algos.normalize.NFC(normalized_s)
<     end
3142,3143d3131
<         -- Case-fold all alphabetic characters.
<         char = uni_algos.case.casefold(char)
4849d4836
<     tag = uni_algos.case.casefold(tag, true, false)
6627,6641d6613
<       if options.unicodeNormalization then
<         local form = options.unicodeNormalizationForm
<         if form == "nfc" then
<           input = uni_algos.normalize.NFC(input)
<         elseif form == "nfd" then
<           input = uni_algos.normalize.NFD(input)
<         elseif form == "nfkc" then
<           input = uni_algos.normalize.NFKC(input)
<         elseif form == "nfkd" then
<           input = uni_algos.normalize.NFKD(input)
<         else
<           return writer.error(
<             format("Unknown normalization form %s.", form))
<         end
<       end

This allowed me to compile the file context-lmtx.tex, producing the PDF document context-lmtx.pdf with the command context context-lmtx.tex with no errors. This indicates that getting rid of the library lua-uni-algos is the main obstacle in supporting ConTeXt LMTX.
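For comparison with outright removal, an optional dependency is another possibility. The sketch below is an illustration only and not what the diff above does: it loads lua-uni-algos when possible and otherwise degrades. The wrapper names normalize_nfc and casefold are hypothetical; the calls uni_algos.normalize.NFC and uni_algos.case.casefold are the ones that appear in the removed lines.

```lua
-- Load lua-uni-algos if it is usable. In ConTeXt LMTX the load is
-- expected to fail because the library needs KPathSea.
local loaded, uni_algos = pcall(require, "lua-uni-algos")
if not loaded or type(uni_algos) ~= "table" then
  uni_algos = nil
end

-- Hypothetical wrapper: NFC-normalize when possible, otherwise
-- return the input unchanged.
local function normalize_nfc(s)
  if uni_algos ~= nil then
    return uni_algos.normalize.NFC(s)
  end
  return s
end

-- Hypothetical wrapper: Unicode case folding when possible, otherwise
-- fall back to string.lower, which only covers ASCII letters reliably.
local function casefold(s)
  if uni_algos ~= nil then
    return uni_algos.case.casefold(s)
  end
  return s:lower()
end

print(normalize_nfc("abc"), casefold("ABC"))
```

The fallback branches change behavior for non-ASCII input, so link-label matching and similar features would be less accurate without the library.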

There are other concerns, which seem minor in comparison. Namely, there are numerous occurrences of the texts "haracter U+007D 'right curly bracket'" and "haracter U+007B 'left curly bracket'" in the PDF document. In particular, the first three pages of the document consist entirely of these:

image

However, they also occur elsewhere in the document:

image

These texts aren't present in the output of the command context --luatex context-lmtx.tex and seem to be a result of the interplay between expl3 and LuaMetaTeX. More analysis is needed to determine the exact cause; I will try to isolate a minimum working example.

Witiko avatar Apr 21 '25 12:04 Witiko

> There are other concerns, [...]

I created a minimal working example and opened a ticket upstream in https://github.com/latex3/latex3/issues/1728.

Witiko avatar Apr 23 '25 14:04 Witiko

> This allowed me to compile the file context-lmtx.tex, producing the PDF document context-lmtx.pdf with the command context context-lmtx.tex [...]. There are other concerns, [...]. Namely, there are numerous occurrences of the texts "haracter U+007D 'right curly bracket'" and "haracter U+007B 'left curly bracket'" in the PDF document.

> I created a minimal working example and opened a ticket upstream in latex3/latex3#1728.

I am getting a clean compile of the file examples/context-lmtx.tex since https://github.com/latex3/latex3/commit/a3a1df86c09fc4f5caa47f04514c1ead5741e19f by @josephwright:

image

Based on https://github.com/Witiko/markdown/pull/557#issuecomment-2805084481 and below, here are the necessary steps that need to be taken before the CI is green:

  1. [x] In markdown-parser.lua, remove dependency on library lua-uni-algos.
  2. Remove the part of file context-lmtx.tex that loads KPathSea.
  3. Make all uses of KPathSea in markdown-parser.lua optional: both initialization and use should fail softly.
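One way the "fail softly" requirement in the last step could look is sketched below. This is an assumption about a possible shape, not the code that ended up in markdown-parser.lua. The function names load_kpse, find_file, and lookup_all are hypothetical; kpse.set_program_name, kpse.find_file, and kpse.lookup are the LuaTeX calls used in the diff earlier in this thread.

```lua
-- Hypothetical loader: returns the kpse table or nil, never raises.
local function load_kpse()
  -- Same initialization condition as the removed code, with an extra
  -- guard for environments where the global `tex` does not exist.
  local tex_table = rawget(_G, "tex")
  local should_initialize = package.loaded.kpse == nil
    or (type(tex_table) == "table" and tex_table.initialize ~= nil)
  local ok, lib = pcall(require, "kpse")
  if not ok or type(lib) ~= "table" then
    return nil
  end
  if should_initialize then
    pcall(lib.set_program_name, "luatex")
  end
  return lib
end

local kpse = load_kpse()

-- Resolve a filename through KPathSea when available; otherwise use
-- the filename as given, like the patched line `pathname = filename`.
local function find_file(filename)
  if kpse ~= nil then
    local ok, pathname = pcall(kpse.find_file, filename)
    if ok and pathname ~= nil then
      return pathname
    end
  end
  return filename
end

-- Return all matches as a list; an empty list when KPathSea is
-- missing or the lookup fails, like the patched `ipairs{}` loop.
local function lookup_all(filename)
  if kpse ~= nil then
    local results = {pcall(kpse.lookup, filename, {all = true})}
    if results[1] then
      table.remove(results, 1) -- drop the pcall status flag
      return results
    end
  end
  return {}
end

print(find_file("example.md"), #lookup_all("example.md"))
```

With this shape, call sites stay unchanged between LuaTeX and LuaMetaTeX; only the quality of path resolution differs.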

To fully resolve https://github.com/Witiko/markdown/issues/436 and close ticket https://github.com/Witiko/markdown/issues/402, here are some of the next steps for v3.11.3, first outlined in https://github.com/Witiko/markdown/pull/557#issuecomment-2766075259:

  1. Add ConTeXt LMTX to the testing framework.
  2. Write installation instructions for ConTeXt standalone.
  3. Add a Dockerfile for ConTeXt standalone, continuously build a Docker image, and run tests with it.

After this has been done, we should publicize the support for ConTeXt LMTX:

  1. Let @hason know that they can revert the last hunk of https://github.com/minidocks/context/commit/e3d8d3b2daaec6bd1effd785d6115a4c2d9b50de that disables Markdown package.
  2. Respond to the thread at the [email protected] mailing list from November 2024.

Don't forget to mention @josephwright's numerous contributions in CHANGES.md. Specifically, Ctrl+F "latex3/latex3#" in this PR to find references to all upstream changes to the LaTeX3 kernel needed to get this PR working.

Witiko avatar Apr 23 '25 17:04 Witiko

  1. Respond to the thread at the [email protected] mailing list from November 2024.

The linked message contains the following remarks by @hanshagen:

[I]n order to test this ; issue I downloaded the expl generic file and (with a change in the lua file -- different function) it could be loaded but esp the unicode part takes quite some time, likely more than users will accept in an edit-run cycle that normally takes a few seconds per run. [...] Of course, when you need some specific helpers we can provide them[.]

Let's perhaps respond with a question about if and how we could recompile the ConTeXt format files to include expl3-generic.tex. Then, we can:

  1. Include steps to recompile ConTeXt format files to include expl3 in installation instructions and our Dockerfiles.

In the future, if there is sufficient demand, expl3 could perhaps be included with ConTeXt; if not by default, then with a simple runtime switch. Official support for expl3 in ConTeXt would ensure better long-term support. At the moment, expl3 contains some support for LuaMetaTeX with minimal tests thanks to @josephwright's efforts but there is currently no testing at the ConTeXt side and any changes to LuaMetaTeX are potentially breaking for us.

Witiko avatar Apr 23 '25 17:04 Witiko

@Witiko, what can I do to help you with the ongoing work?

andreiborisov avatar Jun 05 '25 15:06 andreiborisov

  1. Respond to the thread at the [email protected] mailing list from November 2024.

The linked message contains the following remarks by @hanshagen:

[I]n order to test this ; issue I downloaded the expl generic file and (with a change in the lua file -- different function) it could be loaded but esp the unicode part takes quite some time, likely more than users will accept in an edit-run cycle that normally takes a few seconds per run. [...] Of course, when you need some specific helpers we can provide them[.]

Let's perhaps respond with a question about if and how we could recompile the ConTeXt format files to include expl3-generic.tex. Then, we can:

  1. Include steps to recompile ConTeXt format files to include expl3 in installation instructions and our Dockerfiles.

In the future, if there is sufficient demand, expl3 could perhaps be included with ConTeXt; if not by default, then with a simple runtime switch. Official support for expl3 in ConTeXt would ensure better long-term support. At the moment, expl3 contains some support for LuaMetaTeX with minimal tests thanks to @josephwright's efforts but there is currently no testing at the ConTeXt side and any changes to LuaMetaTeX are potentially breaking for us.

I responded to the mailing list on 2025-04-30 but there was no response over the past month. This means that we'll need to figure out how to include expl3 in the ConTeXt format files ourselves.

Witiko avatar Jun 06 '25 09:06 Witiko

@andreiborisov: No worries, I hope you are doing OK!

Could you please try and retrace my steps and see whether the file examples/context-lmtx.tex compiles using LMTX from ConTeXt Standalone? It builds cleanly for me (see image) with TeX Live 2025, but ConTeXt Standalone likely has its own quirks and an extra set of eyes would be valuable. The requirements are outlined in https://github.com/Witiko/markdown/pull/557#issuecomment-2674213644 and further necessary modifications of the Markdown package and the file context-lmtx.tex are outlined in https://github.com/Witiko/markdown/pull/557#issuecomment-2805084481 and below and summarized in points 1–3 from https://github.com/Witiko/markdown/pull/557#issuecomment-2824988358.

Coming up next, in PR https://github.com/Witiko/markdown/pull/569, I'm working on removing the lua-uni-algos dependency. That's orthogonal to this PR but should simplify the installation with ConTeXt Standalone. My remaining tasks for this PR are in points 4–9 of https://github.com/Witiko/markdown/pull/557#issuecomment-2824988358 and https://github.com/Witiko/markdown/pull/557#issuecomment-2825026752. Progress is slow because I've been focused on expltools over the past several months, so any help with getting the Markdown package to work with ConTeXt Standalone and LMTX is welcome.

Thanks a lot for testing—let me know what you find or if anything looks off!

Witiko avatar Jun 06 '25 09:06 Witiko

I responded to the mailing list on 2025-04-30 but there was no response over the past month. This means that we'll need to figure out how to include expl3 in the ConTeXt format files ourselves.

The vast majority of the load time is taken up reading Unicode data files and storing in an appropriate format. One could perhaps provide a 'skip the Unicode' loading method at the expl3 end, or it would be possible to explore alternative loading mechanisms in Lua. What has never been clear to me is if the data are already available in LuaTeX: Lua itself doesn't really 'do' Unicode, and the LuaTeX docs don't give details either of compiled-in Unicode data versions or of dynamic file loading.

josephwright avatar Jun 06 '25 12:06 josephwright

What has never been clear to me is if the data are already available in LuaTeX: Lua itself doesn't really 'do' Unicode, and the LuaTeX docs don't give details either of compiled-in Unicode data versions or of dynamic file loading.

LuaTeX includes the Selene Unicode library, which provides not only support for encoding and decoding UTF-8 but also includes parts of UnicodeData.txt to enable matching based on Unicode character categories with the function unicode.utf8.match(). Documentation is lacking and the library does not seem actively developed, so I would suggest looking at the source code for any definitive answers about the supported version of the Unicode standard. If I recall correctly, the character category tables were hard-coded as constants there.

LuaMetaTeX does not seem to include either Selene Unicode or any other interfaces related to UTF-8 or Unicode. In order to support LuaMetaTeX, I removed the dependency on Selene Unicode in PR https://github.com/Witiko/markdown/pull/551 this February and I am working to remove the dependency on the third-party library lua-uni-algos in PR https://github.com/Witiko/markdown/pull/569 in the upcoming months.

Witiko avatar Jun 06 '25 12:06 Witiko

The vast majority of the load time is taken up reading Unicode data files and storing in an appropriate format.

I don't mind that one bit, since we'd be doing that only once when the ConTeXt formats are built. At runtime, the Unicode data should already be part of the macros in the format dump. The concept is clear; the hard part is figuring out which files to edit and which commands to run to regenerate the ConTeXt formats with expl3 included.

Witiko avatar Jun 06 '25 12:06 Witiko

What has never been clear to me is if the data are already available in LuaTeX: Lua itself doesn't really 'do' Unicode, and the LuaTeX docs don't give details either of compiled-in Unicode data versions or of dynamic file loading.

LuaTeX includes the Selene Unicode library, which provides not only support for encoding and decoding UTF-8 but also includes parts of the Unicode data to enable matching based on Unicode character categories with the function unicode.utf8.match(). Documentation is lacking and the library does not seem actively developed, so I would suggest looking at the source code for any definitive answers about the supported version of the Unicode standard. If I recall correctly, the character category tables were hard-coded as constants there.

Like I said, no details of the Unicode version at all :(

LuaMetaTeX does not seem to include either Selene Unicode or any other interfaces related to UTF-8 or Unicode. In order to support LuaMetaTeX, I removed the dependency on Selene Unicode in PR #551 this February and I am working to remove the dependency on the third-party library lua-uni-algos in PR #569 in the following months.

josephwright avatar Jun 06 '25 12:06 josephwright

Like I said, no details of the Unicode version at all :(

If the character category data is encoded as constants, you'd need to convert the constants to a more humane format and then compare them with the UnicodeData.txt files from different versions of the Unicode standard. It's a well-defined task, only mind-numbing and time-consuming. :/

An educated guess about the version of the Unicode standard can also be made from the last modified date of the source code.

Witiko avatar Jun 06 '25 12:06 Witiko

@josephwright Would you like me to look into which version of Unicode is supported by the Selene Unicode library that is bundled with LuaTeX?

Witiko avatar Jun 13 '25 14:06 Witiko

@josephwright Would you like me to look into which version of Unicode is supported by the Selene Unicode library that is bundled with LuaTeX?

I'm not really fussed - my point was more that it's not part of the official docs, so one can't rely on it.

josephwright avatar Jun 13 '25 15:06 josephwright

And it's not part of LuaMetaTeX either. That's also why I decided not to rely on it and implemented the relevant algorithms myself, much like you did in expl3. It's a bummer, since a built-in C library would be much faster than our LPEG-based Lua implementation and likely much much much faster than the TeX-based implementation in expl3.

Regardless, if you'd like to know, I can take a look. I am somewhat curious about what I am going to find. If it's an old version of Unicode, then that may be of interest to LuaTeX developers as well. I have seen discussions about whether Selene should be removed or replaced on the LuaTeX mailing list before, although I can't find it now.

Witiko avatar Jun 13 '25 19:06 Witiko

@josephwright: Let's have a look.

An educated guess about the version of the Unicode standard can also be made from the last modified date of the source code.

The last commit and release for Selene Unicode were both made in 2014 before the release of Unicode 7.0. However, if the date from the $Id$ SVN keyword in the file slnudata.c can be trusted, then the Unicode data were extracted in 2006, around the release of Unicode 5.0. Therefore, the Unicode standard supported by Selene Unicode is no newer than Unicode 4.1 if the SVN keyword can be trusted and no newer than Unicode 6.3 otherwise.

If the character category data is encoded as constants, you'd need to convert the constants to a more humane format and then compare them with the UnicodeData.txt files from different versions of the Unicode standard. It's a well-defined task, only mind-numbing and time-consuming. :/

The character category data is encoded in the file slnudata.c, which has apparently been automatically generated from a file UnicodeData.txt using the script uniParse.tcl. Therefore, I used the script to process the files UnicodeData.txt from all versions of the Unicode standard between 1.1 and 6.3 and made a comparison.

First, I downloaded the file UnicodeData.txt for all versions between 1.1 and 6.3.

$ docker run --rm -it ubuntu
root@d88dd5360920:/# apt update
root@d88dd5360920:/# apt install -qy wget
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/1.1-Update/UnicodeData-1.1.5.txt
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/2.0-Update/UnicodeData-2.0.14.txt
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/2.1-Update/UnicodeData-2.1.2.txt 
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/2.1-Update2/UnicodeData-2.1.5.txt
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/2.1-Update3/UnicodeData-2.1.8.txt
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/2.1-Update4/UnicodeData-2.1.9.txt
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/3.0-Update/UnicodeData-3.0.0.txt 
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/3.0-Update1/UnicodeData-3.0.1.txt
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/3.1-Update/UnicodeData-3.1.0.txt 
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/3.2-Update/UnicodeData-3.2.0.txt
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/4.0-Update/UnicodeData-4.0.0.txt
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/4.1.0/ucd/UnicodeData.txt -O UnicodeData-4.1.0.txt
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/5.0.0/ucd/UnicodeData.txt -O UnicodeData-5.0.0.txt
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/5.1.0/ucd/UnicodeData.txt -O UnicodeData-5.1.0.txt
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/5.2.0/ucd/UnicodeData.txt -O UnicodeData-5.2.0.txt
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/6.0.0/ucd/UnicodeData.txt -O UnicodeData-6.0.0.txt
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/6.1.0/ucd/UnicodeData.txt -O UnicodeData-6.1.0.txt
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/6.2.0/ucd/UnicodeData.txt -O UnicodeData-6.2.0.txt
root@d88dd5360920:/# wget --quiet https://www.unicode.org/Public/6.3.0/ucd/UnicodeData.txt -O UnicodeData-6.3.0.txt
root@d88dd5360920:/# ls -lh UnicodeData*.txt
-rw-r--r-- 1 root root  888K Feb 27  2001 UnicodeData-1.1.5.txt
-rw-r--r-- 1 root root  434K Feb 27  2001 UnicodeData-2.0.14.txt
-rw-r--r-- 1 root root  435K Feb 27  2001 UnicodeData-2.1.2.txt
-rw-r--r-- 1 root root  453K Feb 27  2001 UnicodeData-2.1.5.txt
-rw-r--r-- 1 root root  453K Feb 27  2001 UnicodeData-2.1.8.txt
-rw-r--r-- 1 root root  450K Feb 27  2001 UnicodeData-2.1.9.txt
-rw-r--r-- 1 root root  622K Feb 27  2001 UnicodeData-3.0.0.txt
-rw-r--r-- 1 root root  622K Feb 27  2001 UnicodeData-3.0.1.txt
-rw-r--r-- 1 root root  763K Feb 27  2001 UnicodeData-3.1.0.txt
-rw-r--r-- 1 root root  818K Feb 27  2002 UnicodeData-3.2.0.txt
-rw-r--r-- 1 root root  877K Mar 31  2003 UnicodeData-4.0.0.txt
-rw-r--r-- 1 root root  944K Feb 16  2005 UnicodeData-4.1.0.txt
-rw-r--r-- 1 root root 1015K May 23  2006 UnicodeData-5.0.0.txt
-rw-r--r-- 1 root root  1.1M Mar 19  2008 UnicodeData-5.1.0.txt
-rw-r--r-- 1 root root  1.2M Aug 17  2009 UnicodeData-5.2.0.txt
-rw-r--r-- 1 root root  1.3M Aug 17  2010 UnicodeData-6.0.0.txt
-rw-r--r-- 1 root root  1.4M Nov  8  2011 UnicodeData-6.1.0.txt
-rw-r--r-- 1 root root  1.4M Aug  8  2012 UnicodeData-6.2.0.txt
-rw-r--r-- 1 root root  1.4M May 15  2013 UnicodeData-6.3.0.txt

Then, I used the script uniParse.tcl to generate the C file for each Unicode version.

root@d88dd5360920:/# wget --quiet https://spacegit.unibe.ch/bela/mb-linux-msli/-/raw/38b8bc2d8ee86a6c4d0d83340decd65566b34139/uClinux-dist/user/tcl/tools/uniParse.tcl?inline=false -O uniParse.tcl
root@d88dd5360920:/# apt install -qy parallel tcl
root@d88dd5360920:/# parallel -- 'mkdir -p out-{}/; tclsh uniParse.tcl {} out-{}/' ::: UnicodeData*.txt
root@d88dd5360920:/# ls -lh out-*.txt/tclUniData.c
-rw-r--r-- 1 root root 37K Jun 18 13:13 out-UnicodeData-2.0.14.txt/tclUniData.c
-rw-r--r-- 1 root root 37K Jun 18 13:13 out-UnicodeData-2.1.2.txt/tclUniData.c
-rw-r--r-- 1 root root 37K Jun 18 13:13 out-UnicodeData-2.1.5.txt/tclUniData.c
-rw-r--r-- 1 root root 37K Jun 18 13:13 out-UnicodeData-2.1.8.txt/tclUniData.c
-rw-r--r-- 1 root root 37K Jun 18 13:13 out-UnicodeData-2.1.9.txt/tclUniData.c
-rw-r--r-- 1 root root 42K Jun 18 13:13 out-UnicodeData-3.0.0.txt/tclUniData.c
-rw-r--r-- 1 root root 51K Jun 18 13:13 out-UnicodeData-3.0.1.txt/tclUniData.c
-rw-r--r-- 1 root root 60K Jun 18 13:13 out-UnicodeData-3.1.0.txt/tclUniData.c
-rw-r--r-- 1 root root 61K Jun 18 13:13 out-UnicodeData-3.2.0.txt/tclUniData.c
-rw-r--r-- 1 root root 63K Jun 18 13:13 out-UnicodeData-4.0.0.txt/tclUniData.c
-rw-r--r-- 1 root root 65K Jun 18 13:13 out-UnicodeData-4.1.0.txt/tclUniData.c
-rw-r--r-- 1 root root 68K Jun 18 13:13 out-UnicodeData-5.0.0.txt/tclUniData.c
-rw-r--r-- 1 root root 71K Jun 18 13:13 out-UnicodeData-5.1.0.txt/tclUniData.c
-rw-r--r-- 1 root root 73K Jun 18 13:13 out-UnicodeData-5.2.0.txt/tclUniData.c
-rw-r--r-- 1 root root 74K Jun 18 13:13 out-UnicodeData-6.0.0.txt/tclUniData.c
-rw-r--r-- 1 root root 75K Jun 18 13:13 out-UnicodeData-6.1.0.txt/tclUniData.c
-rw-r--r-- 1 root root 75K Jun 18 13:13 out-UnicodeData-6.2.0.txt/tclUniData.c
-rw-r--r-- 1 root root 75K Jun 18 13:13 out-UnicodeData-6.3.0.txt/tclUniData.c
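
For readers without GNU parallel, the per-file generation step can probably be approximated with a plain sequential loop. The following is only a sketch: `generate` is a hypothetical stand-in for `tclsh uniParse.tcl <input> <outdir>` so that the example runs without Tcl, and the two empty input files are made up for the demonstration.

```shell
# Hypothetical stand-in for `tclsh uniParse.tcl <input> <outdir>`; swap in
# the real command when Tcl and the script are available.
generate() {
  printf '/* generated from %s */\n' "$(basename "$1")" > "$2/tclUniData.c"
}

# Made-up empty inputs in a scratch directory, just to drive the loop.
tmp=$(mktemp -d)
: > "$tmp/UnicodeData-3.1.0.txt"
: > "$tmp/UnicodeData-3.2.0.txt"

# Sequential counterpart of the parallel invocation: one output
# directory per input file, named after that file.
for f in "$tmp"/UnicodeData*.txt; do
  name=$(basename "$f")
  mkdir -p "$tmp/out-$name"
  generate "$f" "$tmp/out-$name"
done

# Count the generated files.
count=$(ls "$tmp"/out-*/tclUniData.c | wc -l | tr -d ' ')
echo "$count"
rm -rf "$tmp"
```

The loop trades speed for fewer dependencies; with fewer than twenty input files, the difference is unlikely to matter.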

Next, I downloaded the file slnudata.c and normalized the generated files and the file slnudata.c into the same format.

root@d88dd5360920:/# wget --quiet https://github.com/LuaDist/slnunicode/raw/e8abd35c5f0f5a9084442d8665cbc9c3d169b5fd/slnudata.c -O slnudata.c
root@d88dd5360920:/# parallel -- 'sed -ni "/^#define OFFSET_BITS/,\$p" {}' ::: out-*.txt/tclUniData.c slnudata.c
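
The normalization step relies on a sed address range that keeps everything from the first line matching `#define OFFSET_BITS` to the end of the file and drops the varying header above it. Here is a small sketch of that behavior on a made-up sample file; the file contents are hypothetical, and the sketch prints to standard output instead of editing in place with `-i`.

```shell
# Build a tiny stand-in for a generated C file: a header that may differ
# between files, followed by the data section we want to compare.
tmp=$(mktemp -d)
cat > "$tmp/sample.c" <<'EOF'
/* header comment that varies between files */
#include "something.h"
#define OFFSET_BITS 5
static int data[] = { 1, 2, 3 };
EOF

# -n suppresses default printing; the range /re/,$ selects every line
# from the first match through the end of the file; p prints them.
normalized=$(sed -n '/^#define OFFSET_BITS/,$p' "$tmp/sample.c")
printf '%s\n' "$normalized"
rm -rf "$tmp"
```

In the transcript, the `$` is written as `\$` because the sed expression sits inside double quotes within the command string passed to parallel, where an unescaped `$p` would otherwise be expanded by the shell.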

Finally, I compared the normalized generated files with the normalized file slnudata.c.

root@d88dd5360920:/# parallel -- 'diff slnudata.c out-{}/tclUniData.c > diff-{}' ::: UnicodeData*.txt
root@d88dd5360920:/# ls -lh diff-*.txt
-rw-r--r-- 1 root root  80K Jun 18 13:22 diff-UnicodeData-2.0.14.txt
-rw-r--r-- 1 root root  81K Jun 18 13:22 diff-UnicodeData-2.1.2.txt
-rw-r--r-- 1 root root  79K Jun 18 13:22 diff-UnicodeData-2.1.5.txt
-rw-r--r-- 1 root root  79K Jun 18 13:22 diff-UnicodeData-2.1.8.txt
-rw-r--r-- 1 root root  79K Jun 18 13:22 diff-UnicodeData-2.1.9.txt
-rw-r--r-- 1 root root  50K Jun 18 13:22 diff-UnicodeData-3.0.0.txt
-rw-r--r-- 1 root root  42K Jun 18 13:22 diff-UnicodeData-3.0.1.txt
-rw-r--r-- 1 root root 2.7K Jun 18 13:22 diff-UnicodeData-3.1.0.txt
-rw-r--r-- 1 root root  99K Jun 18 13:22 diff-UnicodeData-3.2.0.txt
-rw-r--r-- 1 root root 104K Jun 18 13:22 diff-UnicodeData-4.0.0.txt
-rw-r--r-- 1 root root 107K Jun 18 13:22 diff-UnicodeData-4.1.0.txt
-rw-r--r-- 1 root root 120K Jun 18 13:22 diff-UnicodeData-5.0.0.txt
-rw-r--r-- 1 root root 122K Jun 18 13:22 diff-UnicodeData-5.1.0.txt
-rw-r--r-- 1 root root 125K Jun 18 13:22 diff-UnicodeData-5.2.0.txt
-rw-r--r-- 1 root root 125K Jun 18 13:22 diff-UnicodeData-6.0.0.txt
-rw-r--r-- 1 root root 126K Jun 18 13:22 diff-UnicodeData-6.1.0.txt
-rw-r--r-- 1 root root 126K Jun 18 13:22 diff-UnicodeData-6.2.0.txt
-rw-r--r-- 1 root root 127K Jun 18 13:22 diff-UnicodeData-6.3.0.txt

The normalized generated file for Unicode 3.1 had the smallest diff, which was only due to indentation and the trailing newline at the end of the file:

root@d88dd5360920:/# cat diff-UnicodeData-3.1.0.txt
810,826c810,827
<     29, 2, 23, 11, 1178599554, 24, -507510654, 4194369, 4194434, -834666431, 
<     973078658, -507510719, 1258291330, 880803905, 864026689, 859832385, 
<     331350081, 847249473, 851443777, 868220993, -406847358, 884998209, 
<     876609601, 893386817, 897581121, 914358337, 910164033, 918552641, 
<     5, -234880894, 8388705, 4194499, 8388770, 331350146, -406847423, 
<     -234880959, 880803970, 864026754, 859832450, 847249538, 851443842, 
<     868221058, 876609666, 884998274, 893386882, 897581186, 914358402, 
<     910164098, 918552706, 4, 6, -352321402, 159383617, 155189313, 
<     268435521, 264241217, 159383682, 155189378, 130023554, 268435586, 
<     264241282, 260046978, 239075458, 1, 197132418, 226492546, 360710274, 
<     335544450, -251658175, 402653314, 335544385, 7, 201326657, 201326722, 
<     16, 8, 10, 247464066, -33554302, -33554367, -310378366, -360710014, 
<     -419430270, -536870782, -469761918, -528482174, -33554365, -37748606, 
<     -310378431, -37748669, 155189378, -360710079, -419430335, -29359998, 
<     -469761983, -29360063, -536870847, -528482239, 13, 14, -1463812031, 
<     -801111999, -293601215, 67108938, 67109002, 109051997, 109052061, 
<     18, 17, 8388673, 12582977, 8388738, 12583042
---
>     29, 2, 23, 11, -3116367742, 24, -507510654, 4194369, 4194434, 
>     -834666431, 973078658, -507510719, 1258291330, 880803905, 864026689, 
>     859832385, 331350081, 847249473, 851443777, 868220993, -406847358, 
>     884998209, 876609601, 893386817, 897581121, 914358337, 910164033, 
>     918552641, 5, -234880894, 8388705, 4194499, 8388770, 331350146, 
>     -406847423, -234880959, 880803970, 864026754, 859832450, 847249538, 
>     851443842, 868221058, 876609666, 884998274, 893386882, 897581186, 
>     914358402, 910164098, 918552706, 4, 6, -352321402, 159383617, 
>     155189313, 268435521, 264241217, 159383682, 155189378, 130023554, 
>     268435586, 264241282, 260046978, 239075458, 1, 197132418, 226492546, 
>     360710274, 335544450, -251658175, 402653314, 335544385, 7, 201326657, 
>     201326722, 16, 8, 10, 247464066, -33554302, -33554367, -310378366, 
>     -360710014, -419430270, -536870782, -469761918, -528482174, -33554365, 
>     -37748606, -310378431, -37748669, 30219960450, -360710079, -419430335, 
>     -29359998, -469761983, -29360063, -536870847, -528482239, 13, 
>     14, -31528583103, -35160850367, -34653339583, 67108938, 67109002, 
>     109051997, 109052061, 18, 17, 8388673, 12582977, 8388738, 12583042
>     
877c878
< #define GetDelta(info) (((info) > 0) ? ((info) >> 22) : (~(~((info)) >> 22)))
---
> #define GetDelta(infO) (((info) > 0) ? ((info) >> 22) : (~(~((info)) >> 22)))

The results indicate that Selene Unicode uses the file UnicodeData.txt from Unicode 3.1, released in March 2001.
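
Instead of reading the listing by eye, the smallest diff could also be picked out with a short pipeline. This is a sketch on made-up stand-in files with arbitrary sizes, not the real diffs; byte size is only a rough proxy for similarity, so the winning diff still needs to be inspected by hand as was done above.

```shell
# Hypothetical stand-ins for the diff-*.txt files; the sizes are made up.
tmp=$(mktemp -d)
printf 'aaaaaaaaaa\n'      > "$tmp/diff-UnicodeData-3.0.1.txt"
printf 'aa\n'              > "$tmp/diff-UnicodeData-3.1.0.txt"
printf 'aaaaaaaaaaaaaaa\n' > "$tmp/diff-UnicodeData-3.2.0.txt"

# Print "<bytes> <name>" per file, sort numerically, keep the smallest,
# and cut the name out of the winning line.
smallest=$(
  for f in "$tmp"/diff-*.txt; do
    printf '%s %s\n' "$(wc -c < "$f" | tr -d ' ')" "$(basename "$f")"
  done | sort -n | head -n 1 | cut -d ' ' -f 2
)
echo "$smallest"
rm -rf "$tmp"
```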

However, it is possible that LuaTeX uses an updated version of the file slnudata.c, which can be easily generated from the current file UnicodeData.txt from Unicode 16, released last September, using the script uniParse.tcl.

Witiko avatar Jun 18 '25 11:06 Witiko

@Witiko, I followed along and successfully built a ConTeXt standalone-based Docker image using these packages:

  • markdown-3.11.4
  • l3kernel-2025-07-20-dev
  • l3backend-2025-07-20-dev
  • unicode-data-1.18
  • lua-uni-algos-0.4.1 (patched to hard-code paths so it won't use KPathSea library)
  • lt3luabridge-2.2.2
  • lua-tinyyaml-0.4.4

I've encountered a strange bug, though: for some reason, it writes "output.markdown.in" instead of output.markdown.in and then can't find the file. When you remove the quotes manually, it compiles successfully. I'll proceed with the more complex project to check if everything works as it should and report back.

The biggest problem so far is the time it takes to load expl3-code.tex when compiling. I wonder if it's possible to tree-shake it to reduce the time.

andreiborisov avatar Jul 24 '25 15:07 andreiborisov

lua-uni-algos-0.4.1 (patched to hard-code paths so it won't use KPathSea library)

This dependency will be removed after https://github.com/Witiko/markdown/pull/569.

I've encountered a strange bug, though: for some reason, it writes "output.markdown.in" instead of output.markdown.in and then can't find the file. When you remove the quotes manually, it compiles successfully. I'll proceed with the more complex project to check if everything works as it should and report back.

Recently, in https://github.com/Witiko/markdown/pull/571, we started adding quotes around the filenames to enable files with spaces:

  • https://github.com/Witiko/markdown/blob/30ee1e58c76e20ccefdec1c75a6a0959da2ea4b4/markdown.dtx#L37350-L37363
  • https://github.com/Witiko/markdown/blob/30ee1e58c76e20ccefdec1c75a6a0959da2ea4b4/markdown.dtx#L37574-L37586

I assume that this convention does not fly with LuaMetaTeX? We may as well revert this change, since supporting all different configurations turns out to be an impossible task, as discussed in https://github.com/Witiko/markdown/issues/573, so it may be easier to simply forbid spaces in filenames across the board. This will make the behavior more predictable and also allow us to simplify the code and support LuaMetaTeX.
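
To illustrate the reported symptom at the filesystem level, here is a small sketch. It assumes that the quote characters end up as literal parts of the requested file name, which matches the description above but is not confirmed for LuaMetaTeX in this thread; the scratch directory and variable names are made up for the example.

```shell
# Create the file under its plain name in a scratch directory.
tmp=$(mktemp -d)
: > "$tmp/output.markdown.in"

# One lookup uses the plain name, the other a name that literally
# contains the double-quote characters.
plain="$tmp/output.markdown.in"
quoted="$tmp/\"output.markdown.in\""

if [ -e "$plain" ]; then plain_found=yes; else plain_found=no; fi
if [ -e "$quoted" ]; then quoted_found=yes; else quoted_found=no; fi
echo "plain: $plain_found, quoted: $quoted_found"
rm -rf "$tmp"
```

If a consumer strips the quotes before the lookup, both forms resolve to the same file; if it does not, only the plain name is found, which would be consistent with the manual workaround of removing the quotes.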

The biggest problem so far is the time it takes to load expl3-code.tex when compiling. I wonder if it's possible to tree-shake it to reduce the time.

Not tree-shake it but include it in the ConTeXt formats themselves, like LaTeX does it. This way, it won't actually be loaded on runtime but baked into the format. I discuss this in https://github.com/Witiko/markdown/pull/557#issuecomment-2948577642 in more detail.

Witiko avatar Jul 24 '25 18:07 Witiko

We may as well revert this change, since supporting all different configurations turns out to be an impossible task, as discussed in #573, so it may be easier to simply forbid spaces in filenames across the board. This will make the behavior more predictable and also allow us to simplify the code and support LuaMetaTeX.

Since issues with spaces in TeX in general are par for the course, I think it's a reasonable suggestion.

Not tree-shake it but include it in the ConTeXt formats themselves, like LaTeX does it. This way, it won't actually be loaded on runtime but baked into the format. I discuss this in #557 (comment) in more detail.

ConTeXt has some notion of format, which you can prebuild to speed up compilation, but I haven't found documentation about it yet.

I've posted on the mailing lists to get some guidance on this.

andreiborisov avatar Jul 27 '25 14:07 andreiborisov

Recently, in #571, we started adding quotes around the filenames to enable files with spaces. I assume that this convention does not fly with LuaMetaTeX?

Yep, I can confirm that using the 3.11.2 version of the Markdown package resolves the issue.

andreiborisov avatar Jul 27 '25 16:07 andreiborisov

Yep, I can confirm that using the 3.11.2 version of the Markdown package resolves the issue.

Thanks for the confirmation, PR https://github.com/Witiko/markdown/pull/582 should fix the issue, then.

Not tree-shake it but include it in the ConTeXt formats themselves, like LaTeX does it. This way, it won't actually be loaded on runtime but baked into the format. I discuss this in #557 (comment) in more detail.

ConTeXt has some notion of format, which you can prebuild to speed up compilation, but I haven't found documentation about it yet.

I've posted on the mailing lists to get some guidance on this.

Have you got any response? I also posted to the mailing lists earlier this year, see https://github.com/Witiko/markdown/pull/557#issuecomment-2948577642, but got no response.

Regardless, I would not submit any further posts. Otherwise, someone might get annoyed with us. I am sure we can figure it out on our own; it should be no more difficult than placing \input expl3-generic somewhere in the format files and running a command to regenerate the formats. Projects like @gucci-on-fleek's context-packaging can serve as a reference.

Witiko avatar Aug 14 '25 17:08 Witiko

@Witiko

Have you got any response?

https://mailman.ntg.nl/archives/list/[email protected]/thread/PGL63EF2LRXZYU44EVDMPGVS5KT4LN3G/

gucci-on-fleek avatar Aug 14 '25 23:08 gucci-on-fleek

Thanks, @gucci-on-fleek. Unlike on the dev-context list, there seems to be quite a discussion on the ntg-context list. I wasn't subscribed to that list but I am now.

I see general pushback, with expl3 being seen as against the spirit of ConTeXt and more general solutions like Pandoc being recommended. Adding expl3-generic to a ConTeXt format is seen as a poor idea, because it may break ConTeXt internals. Instead, the Markdown package module for ConTeXt should apparently drop the expl3 requirement and use the Lua interface directly, like it did before v2.13.0.

Anyone can write a barebones example file for ConTeXt that uses the Lua interface directly and circumvents the plain TeX + expl3 interface, similar to the example file for OpTeX and your example file for ConTeXt and an earlier version of the Markdown package, but I am not really interested in doing that. I believe that expl3 is a significant step forward in terms of TeX programming whose impact goes beyond just LaTeX, and significant effort has been made towards making it format-agnostic. Hans and others may think that expl3 doesn't belong in ConTeXt, but that doesn't automatically make it true; that's the beauty of free and open-source software.

Witiko avatar Aug 15 '25 08:08 Witiko

Note that most of the load time for expl3 is reading the Unicode files - in Lua(Meta)TeX that could likely be made a lot faster if the data were stored directly in Lua, but the work would need to be done - and for LaTeX that's not a priority as there is no gain. If there is a real need to, I can look at it but I'm not sure how many users are really impacted.

josephwright avatar Aug 15 '25 08:08 josephwright