Lua: let List constructor split string argument on whitespace
Describe your proposed improvement and the problem it solves.
A little more DWIMery in the Lua API: let the List constructor when given a string as argument split that string on whitespace, so that for example List("foo bar baz") returns a three-element list {"foo", "bar", "baz"}.
This is useful in the (to me at least) common case where you want to create a list of classes, or when you want to loop over a set list of strings, e.g. keys of metadata fields to validate.
I'm the first to admit that this is very lazy,[^1] but the class field in a table passed to pandoc.Attr already works like this.
Describe alternatives you've considered.
pandoc.Attr({class = string}).classes
works so-so if you already have a string in a variable.
Currently I'm using a function built around string:gmatch('%S+'), but once you add the check for whether the argument is already table-ish and the loop around gmatch,[^2] that's quite a bit of boilerplate in almost every filter I write. (I do have my own utilities library, but when I might share the filter with others I always end up copying the functions I use into the filter file!)
[^1]: I also admit that I'm missing Perl's @array = qw/foo bar baz/ operator and @array = $string =~ /\S+/g construct!
[^2]:

```lua
helper.to_list = function(val, pat)
  -- Anything that isn't already a table is stringified and split.
  if 'table' ~= type(val) then
    local str = tostring(val)
    pat = tostring(pat or '%S+')  -- default: runs of non-whitespace
    val = { }
    for s in str:gmatch(pat) do
      val[#val + 1] = s
    end
  end
  return pandoc.List(val)
end
```

As you can see this function does a bit more in that it allows a custom pattern, but I'm not asking for that!
I wouldn't want to modify pandoc.List, but I'd be open to adding a pandoc.text.split function, e.g., by wrapping Data.Text.splitOn.
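For comparison, here is a rough plain-Lua sketch of what such a split could do. `split_on` is a hypothetical name, not an existing pandoc function, and its behaviour is only modelled on Data.Text.splitOn as I understand it (literal separator, empty fields kept). Note that this differs from splitting on runs of whitespace with `%S+`:

```lua
-- Hypothetical helper, not part of pandoc: split `str` on the literal
-- separator `sep`, keeping empty fields (assumed splitOn-like semantics).
local function split_on(sep, str)
  assert(sep ~= "", "separator must not be empty")
  local parts, start = {}, 1
  while true do
    -- The fourth argument `true` makes this a plain (non-pattern) search.
    local i, j = str:find(sep, start, true)
    if not i then
      parts[#parts + 1] = str:sub(start)
      break
    end
    parts[#parts + 1] = str:sub(start, i - 1)
    start = j + 1
  end
  return parts
end

-- Two spaces between "bar" and "baz" yield an empty field.
local parts = split_on(" ", "foo bar  baz")
-- parts is {"foo", "bar", "", "baz"}
```

So a literal split would not be a drop-in replacement for the whitespace-splitting behaviour requested above; callers would have to filter out empty strings themselves.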
Alternative: we could also make it easier to turn the gmatch iterator into a list:
pandoc.List.from_iterator(str:gmatch '%S+')
That could also be used with other iterators like io.lines, file:lines, pairs, etc.
local my_keys = pandoc.List.from_iterator(pairs(my_table))
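A minimal sketch of how such a function might behave, written in plain Lua so it runs without pandoc. The name `from_iterator` follows the suggestion above but is hypothetical here, and the sketch returns a plain table where the real thing would presumably return a pandoc.List. It assumes that only the first value of each iteration step is kept:

```lua
-- Hypothetical sketch: collect the first value produced by each step of a
-- generic-for iterator triple (function, state, control) into a table.
local function from_iterator(iter, state, control)
  local result = {}
  for value in iter, state, control do
    result[#result + 1] = value
  end
  return result
end

-- gmatch returns a single iterator function.
local words = from_iterator(("foo bar baz"):gmatch("%S+"))

-- pairs returns next, the table, and nil; only the keys are collected,
-- and their order is unspecified.
local keys = from_iterator(pairs({ title = 1, author = 2 }))
```

Because `pairs` yields key and value, keeping only the first value gives the keys, which matches the `my_keys` example.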
We could probably overload pandoc.List for extra convenience.
pandoc.List(str:gmatch '%S+')
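Roughly, the overload would only need to look at the type of the first argument. The following is a plain-Lua sketch under that assumption; `make_list` is a hypothetical stand-in for the constructor and returns a plain table. Rejecting strings with an error is my own choice for the sketch, not something decided in this thread:

```lua
-- Hypothetical stand-in for an overloaded list constructor.
local function make_list(arg, state, control)
  if type(arg) == "function" then
    -- Iterator: collect the first value of each step.
    local result = {}
    for value in arg, state, control do
      result[#result + 1] = value
    end
    return result
  elseif type(arg) == "table" then
    return arg          -- tables are used as they are
  elseif arg == nil then
    return {}           -- no argument: empty list
  end
  error("expected a table, an iterator function, or nil, got " .. type(arg))
end

local classes = make_list(("foo bar baz"):gmatch("%S+"))
```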
Great! 👍 The Penlight List class constructor does something similar by calling its iterator generator on anything which isn't a table. That can actually be nasty since a string becomes a list of bytes, which is easy to forget. Taking an iterator as argument is much better!