Lua 5.4 support

Open fstirlitz opened this issue 6 years ago • 7 comments

This issue tracks all changes required to support parsing Lua 5.4 code.

While PUC-Rio hasn't officially released Lua 5.4 yet, it already appears it is going to bring at least one syntactic innovation: attributes in local declarations, to implement immutable bindings and 'to-be-closed variables', i.e. a form of lexical cleanup/with statement/RAII. The EBNF is as follows:

	stat ::= local ‘<’ Name ‘>’ Name ‘=’ exp

I assume this particular syntax will change, as the current one does not permit declaring multiple attributes simultaneously, and there is no syntax for combining attributes with the local function construct.

fstirlitz, Aug 02 '19 12:08

Hmm, there is one way to have a variable be close and const simultaneously…

local x <close> = ...
local x <const> = x

Kind of verbose, but works.

(The grammar was changed so that the attribute follows the identifier.)

fstirlitz, Oct 05 '19 06:10
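For reference, the grammar as published in the Lua 5.4 manual is `stat ::= local attnamelist ['=' explist]`, with `attnamelist ::= Name attrib {',' Name attrib}` and `attrib ::= ['<' Name '>']`. A minimal sketch of reading such a name list is shown below. It is not luaparse's implementation: `parseAttNameList` is a hypothetical helper, the tokenizer is a simplification that only recognizes names and the four punctuation marks involved, and reserved words are not rejected as variable names.

```javascript
// Hypothetical helper (not part of luaparse): read the name list of a
// Lua 5.4 `local` statement and return each name with its attribute.
//   attnamelist ::= Name attrib {',' Name attrib}
//   attrib      ::= ['<' Name '>']
function parseAttNameList(source) {
  // Simplified tokenizer: names plus the punctuation this sketch needs.
  const tokens = source.match(/[A-Za-z_][A-Za-z0-9_]*|[<>,=]/g) || [];
  const isName = (token) => token !== undefined && /^[A-Za-z_]/.test(token);
  let pos = 0;

  if (tokens[pos++] !== 'local') {
    throw new SyntaxError("expected 'local'");
  }

  const variables = [];
  for (;;) {
    const name = tokens[pos++];
    if (!isName(name)) {
      throw new SyntaxError('expected a variable name');
    }

    // The attribute is optional and follows the identifier.
    let attribute = null;
    if (tokens[pos] === '<') {
      pos++;
      attribute = tokens[pos++];
      if (!isName(attribute)) {
        throw new SyntaxError('expected an attribute name');
      }
      if (tokens[pos++] !== '>') {
        throw new SyntaxError("expected '>' after attribute name");
      }
    }
    variables.push({ name, attribute });

    // Continue only while the list is extended with a comma.
    if (tokens[pos] !== ',') {
      break;
    }
    pos++;
  }
  return variables;
}
```

Calling `parseAttNameList('local x <close> = ...')` yields a single entry with `name` set to `'x'` and `attribute` set to `'close'`; everything after the `=` is ignored by this sketch.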

Never mind, <close> is going to imply <const> anyway. So multiple attributes are a non-issue for now.

Another change is that labels from inner scopes will no longer be allowed to shadow those from outer scopes. So this:

do ::x:: do ::x:: end end

should fail to parse where it succeeded before. I'm not sure I like this change.

fstirlitz, Oct 22 '19 20:10
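The label rule described above could be checked along these lines. This is a sketch under stated assumptions, not luaparse's code: `LabelScopes` and its methods are hypothetical names, the caller is assumed to call `enterBlock` before declaring any label, and the check stops at function bodies because labels are not visible inside nested functions.

```javascript
// Hypothetical scope tracker (not part of luaparse) for label declarations.
// allowShadowing = true  models Lua 5.2/5.3: only the innermost block is checked.
// allowShadowing = false models Lua 5.4: every enclosing block of the same
// function is checked as well.
class LabelScopes {
  constructor(allowShadowing) {
    this.allowShadowing = allowShadowing;
    this.scopes = [];
  }

  // Call when the parser enters a block; pass true for a function body or chunk.
  enterBlock(isFunctionBody = false) {
    this.scopes.push({ labels: new Set(), isFunctionBody });
  }

  leaveBlock() {
    this.scopes.pop();
  }

  // Call when the parser sees `::name::`. Requires at least one open block.
  declare(name) {
    for (let i = this.scopes.length - 1; i >= 0; i--) {
      const scope = this.scopes[i];
      if (scope.labels.has(name)) {
        throw new SyntaxError(`label '${name}' already defined`);
      }
      // Stop after the innermost block when shadowing is allowed, and never
      // look past the enclosing function body.
      if (this.allowShadowing || scope.isFunctionBody) {
        break;
      }
    }
    this.scopes[this.scopes.length - 1].labels.add(name);
  }
}
```

For `do ::x:: do ::x:: end end`, the second `declare('x')` succeeds when the tracker is created with `allowShadowing` set to `true` and throws a `SyntaxError` when it is `false`.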

Another change is that UTF-8 escapes now allow specifying code points up to U+7FFFFFFF, as in the 1993 definition of UTF-8 instead of the modern one. (Surrogate code points were already allowed in 5.3, so the UTF-8 encoding was really WTF-8 anyway. But I happened to allow those already as well.)

fstirlitz, Jan 16 '20 05:01

> Another change is that UTF-8 escapes now allow specifying code points up to U+7FFFFFFF, as in the 1993 definition of UTF-8 instead of the modern one.

How does that work in practice, given that Unicode's code point range ends at U+10FFFF? What does an escape sequence for U+7FFFFFFF evaluate to? U+FFFD?

mathiasbynens, Jan 16 '20 05:01

"\u{7FFFFFFF}" is simply the same string as "\xFD\xBF\xBF\xBF\xBF\xBF" in Lua 5.4. As for how it's supposed to be interpreted as a Unicode string, that is another matter; #68 has more details.

fstirlitz, Jan 16 '20 06:01
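The byte sequence quoted above follows from the original UTF-8 scheme, which uses up to six bytes per code point. A minimal sketch of that encoding is given below; `encodeExtendedUtf8` is a hypothetical helper written for illustration, not a function from luaparse or Lua, and it deliberately performs no surrogate or U+10FFFF checks.

```javascript
// Hypothetical helper: encode a code point in the range 0..0x7FFFFFFF using
// the original (up to 6-byte) UTF-8 scheme. Returns an array of byte values.
function encodeExtendedUtf8(codePoint) {
  if (!Number.isInteger(codePoint) || codePoint < 0 || codePoint > 0x7FFFFFFF) {
    throw new RangeError('code point out of range');
  }
  if (codePoint < 0x80) {
    return [codePoint]; // single byte, identical to ASCII
  }

  // Exclusive upper limits for sequences of 2, 3, 4, 5 and 6 bytes.
  const limits = [0x800, 0x10000, 0x200000, 0x4000000, 0x80000000];
  let continuationCount = 1;
  while (codePoint >= limits[continuationCount - 1]) {
    continuationCount++;
  }

  // Lead byte: as many leading 1 bits as the sequence has bytes, then the
  // most significant payload bits.
  const totalBytes = continuationCount + 1;
  const prefix = (0xFF << (8 - totalBytes)) & 0xFF;
  const bytes = [prefix | (codePoint >> (6 * continuationCount))];

  // Continuation bytes: 10xxxxxx, six payload bits each, high bits first.
  for (let i = continuationCount - 1; i >= 0; i--) {
    bytes.push(0x80 | ((codePoint >> (6 * i)) & 0x3F));
  }
  return bytes;
}
```

With this helper, `encodeExtendedUtf8(0x7FFFFFFF)` produces the bytes FD BF BF BF BF BF, matching the string given above.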

I have added support for Lua 5.4 local attributes in my fork, though I don't feel confident making a pull request (I haven't run the tests, nor am I certain my implementation is the best way to handle this). The commit is here, in case it is of any interest:

https://github.com/TwoLivesLeft/luaparse/commit/1e88e5d2c04c83c5bc4a967c6002cc2f38a396e1

simsaens, Jan 21 '21 14:01

@TwoLivesLeft No need for that, probably. I have the code already written in my private branch; all that remains is to write the tests.

fstirlitz, May 10 '21 18:05