garrysmod-issues

CompileString/RunString/Script errors upon a NUL char

Open Cheatoid opened this issue 4 years ago • 7 comments

Details

As the title says.

Steps to reproduce

  1. Download the example script.
  2. Run the script either via CompileString/CompileFile/RunString/lua_openscript...
  3. Observe the script error: unfinished long string near '<eof>'

My hypothesis is that this happens because the C APIs (luashared/engine/luajit) pass code around as const char* all over the place, and gmod ain't compiled with the UNICODE switch, so the "code" ends up terminated at the first NUL char.
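A minimal C sketch of the hypothesis above. This is not gmod or LuaJIT code, and the helper names are made up for illustration. It shows that a length taken with strlen stops at the first NUL, so the chunk is cut off inside the [[ long string, while an explicit byte count covers the closing ]]. That would match the "unfinished long string near '<eof>'" error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Example chunk with a raw NUL inside a long string.
 * 'c' is not an octal digit, so the C escape \0 ends right there. */
const char EXAMPLE_CHUNK[] = "return [[ab\0cd]]";

/* Length as seen by code that only gets a const char* and no size. */
size_t length_via_strlen(const char *code) {
    assert(code != NULL);
    return strlen(code);
}

/* Returns 1 if a closing "]]" appears within the first len bytes, else 0. */
int long_string_is_closed(const char *code, size_t len) {
    assert(code != NULL);
    for (size_t i = 0; i + 1 < len; i++) {
        if (code[i] == ']' && code[i + 1] == ']') {
            return 1;
        }
    }
    return 0;
}
```

With EXAMPLE_CHUNK, sizeof(EXAMPLE_CHUNK) - 1 gives the real size, whereas length_via_strlen only counts the bytes before the NUL, and the closing ]] is found only when the real size is used.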

This is a pretty annoying issue for me because I want to embed binary data within code without having to escape NULs as \000. (The whole point is to keep the file size as small as possible. Escaping NULs would hurt file size and defeat the point, as it inserts 3 extra bytes for every NUL escape.)

The contents of the example script: (image)

Cheatoid, Feb 05 '21 13:02

This is actually to prevent bytecode from being loaded. It's a bad hack that needs to be solved properly.

Kefta, Feb 05 '21 15:02

> This is actually to prevent bytecode from being loaded

Nah, it's just not handled properly. At least with CompileString, proper bytecode still won't load; I have tested it.

robotboy655, Feb 05 '21 16:02

...I thought bytecode-loading was completely stripped rather than hacked-away... 😆

In my testing, whichever wrapper you use (RunString/CompileString/lua_openscript/whatever), it all boils down to one function: lua_load (or rather luaL_loadbufferx/luaL_loadbuffer, which is sometimes inlined...)

The clue is in the docs: https://www.lua.org/manual/5.4/manual.html#lua_load

> lua_load automatically detects whether the chunk is text or binary and loads it accordingly. The string mode works as in function load, with the addition that a NULL value is equivalent to the string "bt".

The value of mode always seems to be NULL (0) in gmod, which means it will try to detect the mode automatically. So... https://github.com/LuaJIT/LuaJIT/blob/ec6edc5c39c25e4eb3fca51b753f9995e97215da/src/lj_load.c#L37 Get rid of the automatic mode detection? Fully strip the bytecode-loading capability? And possibly always pass the string "t" as the mode parameter (notice: no b)? (Not sure if any of this is the correct approach, just throwing some ideas out; perhaps do all of them? 😄)

Then it is just a matter of providing the correct value to the sz parameter to fix this issue, I believe. Thanks
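The mode rule quoted from the manual can be modelled in a few lines of C. This is a simplified sketch, not LuaJIT's actual lj_load.c logic, and both helper names are hypothetical. It assumes a binary chunk is recognised by its first byte being ESC (0x1b), the first character of the chunk signature. The helpers take an explicit sz, so a NUL inside a text chunk does not shorten it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: precompiled chunks start with ESC, the first signature byte. */
#define BINARY_CHUNK_FIRST_BYTE 0x1b

/* Hypothetical helper: 1 if the buffer looks like a binary chunk, else 0. */
int chunk_looks_binary(const char *buf, size_t sz) {
    assert(buf != NULL || sz == 0);
    return sz > 0 && (unsigned char)buf[0] == BINARY_CHUNK_FIRST_BYTE;
}

/* Hypothetical helper modelling the documented rule: a NULL mode behaves
 * like "bt"; otherwise the mode string must contain 'b' for a binary chunk
 * or 't' for a text chunk. Returns 1 if the chunk would be accepted. */
int mode_allows_chunk(const char *mode, const char *buf, size_t sz) {
    char needed = chunk_looks_binary(buf, sz) ? 'b' : 't';
    if (mode == NULL) {
        return 1; /* same as "bt": both kinds accepted */
    }
    return strchr(mode, needed) != NULL;
}
```

Under this model, passing "t" rejects anything that starts with ESC while still accepting text that contains NUL bytes, as long as the caller supplies the real size instead of relying on a terminator.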

Cheatoid, Feb 05 '21 17:02

As a temporary solution, you can use '\0' which is 2 bytes instead of 1

ThatLing, Feb 13 '21 18:02

Rather, that's the actual solution.

thegrb93, Feb 13 '21 18:02

Smarty... Think again. It is not 2 bytes; I have to use 4 bytes, because what if there are digits right after the \0? Guess what: they will be treated as part of the escape. Example: ..... \04rekt ..... The only time you can use just \0 is when you know there are no digits right after it. Hence I'd have to write \000 to ensure proper NUL escaping, which means 3 extra bytes per NUL escape. Simple math.
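The escaping rule described above can be sketched in C as follows. The helper name is hypothetical and this is just one way a post-processor might pick the escape: it emits the 2-byte \0 when that is safe and falls back to the 4-byte \000 when the next data byte is a decimal digit.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: writes a Lua escape for one NUL byte into out
 * (room for 4 chars required, no terminator is written) and returns the
 * number of chars written. has_next is 0 when the NUL is the last byte. */
size_t write_nul_escape(char *out, int has_next, unsigned char next) {
    assert(out != NULL);
    out[0] = '\\';
    out[1] = '0';
    if (has_next && isdigit(next)) {
        /* "\0" followed by '4' would be read as "\04" (byte 4),
         * so pad the escape to the full three digits. */
        out[2] = '0';
        out[3] = '0';
        return 4;
    }
    return 2;
}
```

So a NUL followed by "rekt" costs 1 extra byte, while a NUL followed by "4rekt" costs 3 extra bytes. Always writing \000 is simpler but pays the 3 extra bytes every time.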

Cheatoid, Feb 13 '21 19:02

Moreover, you can't use escapes in a long string [[ long string ]] at all. Instead you must use a normal double/single-quoted string to embed such binary data, which means you'd also have to escape these: " \ \n \r ... which is terrible (each of those escapes adds an extra byte cuz you're inserting a \), but it is a temporary solution.
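To get a feel for the size cost of the quoted-string workaround, here is a rough C sketch that computes how many bytes a blob takes once written as the body of a double-quoted Lua string. The function name is hypothetical, and it only accounts for the characters listed above (NUL, ", \, newline, carriage return); everything else is assumed to be written raw.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: size in bytes of data when written as the body of a
 * double-quoted Lua string literal (the surrounding quotes are not counted). */
size_t quoted_body_size(const unsigned char *data, size_t len) {
    size_t total = 0;
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = data[i];
        if (c == 0) {
            /* \0 is only safe when no decimal digit follows; else \000. */
            int digit_follows = (i + 1 < len) && isdigit(data[i + 1]);
            total += digit_follows ? 4 : 2;
        } else if (c == '"' || c == '\\' || c == '\n' || c == '\r') {
            total += 2; /* backslash plus one character */
        } else {
            total += 1; /* written as-is */
        }
    }
    return total;
}
```

Subtracting len from the result gives the overhead, so data with many NULs, newlines or quotes grows noticeably compared to streaming the raw bytes.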


This isn't a really huge issue in my particular case, but it is still really annoying (mainly because I have to run a post-processor to escape the above-mentioned chars, instead of just streaming the binary data into a file). For someone else with many NULs/newlines/quotes in their binary data, the temporary solution can rapidly bump up the size.


I have thought this through entirely and I am currently using the temporary solution which I've explained here, so there's no point in throwing useless ideas in here. Thanks.
P.S. The binary data in my particular case is an LZMA-compressed string (see util.Compress), just in case you wanted to know...
P.P.S. Another solution one may be able to use is to write the binary data into a file and then file.Read it. (However, I can't do that in my case cuz I am targeting the Starfall environment here...)

Cheatoid, Feb 13 '21 20:02