godot icon indicating copy to clipboard operation
godot copied to clipboard

GDShader preprocessor evaluates weird functions in `#if` condition expressions

Open geekley opened this issue 1 year ago • 3 comments

Tested versions

  • Reproducible in v4.3.stable.flathub [77dcf97d82cbfe4e4615475fa52ca03da645dbd8]

System information

Godot v4.3.stable (77dcf97d8) - Freedesktop SDK 23.08 (Flatpak runtime) - X11 - Vulkan (Forward+) - integrated Intel(R) HD Graphics 5500 (BDW GT2) - Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz (4 Threads)

Issue description

In GDShader #if condition expressions, you can use several operators (more than I expected it to allow) and even functions (e.g. math like sqrt, sin), which will be evaluated in the preprocessor. Which ones are undocumented.

At first I expected those to at least match the GDShader functions and operator syntax. But I found it's not always the case (e.g. ternary ? : doesn't work, inversesqrt(...) doesn't either). After some more tests (see https://github.com/godotengine/godot/issues/96253#issuecomment-2318882070 for background), I found it's accepting functions in @GlobalScope as well as some constructors like bool etc. It allows even functions very weird to allow, like randf, instance_from_id and rid_from_int64. Allowing even stuff like bytes_to_var_with_objects seems like it could be particularly dangerous (ACE risk?). But I don't know for sure to which extent it allows functions with side-effects and whether it affects the editor.

In any case, I'm surprised things like functions work at all. I was expecting that dealing with integers and booleans would be enough for a preprocessor. In fact, if the intention is to match C/GLSL, then GDShader is doing way more than it should in the preprocessor #if directives. Not even C deals with float or boolean constants in the preprocessor, let alone math functions. GLSL doesn't either. They do handle integers without the u suffix (so no uint support). C does handle arbitrary names, treating them like 0 (even true is treated like 0), but GLSL doesn't allow them on #if. GDShader allows most literals (bool, int_decimal, int_hex, uint_decimal, float), except for uint_hex like 0x0u.

Comments from @pirey0:

7 - function calls in #if directives: Really interesting stuff! Looks like it could be very useful and at the same time very dangerous. Again probably a topic for a rendering meeting (unless this was already discussed in the past.)

I brought up this Issue in the 2024-09-03 Rendering Meeting, these are the key points:

  • Aim for glsl-like behavior, so that shaders can be ported over very easily
  • Do not bloat the preprocessor codebase
  • May need to scope back some of the eval() stuff in #if directives

IMO, GDShader should match GLSL, and be as safe as possible:

  • only deterministic behavior must be allowed
  • no side-effects
  • no preprocessor functions, only macros
  • no dealing with float at all; no boolean literals either
    • there's no need to extrapolate C/GLSL behavior
  • keep current GLSL-like (not C-like) behavior of not expanding undefined macro names as 0
    • so even though true isn't a keyword, it won't expand to 0 by default; raise "undefined macro" error instead
  • only integer literals should be allowed, as well as macros that evaluate to integers;
    • like expected in C and GLSL, have 0 mean false and non-zero mean true when on the context of boolean operators (like && and ||) and the final #if expression boolean result
  • no arbitrary expressions, only allow:
    • int32 literals (decimal and hex) without u suffix
    • parentheses
    • macro expansions (simple and function-like calls)
    • logical and arithmetic operators to deal with integers as said above
    • the defined(macro_name) (also defined macro_name) preprocessor keyword is the only "function" allowed

A lot of these behaviors are explicitly defined in the GLSL ES 3.00 spec pages 13~14.


Moved from #96253 (sub-issue 7).

Steps to reproduce

bug-global-fn-evaluation.gdshader

shader_type spatial;

#if (sin(0) + typeof(print(rid_allocate_id())) - typeof(print(bytes_to_var_with_objects([]))))
// can typing here have editor side-effects? it prints to console at least...
// bytes_to_var_with_objects = arbitrary code execution risk? no idea
code, etc
#endif

#if randf() < 0.5
I have heard of non deterministic compilation but this is crazy
// 50% chance of this code being included every time I type
#endif

$ // error on purpose to log preprocessor output

Minimal reproduction project (MRP)

N/A

geekley avatar Sep 03 '24 00:09 geekley

I didn't check, but I assume current implementation may be using something like Expression? I believe #93822 may be useful for this, if I understand correctly?

geekley avatar Sep 21 '24 21:09 geekley

Another issue related to #if expression evaluation (that would need fixing in the new implementation) is shown here. The operators && and || should be short-circuit operators that may not evaluate the next part. However, macro names that are not defined still generate an error even if trying to precede them with a defined guard.

bug-should-short-circuit.gdshader

//shader_type spatial;

#if defined(foo) && foo
const int one = 1;
#endif

#if !defined(foo) || foo
const int two = 2;
#endif

cd /tmp
nano a.c # edit the files
cat a.c > a.frag; echo '=== GLSL ==='; glslang -E a.frag; echo '=== C ==='; gcc -E a.c

GLSL (and C of course) allow it, deactivating the first #if branch, resulting in:

const int two = 2;

GDShader raises "Condition evaluation error" on both #if directives. It should match GLSL behavior instead.

You can tell the operators have short-circuit semantics because without the preceding defined guards, GLSL (which unlike C doesn't interpret undefined macros as 0) complains: 'preprocessor evaluation' : undefined macro in expression not allowed in es profile foo

But note that it seems only parameter-less macros are "syntactically" allowed after expansion. If the right part is foo(x) but a foo macro is not defined, C and GLSL both raise a "syntax" error on the expression: GLSL: '#if' : unexpected tokens following directive C: error: missing binary operator before token "("

So this is what I think GLSL is doing:

  1. "expand" defined(...) keywords first (e.g. to " 0 " or " 1 ")
  2. then expand macros that are defined (with or without parameters)
  3. parse expression syntax; abort on syntax error
    • remaining identifiers represent undefined macros, so they're allowed syntax
  4. evaluate expression tree, respecting short-circuit order, parentheses and operator precedence
    • if short-circuit operators don't exit early, identifiers raise evaluation error (abort in this case)
  5. use the resulting value (as a boolean) to perform the branching

geekley avatar Sep 25 '24 22:09 geekley

Just found out about the defined macro_name syntax (without parentheses). It's supported (like in C and GLSL), except it's a bit buggy in the current implementation.

//shader_type spatial;
#if !defined apple && defined banana
const int zero = 0;
#elif !defined apple && !defined banana
const int one = 1;
#endif

C and GLSL accept it just fine. GDShader raises "Condition evaluation error".

geekley avatar Oct 03 '24 00:10 geekley