Hangs in glsl_test and glsl_compiler

Open behdad opened this issue 10 years ago • 7 comments

When I build and run "glsl_test tests/", or run glsl_compiler on my own shader, the program keeps running and never finishes.

Sample:

$ ./glsl_test tests/

** running vertex tests for OpenGL ES 2.0...

top shows that glsl_test is consuming 100% CPU, but never returns. Memory usage is stable. Same with glsl_compiler.

behdad avatar Jan 14 '14 07:01 behdad

Here's a sample backtrace from glsl_test:

behdad:glsl-optimizer 130 (master)$ gdb --args ./glsl_test tests/
GNU gdb (GDB) 7.6-gg23
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux".
<http://go/gdb-home FAQ: http://go/gdb-faq Email: gdb-team IRC: gdb>
Reading symbols from /home/behdad/src/github/glsl-optimizer/./glsl_test...done.
(gdb) r
Starting program: /home/behdad/src/github/glsl-optimizer/./glsl_test tests/

** running vertex tests for OpenGL ES 2.0...
^C
Program received signal SIGINT, Interrupt.
ir_swizzle::accept (this=0x956268, v=0x7fffffffdbb8) at /home/behdad/src/github/glsl-optimizer/src/glsl/ir_hv_accept.cpp:226
226 {
(gdb) bt
#0 ir_swizzle::accept (this=0x956268, v=0x7fffffffdbb8) at /home/behdad/src/github/glsl-optimizer/src/glsl/ir_hv_accept.cpp:226
#1 0x000000000044d51d in ir_assignment::accept (this=0x957248, v=0x7fffffffdbb8) at /home/behdad/src/github/glsl-optimizer/src/glsl/ir_hv_accept.cpp:291
#2 0x000000000044d06d in visit_list_elements (v=0x7fffffffdbb8, l=, statement_list=true) at /home/behdad/src/github/glsl-optimizer/src/glsl/ir_hv_accept.cpp:56
#3 0x000000000044d169 in ir_function_signature::accept (this=0x9455b8, v=0x7fffffffdbb8) at /home/behdad/src/github/glsl-optimizer/src/glsl/ir_hv_accept.cpp:116
#4 0x000000000044d06d in visit_list_elements (v=0x7fffffffdbb8, l=, statement_list=false) at /home/behdad/src/github/glsl-optimizer/src/glsl/ir_hv_accept.cpp:56
#5 0x000000000044d1b9 in ir_function::accept (this=0x945508, v=0x7fffffffdbb8) at /home/behdad/src/github/glsl-optimizer/src/glsl/ir_hv_accept.cpp:128
#6 0x000000000044d06d in visit_list_elements (v=0x7fffffffdbb8, l=, statement_list=true) at /home/behdad/src/github/glsl-optimizer/src/glsl/ir_hv_accept.cpp:56
#7 0x0000000000462185 in do_algebraic (instructions=0x934028) at /home/behdad/src/github/glsl-optimizer/src/glsl/opt_algebraic.cpp:500
#8 0x000000000041f9f4 in do_optimization_passes (linked=true, ir=0x934028, state=, mem_ctx=) at /home/behdad/src/github/glsl-optimizer/src/glsl/glsl_optimizer.cpp:328
#9 glslopt_optimize (ctx=, type=, shaderSource=0x6cc0c8 "\nattribute highp vec4 _glesVertex;\n\nattribute mediump vec3 _glesNormal;\n\nattribute highp vec4 _glesMultiTexCoord0;\n\nattribute highp vec4 _glesMultiTexCoord1;\n\nattribute lowp vec4 _glesColor;\nuniform m"..., options=) at /home/behdad/src/github/glsl-optimizer/src/glsl/glsl_optimizer.cpp:442
#10 0x000000000041da11 in TestFile (gles=, outputPath="tests//vertex/opt-matrix-transpose-mul-outES.txt", hirPath="tests//vertex/opt-matrix-transpose-mul-irES.txt", inputPath="tests//vertex/opt-matrix-transpose-mul-inES.txt", testName="opt-matrix-transpose-mul-inES.txt", vertex=, ctx=0x6b5490, doCheckGLSL=) at /home/behdad/src/github/glsl-optimizer/tests/glsl_optimizer_tests.cpp:407
#11 main (argc=, argv=) at /home/behdad/src/github/glsl-optimizer/tests/glsl_optimizer_tests.cpp:523

behdad avatar Jan 14 '14 07:01 behdad

I am getting this same hanging behavior in Linux with the current HEAD of master. The odd thing is that through bisecting, I see successful test behavior in changeset 8828115, which was committed October 9, 2014, 10 months after this issue was submitted. So something was broken, then fixed again, then broken again?

neomantra avatar Mar 18 '15 19:03 neomantra

I have explored this further. All of this is with NDEBUG undefined. Tests pass in debug builds (-O0 and no NDEBUG) and also at -O1. With -Os, the project's current release config, they hang, similar to @behdad's report. With -O2, I get a segfault. I see this with gcc 4.6.3 on Ubuntu 12.04 and gcc 4.8.2 on Ubuntu 14.04.

I instrumented glsl to print the names of the inputs it was failing on. It failed on 87 of them, but not all. I looked at a simple one, tests/vertex/zun-Test_CgNormals-in.txt. While experimenting with it, I found I could get it to pass by changing this:

mat3 xll_constructMat3( mat4 m) {
// FAILS  return mat3( vec3( m[0]), vec3( m[1]), vec3( m[2]));
// FAILS  return mat3( vec3( m[0]), vec3( m[1]), vec3( m[2]));
// THIS PASSES
    return mat3( vec3(1,1,1), vec3(1,1,1), vec3(1,1,1));
}

Beyond simple build / platform issues, I'm not familiar with the glsl-optimizer code, so I'm not sure where to go from here. The existing code is probably depending on a particular default initialization or something, and is getting lucky with how clang or MSVC deals with it... or gcc has some problem?

In the meantime, it might be best to make the regular release build unoptimized, since this is generally an offline tool? The 485 tests on my MacBook Air pass in 3.9 seconds at -Os versus 4.98 seconds at -O0. My concern is that maybe the unit tests just aren't catching this issue on other platforms.

neomantra avatar Mar 18 '15 21:03 neomantra

It was a deep dive down a rabbit hole, but it ended up being the "optimize_split_arrays" optimization step that was causing the hang. After a lot more digging, it turned out that a loop the compiler had over-aggressively optimized out was causing the infinite loop. https://github.com/neomantra/glsl-optimizer/blob/9bffe9de3d58912edc1766b7d3da8780540c3d91/src/glsl/opt_array_splitting.cpp

So maybe it was a gcc problem after all? [I'm definitely not a fan of the mesa containers though.] I submitted a changeset that fixes it.
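
For readers wondering what a list that is less fragile under optimization might look like, here is a minimal sketch. It assumes (the thread does not confirm this) that the trouble comes from list iteration whose termination test the compiler can reason away. The sketch uses two real sentinel nodes and stops by comparing against the tail sentinel's address. All names (Node, List, sum) are hypothetical; this is not the Mesa container and not the submitted fix.

```cpp
#include <cassert>

// Hypothetical intrusive list node; not Mesa's exec_node.
struct Node {
    Node* next = nullptr;
    Node* prev = nullptr;
    int value = 0;
};

// Hypothetical list with two explicit sentinel objects, so no list
// fields are reinterpreted as a node to fake a sentinel.
struct List {
    Node head;  // sentinel before the first element
    Node tail;  // sentinel after the last element

    List() {
        head.next = &tail;
        tail.prev = &head;
    }
    // The sentinels are linked by address, so copying would break them.
    List(const List&) = delete;
    List& operator=(const List&) = delete;

    bool empty() const { return head.next == &tail; }

    // Links an unlinked node in front of the tail sentinel.
    void push_back(Node* n) {
        assert(n->next == nullptr && n->prev == nullptr);
        n->prev = tail.prev;
        n->next = &tail;
        tail.prev->next = n;
        tail.prev = n;
    }

    // Unlinks a node that is currently in some list.
    static void remove(Node* n) {
        n->prev->next = n->next;
        n->next->prev = n->prev;
        n->next = nullptr;
        n->prev = nullptr;
    }
};

// Sums all values; the loop ends when it reaches the tail sentinel.
int sum(const List& list) {
    int total = 0;
    for (const Node* n = list.head.next; n != &list.tail; n = n->next) {
        total += n->value;
    }
    return total;
}
```

The trade-off is one extra pointer of storage per list compared with a layout that overlaps the two sentinels, in exchange for code that does not rely on how a particular compiler treats the overlap.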

neomantra avatar Mar 19 '15 07:03 neomantra

Note: @tschw found what is probably the correct solution in pull request #104, as commented on in #91.

neomantra avatar Jul 13 '15 13:07 neomantra

Let's say "it attempts to solve the root of the problem rather than its symptoms". I'm not sure it's correctly correct, but it at least keeps my version of GCC from breaking the code when optimizing.

I sent a more thorough patch upstream, see https://bugs.freedesktop.org/show_bug.cgi?id=91320 for details.

tschw avatar Jul 13 '15 15:07 tschw

I just had the same issue. Would it be possible to merge the upstream patch? https://marc.info/?l=mesa3d-dev&m=146706386101865

Lectem avatar Jan 09 '18 20:01 Lectem