lambdacube-gl

A mysterious crash

Open deepfire opened this issue 8 years ago • 18 comments

Circumstances:

  • reproducibility: good
  • commit: last hackage release
  • scenario: repeated calls to:
  dGPUMesh      ← GL.uploadMeshToGPU dMesh
  dGLObject     ← GL.addMeshToObjectArray osStorage (fromOANS osObjArray) [unameStr osUniform, "viewProj"] dGPUMesh
  dTexture      ← uploadTexture2DToGPU'''' False False False False $ (fromWi dStridePixels, h, GL_BGRA, pixels) -- a slightly hacked up version of uploadTexture2DToGPU
  GL.updateObjectUniforms dGLObject $ do
    fromUNS osUniform GL.@= return dTexture
  • stack is always the same:
Thread 4 "ghc_worker" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fb511270700 (LWP 11780)]
0x00007fb518faecac in __memmove_sse2_unaligned_erms ()
   from /nix/store/33f49v0xhayfnj6ldk6nzqbw2hlvcrix-glibc-2.24/lib/libc.so.6
(gdb) bt
#0  0x00007fb518faecac in __memmove_sse2_unaligned_erms ()
   from /nix/store/33f49v0xhayfnj6ldk6nzqbw2hlvcrix-glibc-2.24/lib/libc.so.6
#1  0x00007fb4cbabab0f in copy_array_to_vbo_array.isra ()
   from /run/opengl-driver/lib/dri/i965_dri.so
#2  0x00007fb4cbabb3b7 in brw_prepare_vertices () from /run/opengl-driver/lib/dri/i965_dri.so
#3  0x00007fb4cbabb986 in brw_emit_vertices () from /run/opengl-driver/lib/dri/i965_dri.so
#4  0x00007fb4cbad13a1 in brw_upload_render_state ()
   from /run/opengl-driver/lib/dri/i965_dri.so
#5  0x00007fb4cbab9ce8 in brw_draw_prims () from /run/opengl-driver/lib/dri/i965_dri.so
#6  0x00007fb4cb8b5a9a in vbo_draw_arrays () from /run/opengl-driver/lib/dri/i965_dri.so
#7  0x00007fb4e43bf006 in lambdacubezmglzm0zi5zi2zi3zmIAoiOJ1mI2lDFzzf9HuJpO2_LambdaCubeziGLziBackend_renderSlot1_info ()
   from /nix/store/knad873vg45139r9mfy5zjnq7rz3kdlw-lambdacube-gl-0.5.2.3/lib/ghc-8.0.1/lambdacube-gl-0.5.2.3/libHSlambdacube-gl-0.5.2.3-IAoiOJ1mI2lDFzf9HuJpO2-ghc8.0.1.so
#8  0x00007fb49c055c70 in ?? ()
#9  0x0000000000000000 in ?? ()
(gdb) quit
A debugging session is active.

deepfire, Mar 07 '17 19:03

Note: it's a degenerate case, so it's likely running out of resources.

What is interesting, though, is that it only seems to crash when run under intero -- it runs fine for quite a while when run as a compiled binary.

deepfire, Mar 07 '17 19:03

Does the compiled version crash at all? I don't know how interactive mode handles threads and the FFI.

csabahruska, Mar 07 '17 19:03

I haven't seen the compiled version crash.

However, running it under control of halive still leads to a crash.

deepfire, Mar 07 '17 20:03

Is this similar to this? http://stackoverflow.com/questions/34706215/program-crash-while-i-try-use-glfw-b-in-ghci

csabahruska, Mar 07 '17 22:03

No, this one doesn't crash intero for me -- even if I change it to:

import Graphics.UI.GLFW as GLFW

main = do
  GLFW.init
  GLFW.terminate
  main

deepfire, Mar 07 '17 22:03

What if you do something with GL? E.g. https://github.com/bsl/GLFW-b/issues/53

csabahruska, Mar 07 '17 23:03

I have tried to stress that example as well, first by repeatedly running main and manually hitting M-<f4>, then by:

main :: IO ()
main = do
  GLFW.init >>= require "GLFW.initialize"
  window <- GLFW.createWindow 800 800 "EnvelopesGLFW.hs" Nothing Nothing >>= \ w ->
           case w of Just w -> return w; Nothing -> do GLFW.terminate; error "GLFW.createWindow"

  GLFW.makeContextCurrent (Just window)
  GLFW.swapInterval 1
  GL.clearColor $= Color4 0 0 0 1

  (width, height) <- GLFW.getFramebufferSize window
  GL.viewport $= (GL.Position 0 0, GL.Size (fromIntegral width) (fromIntegral height))
  GL.clear [ColorBuffer]
  GLFW.swapBuffers window
  GLFW.waitEvents

  GLFW.destroyWindow window
  GLFW.terminate
  main

..and yet it didn't manage to crash intero.

deepfire, Mar 07 '17 23:03

In the case of LambdaCube, how many times do you execute those operations?

csabahruska, Mar 07 '17 23:03

It was a stupidly tight loop, running at framerate frequency.

deepfire, Mar 08 '17 00:03

..at unconstrained framerate frequency, which meant it was running the FRP network & Lambdacube renderer as fast as possible -- sequentially, one after another.

deepfire, Mar 08 '17 00:03

Can you count it?
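
One way to get that count, sketched here under assumptions: bump an `IORef` at the top of each iteration and flush the number to stderr before the frame body runs, so the last value printed survives a segfault. `tick` and `countedStep` are hypothetical names, and the `body` argument stands in for whatever the loop currently executes (the FRP step plus the LambdaCube render call).

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import System.IO (Handle, hFlush, hPutStrLn, stderr)

-- | Increment the counter and return the new value.
tick :: IORef Int -> IO Int
tick ref = atomicModifyIORef' ref (\n -> (n + 1, n + 1))

-- | Run one iteration of the loop body (hypothetical placeholder), logging
-- the iteration number first. The explicit flush matters: output still
-- sitting in a buffer would be lost if the process dies with SIGSEGV.
countedStep :: Handle -> IORef Int -> IO () -> IO ()
countedStep h ref body = do
  n <- tick ref
  hPutStrLn h ("iteration " ++ show n)
  hFlush h
  body
```

With a counter created once via `newIORef 0`, each pass of the loop would become `countedStep stderr counter body`; the last line on stderr then gives the number of the iteration that crashed.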

csabahruska, Mar 08 '17 06:03

Working on it -- in a roundabout way, though..

deepfire, Mar 08 '17 20:03

@csabahruska, I have isolated a minimal testcase in:

  • https://github.com/deepfire/mood/blob/master/Holostress.hs

New information:

  1. it crashes on both an Intel and an nVidia GPU
  2. it crashes in compiled code too! (see repro instructions at top of the file)

deepfire, Mar 10 '17 01:03

Notably, this is a minimal testcase, which means that both:

        dGIC          ← GIC.Context <$> GI.newManagedPtr (F.castPtr $ GRC.unCairo dGRC) (return ())

..and..

        _ ← GIPC.createContext dGIC

are crucial for the repro.

This widens the scope of the question from just LambdaCube:

  1. it could be my incompetence in using pointer arithmetic (yet it only crashes after hundreds of iterations on an Intel card in compiled mode; not sure what to make of it)
  2. it could be a problem in Graphics.Rendering.Cairo / GI.PangoCairo

deepfire, Mar 10 '17 02:03

Huh.. I guess I've lost the repro on my nVidia-based system, and that after having developed the whole repro on it. A reboot didn't change anything: the nVidia repro is lost.

Yet on an Intel laptop it still reliably produces this (compiled binary):

[nix-shell:~/src/mood]$ git log -n1
commit f8fa409193069085f74652ce0581ca8ea4c9f232
Author: Kosyrev Serge <[email protected]>
Date:   Fri Mar 10 04:55:02 2017 +0300

    Holostress:  repro for https://github.com/lambdacube3d/lambdacube-gl/issues/9

[nix-shell:~/src/mood]$ ghc --make ./Holostress.hs
[1 of 1] Compiling Main             ( Holostress.hs, Holostress.o )
Linking Holostress ...

[nix-shell:~/src/mood]$ ./Holostress
-- allocating GPU pipeline (GL.allocRenderer)... 5.1ms
01234501234501234501234501234501234501234501234501234501234501234501234501234501234501234501234
50123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123
45012345012345012345012345012345012345012345012345012345012345012345012345012345012345012345012
...
3450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123450123Segmentation fault

deepfire, Mar 10 '17 02:03

I wonder why you would upload the same geometry to the GPU in each frame. Why don't you reuse it? Of course the GPU will run out of memory sooner or later.
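
A minimal sketch of the difference, written without any LambdaCube calls so that it runs without a GL context. The `upload` argument stands in for the `GL.uploadMeshToGPU dMesh` / `GL.addMeshToObjectArray ...` pair from the report, and `drawFrame` for the per-frame work such as `GL.updateObjectUniforms`. The names `uploadEveryFrame`, `uploadOnce` and `countingUpload` are hypothetical, and whether the texture can be reused as well depends on whether its pixels change between frames.

```haskell
import Control.Monad (forM_)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- | What the stress loop effectively does: a fresh upload in every frame,
-- with the previous handle never released.
uploadEveryFrame :: IO gpu -> (gpu -> Int -> IO ()) -> Int -> IO ()
uploadEveryFrame upload drawFrame frames =
  forM_ [1 .. frames] $ \i -> do
    gpu <- upload
    drawFrame gpu i

-- | The suggested alternative: upload once, then reuse the same handle.
uploadOnce :: IO gpu -> (gpu -> Int -> IO ()) -> Int -> IO ()
uploadOnce upload drawFrame frames = do
  gpu <- upload
  forM_ [1 .. frames] (drawFrame gpu)

-- | Hypothetical stand-in for a GPU upload: it only counts its own calls
-- and returns the running total as a fake handle.
countingUpload :: IORef Int -> IO Int
countingUpload ref = do
  modifyIORef' ref (+ 1)
  readIORef ref
```

Running both variants for 100 frames with `countingUpload` performs 100 uploads in the first case and a single upload in the second, which is the growth in GPU allocations being pointed out here.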

csabahruska, Mar 12 '17 11:03

Could be so.. but why wouldn't LambdaCube tell me so? :-)

deepfire, Mar 12 '17 12:03

That's a point! :) It should, but that's not implemented yet. I've put it on the todo list.
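
For illustration only, and not LambdaCube's actual or planned code: one common way to surface this is to poll the OpenGL error flag after an upload and turn `GL_OUT_OF_MEMORY` into a Haskell exception. The numeric codes below are the ones defined by the OpenGL specification; `GLError`, `decodeGLError` and `checked` are hypothetical names, and the `getError` argument stands in for the real `glGetError` (for example the one from the OpenGLRaw package) so that the sketch runs without a GL context.

```haskell
import Control.Exception (Exception, throwIO)
import Data.Word (Word32)

-- | A few of the error conditions reported by glGetError.
data GLError
  = InvalidEnum
  | InvalidValue
  | InvalidOperation
  | OutOfMemory
  | OtherError Word32
  deriving (Eq, Show)

instance Exception GLError

-- | Decode a raw glGetError result; Nothing means GL_NO_ERROR.
decodeGLError :: Word32 -> Maybe GLError
decodeGLError 0x0000 = Nothing
decodeGLError 0x0500 = Just InvalidEnum
decodeGLError 0x0501 = Just InvalidValue
decodeGLError 0x0502 = Just InvalidOperation
decodeGLError 0x0505 = Just OutOfMemory
decodeGLError code   = Just (OtherError code)

-- | Run an upload action, then read the error flag once and throw if it
-- is set. `getError` is a parameter (an assumption of this sketch) so
-- the check can be exercised with a stub.
checked :: IO Word32 -> IO a -> IO a
checked getError upload = do
  result <- upload
  code <- getError
  case decodeGLError code of
    Nothing  -> return result
    Just err -> throwIO err
```

A check like this only helps when the driver reports the failure through the error flag. A segfault inside the driver, like the one in the backtrace at the top of this issue, kills the process before any such check can run.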

csabahruska, Mar 12 '17 12:03