gz-rendering
:farmer: `SEH Exceptions` in gz-rendering-win
Environment
- OS Version:
- Windows
- Source build
- gz-rendering7, gz-rendering8, gz-rendering9, gz-rendering-main
Description
- Expected behavior: All tests pass
- Actual behavior: 71 tests failing
Steps to reproduce
- Run a build on gz-rendering7-win, gz-rendering8-win or ign_rendering-ci-win
- See it fail
Output
We have 71 test regressions in gz-rendering windows (7, 8, 9 and main)
First time we saw it: Sept 1. It stopped between Sept 28 and Oct 11.
And now it came back
We think it's probably related to this issue: https://stackoverflow.com/questions/13157671/seh-exception-with-code-0xc0000005-thrown-in-the-test-body
Update (17/08/2023):
Reference build: https://build.osrfoundation.org/job/ign_rendering-ci-win/254/#showFailuresLink
Test regressions (71, last updated on 13-09-2024)
- CameraTest.ViewProjectionMatrix
- CameraTest.RenderTexture
- CameraTest.TrackFollow
- CameraTest.AddRemoveRenderPass
- CameraTest.VisibilityMask
- CameraTest.IntrinsicMatrix
- GpuRaysTest.Configure
- LidarVisualTest.Configure
- LidarVisualTest.LidarVisual
- MeshTest.NormalMapWithoutTexCoord
- MeshTest.MeshSubMesh
- MeshTest.MeshSkeleton
- MeshTest.MeshSkeletonAnimation
- MeshTest.MeshClone
- ProjectorTest.Projector
- SceneTest.Scene
- SceneTest.SceneGradient
- SceneTest.Nodes
- SceneTest.RemoveNodes
- SceneTest.DestroyNodes
- SceneTest.NodeCycle
- SceneTest.Materials
- SceneTest.Time
- SceneTest.BackgroundMaterial
- ThermalCameraTest.ThermalCamera
- WavesTest.Waves
- ReloadEngineTest.Scene
- ArrowVisualTest.ArrowVisual
- AxisVisualTest.AxisVisual
- COMVisualTest.COMVisual
- CapsuleTest.Capsule
- GizmoVisualTest.GizmoVisual
- GizmoVisualTest.Material
- GizmoVisualTest.LookAt
- GridTest.Grid
- InertiaVisualTest.InertiaVisual
- LightVisualTest.LightVisual
- LightTest.Light
- MarkerTest.Marker
- MaterialTest.MaterialProperties
- MeshDescriptorTest.Descriptor
- MoveToHelperTest.MoveTo
- NodeTest.Pose
- OrbitViewControllerTest.OrbitViewControl
- OrbitViewControllerTest.Constructor
- OrbitViewControllerTest.Control
- OrthoViewControllerTest.OrthoViewControl
- OrthoViewControllerTest.Control
- ParticleEmitterTest.ParticleEmitter
- RayQueryTest.RayQuery
- RenderTargetTest.RenderTexture
- RenderTargetTest.RenderWindow
- RenderTargetTest.AddRemoveRenderPass
- RenderingIfaceTest.GetEngine
- TextTest.Text
- FontTestInstantiation/FontTest.SupportedFont/0
- FontTestInstantiation/FontTest.SupportedFont/1
- FontTestInstantiation/FontTest.SupportedFont/2
- TransformControllerTest.TransformControl
- TransformControllerTest.WorldSpace
- TransformControllerTest.LocalSpace
- TransformControllerTest.Control2d
- VisualTest.Material
- VisualTest.Children
- VisualTest.Scale
- VisualTest.UserData
- VisualTest.Geometry
- VisualTest.VisibilityFlags
- VisualTest.BoundingBox
- VisualTest.Wireframe
- VisualTest.Clone
Log Output:
17: [Err] [C:\J\workspace\ign_rendering-ci-win\ws\gz-rendering\ogre\src\OgreRenderEngine.cc:695] Unable to create the rendering window. Attempt 9. Exception Ogre::RenderingAPIException::RenderingAPIException: OpenGL 1.5 is not supported in GLRenderSystem::initialiseContext at C:\vcpkg\buildtrees\ogre\src\eddf310f0b-6ab1152694.clean\RenderSystems\GL\src\OgreGLRenderSystem.cpp (line 1167)
17: [Err] [C:\J\workspace\ign_rendering-ci-win\ws\gz-rendering\ogre\src\OgreRenderEngine.cc:704] Unable to create the rendering window after 10 attempts.
17: [Err] [C:\J\workspace\ign_rendering-ci-win\ws\gz-rendering\ogre\src\OgreRenderEngine.cc:638] Failed to create dummy render window.
17: [Err] [C:\J\workspace\ign_rendering-ci-win\ws\gz-rendering\ogre\src\OgreRenderEngine.cc:737] Failed to get capabilities
17: [Wrn] [C:\J\workspace\ign_rendering-ci-win\ws\gz-rendering\ogre\src\OgreRenderEngine.cc:797] Cannot initialize render engine since render path type is NONE. Ignore this warning if rendering has been turned off on purpose.
unknown file
SEH exception with code 0xc0000005 thrown in the test body.
This issue started happening again on Aug 11 on different machines
| job_name | last_fail | first_fail | build_count | failure_count | failure_percentage |
|---|---|---|---|---|---|
| ign_rendering-gz-7-win | 2023-08-15 | 2023-08-11 | 10 | 3 | 30.0 |
| ign_rendering-ci-win | 2023-08-15 | 2023-08-12 | 9 | 2 | 22.22 |
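The `failure_percentage` column in the table above appears to be `failure_count / build_count * 100`, rounded to two decimals. The following is a minimal sketch under that assumption; the function name and the `rows` list are illustrative and not part of any buildfarm tooling.

```python
def failure_percentage(failure_count: int, build_count: int) -> float:
    """Return the share of failing builds as a percentage, rounded to 2 decimals."""
    if build_count <= 0:
        raise ValueError("build_count must be positive")
    return round(100.0 * failure_count / build_count, 2)


# (job_name, build_count, failure_count) copied from the table above.
rows = [
    ("ign_rendering-gz-7-win", 10, 3),
    ("ign_rendering-ci-win", 9, 2),
]

for job_name, build_count, failure_count in rows:
    print(job_name, failure_percentage(failure_count, build_count))
```

With the counts from the table, this reproduces the reported 30.0 and 22.22.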
Edit 17-10-2023:
New test regression (new test added by #908):
This is now a consistent failure in gz-rendering 7 and 8 Windows
Edit 12-08-2024:
New test regression caused by #976 5 months ago:
Edit 13-10-2024:
New tests added in #1035 are failing because of this:
is this still an issue?
I updated the issue with new information
The SEH exception might be a red herring. Based on the error
11: [Err] [C:\J\workspace\ign_rendering-gz-7-win\ws\gz-rendering\ogre\src\OgreRenderEngine.cc:695] Unable to create the rendering window. Attempt [0]. Exception [Ogre::RenderingAPIException::RenderingAPIException: OpenGL 1.5 is not supported in GLRenderSystem::initialiseContext at C:\vcpkg\buildtrees\ogre\src\eddf310f0b-6ab1152694.clean\RenderSystems\GL\src\OgreGLRenderSystem.cpp (line 1167)]
my guess is that the graphics card is not set up properly on that machine. The same test doesn't print that error on a passing test on a different machine (see INTEGRATION_boundingbox_camera_ogre_gl3plus (test 11) on https://build.osrfoundation.org/job/ign_rendering-gz-7-win/79/consoleFull#console-section-12).
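One way to check this hypothesis across many console logs is to look for render-engine initialization errors in the same test output that reports the SEH exception. The sketch below is a hypothetical triage helper, not part of gz-rendering or release-tools; the marker strings and sample lines are copied from the log excerpts in this issue, and the `triage` function and its result keys are made up for illustration.

```python
import re

# Substrings taken from the log output quoted in this issue.
RENDER_INIT_MARKERS = (
    "Unable to create the rendering window",
    "Failed to create dummy render window",
    "is not supported in GLRenderSystem::initialiseContext",
)
SEH_RE = re.compile(r"SEH exception with code (0x[0-9a-fA-F]+)")


def triage(log_text: str) -> dict:
    """Report whether an SEH exception co-occurs with a render init failure."""
    init_lines = [
        line for line in log_text.splitlines()
        if any(marker in line for marker in RENDER_INIT_MARKERS)
    ]
    seh_match = SEH_RE.search(log_text)
    if seh_match and init_lines:
        cause = "render engine failed to initialize; SEH is likely a downstream symptom"
    elif seh_match:
        cause = "SEH exception without a logged render init failure"
    else:
        cause = "no SEH exception found"
    return {
        "render_init_failed": bool(init_lines),
        "seh_code": seh_match.group(1) if seh_match else None,
        "likely_cause": cause,
    }


# Raw string because the Windows paths contain backslashes.
SAMPLE_LOG = r"""17: [Err] [C:\J\workspace\ign_rendering-ci-win\ws\gz-rendering\ogre\src\OgreRenderEngine.cc:704] Unable to create the rendering window after 10 attempts.
17: [Err] [C:\J\workspace\ign_rendering-ci-win\ws\gz-rendering\ogre\src\OgreRenderEngine.cc:638] Failed to create dummy render window.
unknown file
SEH exception with code 0xc0000005 thrown in the test body."""

print(triage(SAMPLE_LOG))
```

A log from a passing machine would be expected to report `render_init_failed` as `False`, which is the comparison suggested above.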
Yeah, the error is most likely that something in the scene creation process dereferences an unchecked nullptr. For the most part, you can think of the SEH 0xc0000005 as the Windows equivalent of a segfault. Windows calls it an access violation, which means the code tried to read or write memory through a pointer that is not valid.
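For reference, the code in the test output is an NTSTATUS value, and `0xC0000005` is `STATUS_ACCESS_VIOLATION`. The sketch below decodes a few well-known codes; the `describe_seh_code` helper and the small lookup table are illustrative only and far from a complete list.

```python
# A few well-known NTSTATUS exception codes (not exhaustive).
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}

# The top two bits of an NTSTATUS value encode its severity.
SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}


def describe_seh_code(code_text: str) -> str:
    """Turn a hex code such as '0xc0000005' into a readable description."""
    code = int(code_text, 16)
    name = KNOWN_NTSTATUS.get(code, "unknown NTSTATUS code")
    severity = SEVERITY[(code >> 30) & 0x3]
    return f"{name} (severity: {severity})"


print(describe_seh_code("0xc0000005"))
```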
This is still affecting gz-rendering windows jobs with the same output
Reference builds:
- https://build.osrfoundation.org/job/gz_rendering-main-win/36/
- https://build.osrfoundation.org/job/gz_rendering-7-win/38/
- https://build.osrfoundation.org/job/gz_rendering-8-win/37/
Failure Percentage:
| job_name | last_fail | first_fail | build_count | failure_count | failure_percentage |
|---|---|---|---|---|---|
| gz_rendering-main-win | 2024-04-08 | 2024-03-12 | 8 | 8 | 100.0 |
| gz_rendering-7-win | 2024-04-07 | 2024-03-10 | 8 | 8 | 100.0 |
| gz_rendering-8-win | 2024-04-04 | 2024-03-09 | 8 | 8 | 100.0 |
@Crola1702 any progress on setting up ogre 2.3 on our windows machines? I believe that's a requirement for fixing these tests.
> @Crola1702 any progress on setting up ogre 2.3 on our windows machines
I'm not sure. I can take a look with @j-rivero's help. Any ideas on where to start?
Last time we talked about this, I remember the next action item was to check if our vcpkg port for ogre-next-23 actually works. If you can verify that locally, I think we can make the change in https://github.com/gazebo-tooling/release-tools/blob/f392d30813b5097229b22413f56a6733556d34f4/jenkins-scripts/lib/windows_env_vars.bat#L22
I did a quick test adding the ogre-next-23 dependency. First it failed because it tries to install files in a path that conflicts with ogre. I then removed ogre, ogre2, and ogre22 from the dependency list (see the ogre23 branch in release-tools) and installation was successful. However, gz-rendering failed to find ogre-next when running cmake.
So I think there are a couple of tasks to do for getting ogre-next 2.3 working on windows:
- Make `ogre-next-23` side-by-side installable with `ogre`
- Either update the `ogre-next-23` vcpkg port to install files in a way that's consistent with `ogre22` (so that gz-rendering can find it) or update `FindGzOgre2.cmake` to find this package while making sure not to break usage on conda.