CLI LeoCAD depends on display server availability

Open nathaneltitane opened this issue 5 years ago • 29 comments

When attempting to run LeoCAD from the command line only (e.g. to render a model), one cannot do so as a true CLI command, because a display server seems to be required for LeoCAD to function.

Use case: running LeoCAD over ssh to render a model via the CLI.

Can LeoCAD be made GUI agnostic or independent at the CLI level to permit such behavior?

nathaneltitane avatar Apr 01 '19 19:04 nathaneltitane

What error do you get? Have you tried -platform offscreen or -platform eglfs?

leozide avatar Apr 01 '19 22:04 leozide

Through ssh, a simple help command gives this:

_ $ leocad --help
leocad: cannot connect to X server

Using the -platform options gives the same error message...

I should be able to have leocad run a command or show its CLI options regardless of whether X is connected.

nathaneltitane avatar Apr 02 '19 14:04 nathaneltitane

Using -platform offscreen was crashing but I fixed it and now it works for me.

leozide avatar Apr 02 '19 18:04 leozide

Rebuilt leocad from latest source, still getting the same output:

_ $ leocad -platform offscreen --help
leocad: cannot connect to X server

nathaneltitane avatar Apr 03 '19 15:04 nathaneltitane

This is a Qt issue. Do you have /usr/lib/x86_64-linux-gnu/qt5/plugins/platforms/libqoffscreen.so? You can set QT_DEBUG_PLUGINS=1 to see what's going on.

leozide avatar Apr 03 '19 18:04 leozide

Is this fixed now?

leozide avatar Jan 07 '21 01:01 leozide

Will check in a bit. Should be good with today's fixes tbh.

nathaneltitane avatar Jan 07 '21 02:01 nathaneltitane

@leozide

As of the last few continuous builds, this problem has shown up again.

LeoCAD complains about the absence of a display server and does not run in CLI-only mode, whether rendering through a script or via direct command input in a terminal emulator.

nathaneltitane avatar Mar 15 '21 12:03 nathaneltitane

I am seeing similar behaviour where all my AppVeyor headless tests are failing due to an abnormal end originating from my LCLib (LeoCAD Library). This has been the case since commits 233affe3fcdc851fa82cb058871bddd0046e1c87 and 36023ee2e47d0f11d2a28349a8a05a6a85478757. The Qt offscreen renderer seems problematic.

Cheers,

trevorsandy avatar Mar 15 '21 17:03 trevorsandy

Is there a call stack showing where this happens?

leozide avatar Mar 16 '21 03:03 leozide

I’ll run an interactive session on the next AppVeyor build and capture the dump output file. I believe the problem, for my scenario, lies with the fact that the free access AV VMs only support OpenGL 1.1. See this post.

Update: LeoCAD uses QtANGLE (DirectX) on AppVeyor because the Windows OpenGL version there is below 2.0. As I am using Qt::AA_UseDesktopOpenGL in LPub3D, my config will not use the DirectX translation, so my abnormal end is likely due to unsupported OpenGL calls in the current LeoCAD config, i.e. offscreen renderer, shared context, etc.

Cheers,

trevorsandy avatar Mar 16 '21 05:03 trevorsandy

@trevorsandy, @leozide Can something be done about that dependency? I know LDView, for example, has a completely separate binary (ldview-osmesa) that runs CLI-only without a hitch.

nathaneltitane avatar Mar 16 '21 13:03 nathaneltitane

For me, the better solution would be to update OpenGL on the AV VM before running my tests. For example, build and store the MSVC OSMesa lib and download before running my tests.

LDView uses direct OpenGL calls and the WGL API interface for OpenGL on Windows and is not a Qt application. It does not require an OpenGL version greater than 1.1 on Windows.

LeoCAD however has recently switched to QOpenGLWidget (see 4e6cbca31c5260ca04619fc1d9d5c95a55e0b70d) and has abandoned direct OpenGL calls (see b8a8cb6730e367ecdd3e870a4305fb9e646b822c).

Cheers,

trevorsandy avatar Mar 16 '21 16:03 trevorsandy

Surprisingly, I am able to produce an abnormal end on a locally compiled release build of commit b548e1f4d24437e5fe062ecc638fea5686625e2a by simply running leocad --help.

The following shots speak for themselves. I've also attached an archive of the exe, pdb, and a dump file in the abend_dump_file subfolder.

Screenshot - 20_03_2021 , 05_20_13

Screenshot - 20_03_2021 , 05_24_59

Screenshot - 20_03_2021 , 05_23_28

leoCAD_release_build.zip

KO code:

void lcContext::ShutdownRenderer()
{
	mGlobalOffscreenContext->MakeCurrent();
	lcContext* Context = mGlobalOffscreenContext.get();

	gStringCache.Reset();
	gTexFont.Reset();

	lcView::DestroyResources(Context);
	Context->DestroyResources();
	lcViewSphere::DestroyResources(Context);

	mGlobalOffscreenContext.reset();

	lcContext::DestroyOffscreenContext();
}

OK code:

void lcContext::ShutdownRenderer()
{
	gStringCache.Reset();
	gTexFont.Reset();

	lcContext* Context = mGlobalOffscreenContext.get();
	if (!Context)
		return;

	mGlobalOffscreenContext->MakeCurrent();

	lcView::DestroyResources(Context);
	Context->DestroyResources();
	lcViewSphere::DestroyResources(Context);

	mGlobalOffscreenContext.reset();

	lcContext::DestroyOffscreenContext();
}

Cheers,

trevorsandy avatar Mar 20 '21 04:03 trevorsandy

I duplicated my LPub3D AppVeyor build locally and replicated the abnormal end I am now seeing with all (Linux/Windows) builds when my lcLib (LeoCAD) library is set as the preferred renderer.

This could be related to the command line abnormal end that triggers the 'cannot connect to X server' message but I must state that my lcLib library is heavily modified so this behaviour could also be of my doing.

Anyway, here is the dump file trace. I've also attached the dump file if you care to take a look.

Screenshot - 20_03_2021 , 20_00_45

LPub3D.zip

On Appveyor, you can add a quick build check test, e.g. leocad -i testout.png -t 1 -f 2 ldraw\models\pyramid.ldr and the following update to generate content for debugging...

Update this block in static LONG WINAPI lcSehHandler(PEXCEPTION_POINTERS exceptionPointers) to see exactly where the dump file is deposited and then you can even grab it if it exists and post it to artifacts for review. You'll have to detect if running in console mode (using the command line arguments) and set gConsoleMode.

...
      if (writeDump)
      {
          TCHAR message[_MAX_PATH + 256];
          lstrcpy(message, TEXT(VER_PRODUCTNAME_STR " crashed. Crash information was saved to '"));
          lstrcat(message, gMinidumpPath);
          lstrcat(message, TEXT("', please send it to the developer for debugging."));

          if (gConsoleMode)
          {
              fprintf(stdout, "%ls\n",message);
              fflush(stdout);
          }
          else
          {
              MessageBox(nullptr, message, TEXT(VER_PRODUCTNAME_STR), MB_OK);
          }
      }
...
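The console-mode detection mentioned above could be sketched as follows. This is an assumption-laden illustration, not LeoCAD's actual logic: IsConsoleModeSketch is a hypothetical name, and the option list contains only the two options that appear in this thread (-i and --help).

```cpp
#include <string>
#include <vector>

// Hypothetical helper: decides whether the process should behave as a
// console tool, based purely on its command line arguments.
bool IsConsoleModeSketch(const std::vector<std::string>& args)
{
    // Assumption: any of these options means no window is wanted.
    // Only options quoted in this thread are listed here.
    static const std::vector<std::string> consoleOptions = { "-i", "--help" };

    for (const std::string& arg : args)
        for (const std::string& option : consoleOptions)
            if (arg == option)
                return true;

    return false;
}
```

The result would be stored in gConsoleMode early in startup, before the exception handler can run, so that the handler knows whether to print the message or show the message box.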

Cheers,

trevorsandy avatar Mar 20 '21 19:03 trevorsandy

I tried opening the dump but I don't have symbols for lpub3d.exe.

I fixed the --help crash but it looks unrelated to this.

leozide avatar Mar 20 '21 19:03 leozide

> I tried opening the dump but I don't have symbols for lpub3d.exe.

Here you go https://github.com/trevorsandy/lpub3d_libs/releases/download/v1.0.1/32bit_release.zip

> I fixed the --help crash but it looks unrelated to this.

Indeed, it's unrelated.

Cheers,

trevorsandy avatar Mar 20 '21 19:03 trevorsandy

With the latest, if I unset the DISPLAY environment variable anything I try gets the following:

qt.qpa.screen: QXcbConnection: Could not connect to display
Could not connect to any X display.

rsbx avatar Mar 20 '21 19:03 rsbx

Looks like gMainWindow is null. Do you have anything in stdout or stderr?

leozide avatar Mar 20 '21 19:03 leozide

That's the entirety of what was in the terminal. I tried --help and -i test.png CondEdge.mpd; same result for both. With the DISPLAY variable set, both commands work.

rsbx avatar Mar 20 '21 20:03 rsbx

that comment was for @trevorsandy

leozide avatar Mar 20 '21 20:03 leozide

I didn't see anything in the log. I'm setting up a more robust debugging environment; perhaps I could add printf calls to processCommandLine. Here's my log:

------------Build Checks Start--------------

-builds\windows\release\LPub3D-Any-2.4.1.16.2635_20210320\LPub3D_x86\LPub3D.exe found.

  1 OF 7. PKG_CHECK_NATIVE_COMMAND...[builds\windows\release\LPub3D-Any-2.4.1.16.2635_20210320\LPub3D_x86\LPub3D.exe --process-file --liblego --preferred-renderer native C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\build_checks.mpd]
-PKG_CHECK_NATIVE FAILED, ELAPSED TIME 0:0:12.03

LPub3D v2.4.1 r16 (Dev-release) for MS Windows 32bit
==========================
Arguments: --process-file --liblego --preferred-renderer native C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\build_checks.mpd

--------------------------
LPub3D App Data Path.........(C:/Users/Trevor/Projects/Working/AppVeyor_Local/lpub3d-ci/builds/windows/release/LPub3D-Any-2.4.1.16.2635_20210320/LPub3D_x86)
LPub3D Executable Path.......(C:/Users/Trevor/Projects/Working/AppVeyor_Local/lpub3d-ci/builds/windows/release/LPub3D-Any-2.4.1.16.2635_20210320/LPub3D_x86)
LPub3D Log Path..............(C:/Users/Trevor/Projects/Working/AppVeyor_Local/lpub3d-ci/builds/windows/release/LPub3D-Any-2.4.1.16.2635_20210320/LPub3D_x86/logs/LPub3DLog.txt)
LPub3D Portable Distribution.(Yes)
LPub3D Loaded LDraw Library..(LEGO Parts)
--------------------------

Fade Previous Steps is OFF.
Highlight Current Step is OFF.
Added search directory: C:\Users\Trevor\LDraw\MODELS
Added search directory: C:\Users\Trevor\LDraw\unofficial\customParts\p
Added search directory: C:\Users\Trevor\LDraw\unofficial\customParts\parts
Added search directory: C:\Users\Trevor\LDraw\unofficial\helper
Added search directory: C:\Users\Trevor\LDraw\unofficial\helper\helper_images
Added search directory: C:\Users\Trevor\LDraw\unofficial\LSynth
Loading LDraw parts search directories...
Loading archive C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\windows\release\LPub3D-Any-2.4.1.16.2635_20210320\LPub3D_x86\libraries\complete.zip...
Loading archive C:/Users/Trevor/Projects/Working/AppVeyor_Local/lpub3d-ci/builds/windows/release/LPub3D-Any-2.4.1.16.2635_20210320/LPub3D_x86/libraries/lpub3dldrawunf.zip...
Loading LDraw model file 'C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\build_checks.mpd'...
Model file build_checks.mpd identified as Multi-Part LDraw System (MPD) Document
Loading MPD model 'build_checks.mpd'...
MPD model 'build_checks.mpd' with 23 lines loaded.
Loading MPD submodel 'submodel-1.ldr'...
MPD submodel 'submodel-1.ldr' with 5 lines loaded.
Part 1 [58120.dat] validated.
Part 2 [2780.dat] validated.
Part 3 [30526.dat] validated.
Part 4 [32000.dat] validated.
Part 5 [3003.dat] validated.
Parts count for build_checks.mpd is 8
MPD model file build_checks.mpd loaded. Part Count 8. Elapsed time: 0.004 second
Build Modifications are Disabled
Fade Previous Steps is OFF.
Highlight Current Step is OFF.
Loading user interface items...
Open file 'C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\build_checks.mpd' completed.
Display page...
Counting model steps and instances...
Writing submodel 'build_checks.mpd' to temp folder...
Writing submodel 'submodel-1.ldr' to temp folder...
2 submodels written to temp folder. Elapsed time: 0.022 second
Processing find page for build_checks.mpd...
Processing single-step draw-page for page 1, step 1, model 'build_checks.mpd'...
Processing single-step draw-page for page 1, step 1, model 'build_checks.mpd'...
Processing PLI for build_checks.mpd...
Generate PLI image for [Normal] parts...
Executing Native Perspective PLI render - please wait...
Native SMP image file rendered 'C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\LPub3D\parts\58120_71_1240_150_DPI_1_30_23_-45.png'
Native Renderer SMP Arguments: InputFileName: C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\LPub3D\tmp\pli.ldr OutputFileName: C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\LPub3D\parts\58120_71_1240_150_DPI_1_30_23_-45.png TransBackground: True HighlightNewParts: False UseImageSize: False LineWidth: 1 StudStyle: None (0) Resolution: 150 ImageWidth: 1240 ImageHeight: 1753 PageWidth: 1240 PageHeight: 1753 CameraFoV: 45 CameraZNear: 25 CameraZFar: 50000 CameraDistance (Scale 1): 3031329 CameraName: Default CameraProjection: Perspective UsingViewpoint: False ZoomExtents: False CameraLatitude: 23 CameraLongitude: -45 CameraTarget: X(0) Y(0) Z(0)

Icon Inserted: Key [58120_71] Value [C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\LPub3D\parts\58120_71_1240_150_DPI_1_30_23_-45.png]
Native PLI [Normal] render took 461 milliseconds to render image [C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\LPub3D\parts\58120_71_1240_150_DPI_1_30_23_-45.png].
Processing CSI for build_checks.mpd...
LPub3D crashed. Crash information was saved to the file 'C:\Users\Trevor\AppData\Local\Temp\LPub3D.dmp', please send it to the developer for debugging.


  2 OF 7. PKG_CHECK_LDVIEW_COMMAND...[builds\windows\release\LPub3D-Any-2.4.1.16.2635_20210320\LPub3D_x86\LPub3D.exe --process-file --clear-cache --liblego --preferred-renderer ldview C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\build_checks.mpd]
-PKG_CHECK_LDVIEW PASSED, ELAPSED TIME 0:0:41.78


  3 OF 7. PKG_CHECK_LDVIEW_SINGLE_CALL_COMMAND...[builds\windows\release\LPub3D-Any-2.4.1.16.2635_20210320\LPub3D_x86\LPub3D.exe --process-file --clear-cache --liblego --preferred-renderer ldview C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\build_checks.mpd]
-PKG_CHECK_LDVIEW_SINGLE_CALL PASSED, ELAPSED TIME 0:0:32.59


  4 OF 7. PKG_CHECK_RANGE_COMMAND....[builds\windows\release\LPub3D-Any-2.4.1.16.2635_20210320\LPub3D_x86\LPub3D.exe --process-export --range 1-3 --clear-cache --liblego --preferred-renderer ldglite C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\build_checks.mpd]
-PKG_CHECK_RANGE PASSED, ELAPSED TIME 0:0:11.35


  5 OF 7. PKG_CHECK_POV_COMMAND......[builds\windows\release\LPub3D-Any-2.4.1.16.2635_20210320\LPub3D_x86\LPub3D.exe --process-file --clear-cache --liblego --preferred-renderer povray C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\build_checks.mpd]
-PKG_CHECK_POV PASSED, ELAPSED TIME 0:1:17.56


  6 OF 7. PKG_CHECK_TENTE_COMMAND......[builds\windows\release\LPub3D-Any-2.4.1.16.2635_20210320\LPub3D_x86\LPub3D.exe --process-file --clear-cache --libtente --preferred-renderer ldview C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\TENTE\astromovil.ldr]
-PKG_CHECK_TENTE PASSED, ELAPSED TIME 0:1:18.32


  7 OF 7. PKG_CHECK_VEXIQ_COMMAND......[builds\windows\release\LPub3D-Any-2.4.1.16.2635_20210320\LPub3D_x86\LPub3D.exe --process-file --clear-cache --libvexiq --preferred-renderer ldview-scsl C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\VEXIQ\spider.mpd]
-PKG_CHECK_VEXIQ PASSED, ELAPSED TIME 0:3:52.52


  Build checks cleanup...

  Copying LPub3D_x86_RunLog.txt to log asset folder...

----Build Checks Completed: PASS (6)[2,3,4,5,6,7], FAIL (1)[1], ELAPSED TIME 0:8:7.27 ----


-Configure LPub3D x86_64 build environment...

Cheers,

trevorsandy avatar Mar 20 '21 21:03 trevorsandy

Ok, I think I found the source of my abnormal end and I think it is the source of the X Server connect message also.

Basically, in console mode, gMainWindow is not instantiated but there are calls to it. My trace above shows one such example where a call is made to SaveTabLayout() from SetProject().

void lcApplication::SaveTabLayout() const
{
	if (!mProject || mProject->GetFileName().isEmpty())
		return;

	QSettings Settings;
	QByteArray TabLayout = gMainWindow->GetTabLayout();

	Settings.setValue(GetTabLayoutKey(), TabLayout);
}

As you can see, QByteArray TabLayout = gMainWindow->GetTabLayout() is not safe and this is precisely where I can reproduce an abnormal end in LPub3D when performing what would be a 'CLI' call to render an image. It's strange, because I was not able to get the debugger to step into SaveTabLayout() in LeoCAD so this behaviour may be masking the problem there. My lcLib library does enter and produce an abend in LPub3D.

Here's another unsafe example. InitializeRenderer() is called before gMainWindow is initialized in lcApplication::Initialize(). If gSupportsFramebufferObject is false, gMainWindow is dereferenced while it is still null and the application will abend.

bool lcContext::InitializeRenderer()
{
...
	if (!gSupportsFramebufferObject)
		gMainWindow->GetPartSelectionWidget()->DisableIconMode();

	return true;
}
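A guard of the kind both examples need might look roughly like this. It is a self-contained sketch, not LeoCAD's code: MainWindowStub, gMainWindowStub and TryGetTabLayout are hypothetical stand-ins for the main window class, gMainWindow and the call made inside SaveTabLayout().

```cpp
#include <string>

// Stand-in for the real main window class.
struct MainWindowStub
{
    std::string GetTabLayout() const { return "layout"; }
};

// Plays the role of gMainWindow: it stays null in console mode because
// no window is ever created.
MainWindowStub* gMainWindowStub = nullptr;

// Reads the tab layout only if a main window exists.
// Returns false and leaves `layout` untouched otherwise.
bool TryGetTabLayout(std::string& layout)
{
    if (!gMainWindowStub)
        return false; // console mode: nothing to query, nothing to save

    layout = gMainWindowStub->GetTabLayout();
    return true;
}
```

The same check applies to the InitializeRenderer() case: the DisableIconMode() call would only be made when the window pointer is non-null.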

With these two corrections, my Native (LeoCAD) renderer check is successfully executing on Windows. Linux (Arch, Fedora, Ubuntu) checks are still failing so I believe there are likely other unsafe items.

------------Build Checks Start--------------

-builds\windows\release\LPub3D-Any-2.4.1.16.2635_20210321\LPub3D_x86\LPub3D.exe found.
-mainApp\32bit_release\LPub3D.pdb found.

  1 OF 7. PKG_CHECK_NATIVE_COMMAND...[builds\windows\release\LPub3D-Any-2.4.1.16.2635_20210321\LPub3D_x86\LPub3D.exe --process-file --liblego --preferred-renderer native C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\build_checks.mpd]
-PKG_CHECK_NATIVE PASSED, ELAPSED TIME 0:0:21.51


  2 OF 7. PKG_CHECK_LDVIEW_COMMAND...[builds\windows\release\LPub3D-Any-2.4.1.16.2635_20210321\LPub3D_x86\LPub3D.exe --process-file --clear-cache --liblego --preferred-renderer ldview C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\build_checks.mpd]
-PKG_CHECK_LDVIEW PASSED, ELAPSED TIME 0:0:41.72


  3 OF 7. PKG_CHECK_LDVIEW_SINGLE_CALL_COMMAND...[builds\windows\release\LPub3D-Any-2.4.1.16.2635_20210321\LPub3D_x86\LPub3D.exe --process-file --clear-cache --liblego --preferred-renderer ldview C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\build_checks.mpd]
-PKG_CHECK_LDVIEW_SINGLE_CALL PASSED, ELAPSED TIME 0:0:32.96


  4 OF 7. PKG_CHECK_RANGE_COMMAND....[builds\windows\release\LPub3D-Any-2.4.1.16.2635_20210321\LPub3D_x86\LPub3D.exe --process-export --range 1-3 --clear-cache --liblego --preferred-renderer ldglite C:\Users\Trevor\Projects\Working\AppVeyor_Local\lpub3d-ci\builds\check\build_checks.mpd]
-PKG_CHECK_RANGE PASSED, ELAPSED TIME 0:0:10.73
...

Cheers,

trevorsandy avatar Mar 21 '21 07:03 trevorsandy

I think this is caused by your changes. When I run from the command line, SetProject is only called once, so mProject is null when SaveTabLayout is called and it exits on the first line.

There should probably be some more checks for gMainWindow but saving images from the command line works fine for me.

leozide avatar Mar 21 '21 19:03 leozide

Not for me:

/tmp$ ./leocad -i test.png CondEdge.mpd
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-internet'
Saved 'test.png'.
/tmp$ unset DISPLAY
/tmp$ ./leocad -i test1.png CondEdge.mpd
qt.qpa.screen: QXcbConnection: Could not connect to display
Could not connect to any X display.
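One possible way around the failure shown above is to fall back to Qt's offscreen platform when no display is available. The sketch below only covers the decision itself and is an assumption, not something LeoCAD is known to do: ChoosePlatformFallback is a hypothetical function, while QT_QPA_PLATFORM, DISPLAY and WAYLAND_DISPLAY are the usual environment variables.

```cpp
#include <optional>
#include <string>

// Hypothetical helper. Each parameter is the value of an environment
// variable as returned by std::getenv (nullptr when unset):
//   qpaPlatform    -> QT_QPA_PLATFORM
//   display        -> DISPLAY
//   waylandDisplay -> WAYLAND_DISPLAY
// Returns the platform name to force, or std::nullopt to leave Qt's
// default selection alone.
std::optional<std::string> ChoosePlatformFallback(const char* qpaPlatform,
                                                  const char* display,
                                                  const char* waylandDisplay)
{
    const auto isSet = [](const char* value) { return value && *value; };

    if (isSet(qpaPlatform))
        return std::nullopt; // the user already chose a platform

    if (isSet(display) || isSet(waylandDisplay))
        return std::nullopt; // a display server is reachable

    return std::string("offscreen"); // headless: avoid the xcb plugin
}
```

A caller would pass the three std::getenv results and, if a name comes back, set QT_QPA_PLATFORM to it before constructing the application object. Whether rendering then succeeds still depends on the offscreen plugin being installed and on the OpenGL support discussed earlier in the thread.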

rsbx avatar Mar 21 '21 19:03 rsbx

> I think this is caused by your changes. When I run from the command line SetProject is only called once so mProject is null when SaveTabLayout is called and it exits on the first line.

This was my impression also, which is why I was not able to reproduce the behaviour with LeoCAD. But I believe running headless is likely failing because of an unsafe gMainWindow call somewhere. I’ll check on Linux next weekend to see what’s going on.

Cheers,

trevorsandy avatar Mar 21 '21 20:03 trevorsandy

So this turns out to still be a major issue. Here is the latest output from running my l2cu script, which uses leocad as its main CLI renderer:

Screenshot_20211019-224400_Remotix.png

There has to be some way to make this completely UI-independent, the same way LDView has an ldview-osmesa headless rendering utility.

nathaneltitane avatar Oct 20 '21 02:10 nathaneltitane

This is still an issue with the latest builds: Qt dependencies still prevent leocad from operating fully without a display or session.

nathaneltitane avatar Nov 21 '21 22:11 nathaneltitane

+1 to see this fixed

garutilorenzo avatar Feb 27 '24 10:02 garutilorenzo