
CS GO crash every time it exits

Open AnAkkk opened this issue 10 years ago • 37 comments

Since CS GO was ported to Linux, I've been running it through primusrun and have always had a very annoying issue: it crashes every time I quit the game, producing a core dump and freezing my laptop for up to 30s. I can't disable core dumps because I need them to debug crashes in applications I develop.

The backtrace (which doesn't have symbols) is the following:

#0 0xeac140f6 in ?? ()
#1 signal handler called
#2 0xf7409546 in pthread_mutex_lock () from /usr/lib32/libpthread.so.0
#3 0xf7180419 in ?? () from /usr/lib32/nvidia/libGL.so.1
#4 0xf715ad0d in ?? () from /usr/lib32/nvidia/libGL.so.1
#5 0xf715b4cc in ?? () from /usr/lib32/nvidia/libGL.so.1
#6 0xf7154f68 in ?? () from /usr/lib32/nvidia/libGL.so.1
#7 0xf7149e24 in glXDestroyPbuffer () from /usr/lib32/nvidia/libGL.so.1
#8 0xf7532b04 in ?? () from /usr/lib32/primus/libGL.so.1
#9 0xf7547c6a in ?? () from /usr/lib32/primus/libGL.so.1
#10 0xf7547c5c in ?? () from /usr/lib32/primus/libGL.so.1
#11 0xf7547c5c in ?? () from /usr/lib32/primus/libGL.so.1
#12 0xf7547d1d in ?? () from /usr/lib32/primus/libGL.so.1
#13 0xf758fc4c in __cxa_finalize () from /usr/lib32/libc.so.6
#14 0xf752cc13 in ?? () from /usr/lib32/primus/libGL.so.1
#15 0xf77af294 in _dl_fini () from /lib/ld-linux.so.2
#16 0xf758f8c3 in __run_exit_handlers () from /usr/lib32/libc.so.6
#17 0xf758f921 in exit () from /usr/lib32/libc.so.6
#18 0xf757965a in __libc_start_main () from /usr/lib32/libc.so.6
#19 0x08048645 in _start ()

AnAkkk avatar Apr 18 '15 12:04 AnAkkk

Please also show the console output if there's anything else apart from "Segmentation fault. Core dumped."

What is your distribution and libc version?

Please run the game with primusrun env LD_DEBUG=libs LD_DEBUG_OUTPUT=/tmp/ld-debug.txt and provide the resulting /tmp/ld-debug.txt file.

amonakov avatar Apr 19 '15 13:04 amonakov

There's nothing other than Segmentation fault.

I'm on ArchLinux 64bit with libc 2.21.

It created 4 /tmp/ld-debug.txt.XXX files with different PIDs. Do you have an email where I can send them?

AnAkkk avatar Apr 19 '15 16:04 AnAkkk

My email is username@gmail.com, or you can put them into a GitHub gist.

amonakov avatar Apr 19 '15 16:04 amonakov

I've sent them. I've recompiled lib32-primus with debug symbols, this might be more helpful:

#0 0xeab620f6 in ?? ()
#1 signal handler called
#2 0xf7353546 in pthread_mutex_lock () from /usr/lib32/libpthread.so.0
#3 0xf70ca419 in ?? () from /usr/lib32/nvidia/libGL.so.1
#4 0xf70a4d0d in ?? () from /usr/lib32/nvidia/libGL.so.1
#5 0xf70a54cc in ?? () from /usr/lib32/nvidia/libGL.so.1
#6 0xf709ef68 in ?? () from /usr/lib32/nvidia/libGL.so.1
#7 0xf7093e24 in glXDestroyPbuffer () from /usr/lib32/nvidia/libGL.so.1
#8 0xf747cb34 in DrawableInfo::~DrawableInfo() () from /usr/lib32/primus/libGL.so.1
#9 0xf7491c8a in std::_Rb_tree<unsigned long, std::pair<unsigned long const, DrawableInfo>, std::_Select1st<std::pair<unsigned long const, DrawableInfo> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, DrawableInfo> > >::_M_erase(std::_Rb_tree_node<std::pair<unsigned long const, DrawableInfo> >*) () from /usr/lib32/primus/libGL.so.1
#10 0xf7491c7c in std::_Rb_tree<unsigned long, std::pair<unsigned long const, DrawableInfo>, std::_Select1st<std::pair<unsigned long const, DrawableInfo> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, DrawableInfo> > >::_M_erase(std::_Rb_tree_node<std::pair<unsigned long const, DrawableInfo> >*) () from /usr/lib32/primus/libGL.so.1
#11 0xf7491c7c in std::_Rb_tree<unsigned long, std::pair<unsigned long const, DrawableInfo>, std::_Select1st<std::pair<unsigned long const, DrawableInfo> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, DrawableInfo> > >::_M_erase(std::_Rb_tree_node<std::pair<unsigned long const, DrawableInfo> >*) () from /usr/lib32/primus/libGL.so.1
#12 0xf7491d3d in PrimusInfo::~PrimusInfo() () from /usr/lib32/primus/libGL.so.1
#13 0xf74d9c4c in __cxa_finalize () from /usr/lib32/libc.so.6
#14 0xf7476c13 in __do_global_dtors_aux () from /usr/lib32/primus/libGL.so.1
#15 0xf76f9294 in _dl_fini () from /lib/ld-linux.so.2
#16 0xf74d98c3 in __run_exit_handlers () from /usr/lib32/libc.so.6
#17 0xf74d9921 in exit () from /usr/lib32/libc.so.6
#18 0xf74c365a in __libc_start_main () from /usr/lib32/libc.so.6
#19 0x08048645 in _start ()

AnAkkk avatar Apr 19 '15 16:04 AnAkkk

From the (privately sent) log it looks like glibc decides to run both the nVidia and Mesa libGL destructors before primus'. I suspect that is not supposed to happen. I'll try to see whether anything goes wrong in glibc.

amonakov avatar Apr 19 '15 16:04 amonakov

Have you had time to look? I use primusrun in Dota 2 and TF2 too and I don't have the same issue, it seems to only happen in CS GO.

AnAkkk avatar May 11 '15 14:05 AnAkkk

Ping. Anything new here? :)

AnAkkk avatar Jun 12 '15 09:06 AnAkkk

possibly related:

libGL DSO finalizer and pthreads

When a multithreaded OpenGL application exits, it is possible for libGL's DSO finalizer 
(also known as the destructor, or "_fini") to be called while other threads 
are executing OpenGL code. The finalizer needs to free resources allocated by libGL. 
This can cause problems for threads that are still using these resources. 
Setting the environment variable "__GL_NO_DSO_FINALIZER" to "1" will work around 
this problem by forcing libGL's finalizer to leave its resources in place. 
These resources will still be reclaimed by the operating system when the process exits.
Note that the finalizer is also executed as part of dlclose(3), 
so if you have an application that dlopens(3) and dlcloses(3) libGL repeatedly, 
"__GL_NO_DSO_FINALIZER" will cause libGL to leak resources until the process exits. 
Using this option can improve stability in some multithreaded applications, 
including Java3D applications.

http://us.download.nvidia.com/XFree86/Linux-x86_64/352.09/README/knownissues.html

tpruzina avatar Jun 14 '15 16:06 tpruzina

Many thanks, this seems to fix the issue. Should this be included in primus by default?

AnAkkk avatar Jun 14 '15 17:06 AnAkkk

Same here. Is there an elegant way to solve this?

presianbg avatar Jun 15 '15 19:06 presianbg

__GL_NO_DSO_FINALIZER simply hides the issue, so it's not appropriate to use it.

I now understand the issue: primus cannot expect that it can invoke functions from a shared library it dlopen'ed from its own destructors; since nVidia's constructors run after primus' (due to dlopen), it's actually natural that destructors are run before (i.e. in reverse order of constructors). Even though primus still has a handle to nVidia's dlopen'ed libGL, it doesn't "count" when destructors are run at exit.
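
The ordering rule described above can be observed without any GL library. The following is a minimal sketch, not primus code: exit() runs handlers in reverse order of registration, so whatever was registered last is torn down first. The names Registry, early_cleanup and exit_order_in_child are hypothetical, and the sketch assumes a POSIX system; it forks a child so that the exit sequence can be reported back through a pipe.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Write end of the pipe; only set inside the forked child (hypothetical helper state).
static int g_report_fd = -1;

static void report(const char *msg) {
  if (g_report_fd >= 0) {
    ssize_t n = write(g_report_fd, msg, std::strlen(msg));
    (void)n;  // best effort: this is only a demonstration
  }
}

// Stand-in for a library-level static object whose destructor runs at exit.
struct Registry {
  Registry() { report("static constructed;"); }
  ~Registry() { report("static destructor;"); }
};

// Stand-in for a cleanup routine registered later via atexit().
static void early_cleanup() { report("atexit handler;"); }

// Forks a child, lets it exit normally, and returns the order in which
// the child's exit-time code ran.
std::string exit_order_in_child() {
  int fds[2];
  if (pipe(fds) != 0) return "";
  pid_t pid = fork();
  if (pid < 0) {
    close(fds[0]);
    close(fds[1]);
    return "";
  }
  if (pid == 0) {
    close(fds[0]);
    g_report_fd = fds[1];
    static Registry registry;    // destructor is registered first...
    std::atexit(early_cleanup);  // ...so this later handler runs before it
    std::exit(0);
  }
  close(fds[1]);
  std::string out;
  char buf[64];
  ssize_t n;
  while ((n = read(fds[0], buf, sizeof buf)) > 0) out.append(buf, static_cast<size_t>(n));
  close(fds[0]);
  int status = 0;
  waitpid(pid, &status, 0);
  return out;
}
```

Calling exit_order_in_child() returns "static constructed;atexit handler;static destructor;": the handler registered after the object was constructed runs before that object's destructor.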

Can you please test the following patch, ideally on multiple games, not just CS:GO? Sorry for taking so long, and thanks for your patience.

diff --git a/libglfork.cpp b/libglfork.cpp
index 03f514f..bb42f0d 100644
--- a/libglfork.cpp
+++ b/libglfork.cpp
@@ -259,6 +259,22 @@ static struct PrimusInfo {
   }
 } primus;

+static void cleanup()
+{
+  primus.drawables.clear();
+}
+
+static void register_cleanup_1()
+{
+  atexit(cleanup);
+}
+
+static void register_cleanup()
+{
+  static pthread_once_t once = PTHREAD_ONCE_INIT;
+  pthread_once(&once, register_cleanup_1);
+}
+
 // Thread-specific data
 static __thread struct {
   Display *dpy;
@@ -622,11 +638,6 @@ GLXContext glXCreateContextAttribsARB(Display *dpy, GLXFBConfig config, GLXConte
 void glXDestroyContext(Display *dpy, GLXContext ctx)
 {
   primus.contexts.erase(ctx);
-  // kludge: reap background tasks when deleting the last context
-  // otherwise something will deadlock during unloading the library
-  if (primus.contexts.empty())
-    for (DrawablesInfo::iterator i = primus.drawables.begin(); i != primus.drawables.end(); i++)
-      i->second.reap_workers();
   primus.afns.glXDestroyContext(primus.adpy, ctx);
 }

@@ -720,6 +731,7 @@ void glXSwapBuffers(Display *dpy, GLXDrawable drawable)
     di.actx = ctx;
     di.d.spawn_worker(drawable, display_work);
     di.r.spawn_worker(drawable, readback_work);
+    register_cleanup();
   }
   // Readback thread needs a sync object to avoid reading an incomplete frame
   di.sync = primus.afns.glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
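
The once-only registration idea in the patch can also be sketched in isolation. This version deliberately swaps pthread_once for a function-local static, which C++11 initializes exactly once even with concurrent callers; it is an illustration under that assumption, not what the patch does. Drawable, g_drawables, g_registrations and track_drawable are hypothetical stand-ins.

```cpp
#include <cstdlib>
#include <map>

// Hypothetical stand-in for primus' DrawableInfo.
struct Drawable {
  int index;
};

static std::map<unsigned long, Drawable> g_drawables;
static int g_registrations = 0;  // counts how often atexit() was actually called

// Runs from exit(), while every library is still loaded.
static void cleanup() { g_drawables.clear(); }

static bool do_register() {
  ++g_registrations;
  return std::atexit(cleanup) == 0;
}

// Safe to call from a hot path: the registration happens on the first call only.
bool register_cleanup() {
  static const bool ok = do_register();
  return ok;
}

// Hypothetical caller, analogous to the first glXSwapBuffers on a drawable.
void track_drawable(unsigned long xid) {
  if (g_drawables.find(xid) == g_drawables.end()) {
    g_drawables[xid] = Drawable{static_cast<int>(g_drawables.size())};
  }
  register_cleanup();
}
```

However many drawables are tracked, atexit() is called once, so cleanup() runs a single time at exit.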

amonakov avatar Jun 17 '15 10:06 amonakov

@amonakov Hi,

Thank you for the work you have done. Unfortunately I can't build the patched version of primus under Fedora 22. Can you guide me through it?

Best Regards, Presian

presianbg avatar Jun 17 '15 11:06 presianbg

This patch works fine on CS GO, CS 1.6 and TF2.

AnAkkk avatar Jun 17 '15 11:06 AnAkkk

@AnAkIn1

Hi, any tips on how to build patched primus on Fedora 22?

presianbg avatar Jun 17 '15 11:06 presianbg

No, I'm on Arch Linux.

AnAkkk avatar Jun 17 '15 11:06 AnAkkk

@AnAkIn1

So it should be basically the same. Just point me to the steps you followed to patch it.

Thank you in advance.

presianbg avatar Jun 17 '15 11:06 presianbg

Well, not really, Arch Linux uses PKGBUILD. I don't know Fedora's build system, but you need to use the patch command anyway:

patch -p1 -i "file"

AnAkkk avatar Jun 17 '15 11:06 AnAkkk

Yep, I'm familiar with the patching process. I just can't build the patched version. But no problem, they should roll it out sooner or later...

Cheers ;)

presianbg avatar Jun 17 '15 12:06 presianbg

For a patched Fedora 22 version, see:

http://people.engr.ncsu.edu/gsgatlin/primus-1.1.03282015-2.fc22.src.rpm and http://people.engr.ncsu.edu/gsgatlin/primus-1.1.03282015-2.fc22.x86_64.rpm

Link to the specfile showing the added patch:

http://fpaste.org/233006/42994143/

This package was built using "mock" on Fedora 21 for Fedora 22.

Hope that helps out your testing of this problem.

gsgatlin avatar Jun 17 '15 12:06 gsgatlin

@gsgatlin Thank you :)

@amonakov With the patched version above, CS:GO is still producing core dumps:

root 3092 71.0 0.1 130084 20488 ? R 15:29 0:00 /usr/lib/systemd/systemd-coredump 2812 1000 1000 11 1434544161 csgo_linux

rpm -qa | grep primus
primus-1.0.07112014-1.fc22.i686
primus-1.1.03282015-2.fc22.x86_64

presianbg avatar Jun 17 '15 12:06 presianbg

If CS:GO is a 32-bit executable, you need new i686.rpm as well.
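
Whether a given binary is 32-bit or 64-bit can be read from its ELF header. A minimal sketch, assuming Linux and the glibc <elf.h> header; the function name elf_class_bits is hypothetical, and the path to pass would be wherever the game executable lives on your system.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <elf.h>

// Returns 32 or 64 for an ELF file, or 0 if the file is unreadable or not ELF.
int elf_class_bits(const char *path) {
  unsigned char ident[EI_NIDENT];
  std::FILE *f = std::fopen(path, "rb");
  if (!f) return 0;
  std::size_t got = std::fread(ident, 1, sizeof ident, f);
  std::fclose(f);
  if (got != sizeof ident) return 0;
  // The first four bytes must be the ELF magic: 0x7f 'E' 'L' 'F'.
  if (std::memcmp(ident, ELFMAG, SELFMAG) != 0) return 0;
  switch (ident[EI_CLASS]) {
    case ELFCLASS32: return 32;
    case ELFCLASS64: return 64;
    default: return 0;
  }
}
```

A result of 32 means the matching 32-bit (i686) build of every preloaded library, primus included, has to be installed as well.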

amonakov avatar Jun 17 '15 12:06 amonakov

try:

http://people.engr.ncsu.edu/gsgatlin/primus-1.1.03282015-2.fc22.i686.rpm

gsgatlin avatar Jun 17 '15 12:06 gsgatlin

@amonakov and @gsgatlin

You guys are awesome! It's working :+1: To be honest, I was convinced that Steam games are 64-bit executables.

And again... Big thanks!

presianbg avatar Jun 17 '15 13:06 presianbg

Just to follow up, I tested the patched primus rpm with Minecraft, FEZ, Cogs, and dolphin-emu and did not see any problems.

gsgatlin avatar Jun 18 '15 01:06 gsgatlin

@amonakov : can you please merge the fix?

AnAkkk avatar Jun 25 '15 19:06 AnAkkk

Did anybody test it with other games so far? Just to be sure this fix doesn't break others.

karolherbst avatar Jun 27 '15 19:06 karolherbst

Tested on Plague:Inc, CS:GO, Dota 2 Source 1/2. Everything looks fine to me.

presianbg avatar Jun 28 '15 06:06 presianbg

Anything new about this?

AnAkkk avatar Nov 15 '15 10:11 AnAkkk

@amonakov ?

AnAkkk avatar Dec 09 '15 12:12 AnAkkk

@amonakov Do you need more testing (I can try SuperTux kart on my setup for instance) or could this be merged?

ArchangeGabriel avatar Dec 27 '15 11:12 ArchangeGabriel

It's been more than 6 months now... and there have been no commits since March 2015. It doesn't look like this project is still maintained :/

AnAkkk avatar Jan 21 '16 20:01 AnAkkk

I added this patch into the fedora rpm package back in July (I think) and it did not seem to cause any problems that I know about.

gsgatlin avatar Jan 21 '16 20:01 gsgatlin

yeah, no issues on my side either.

karolherbst avatar Jan 21 '16 21:01 karolherbst

I guess it's time for somebody like @karolherbst to fork it and become the new de facto maintainer. Or some competent package maintainer from some distro.

tpruzina avatar Jan 22 '16 03:01 tpruzina

Nah, I would rather spend time improving nouveau, because PRIME offloading is the superior solution anyway.

karolherbst avatar Jan 22 '16 08:01 karolherbst

@karolherbst guessed as much by looking at your nouveau commits and mailing list jitter.

tpruzina avatar Jan 22 '16 22:01 tpruzina

Yes, but in the meantime, it’s nice to have a temporary working solution. But I agree @karolherbst is doing a great job on reclocking, plus OpenGL going well in mesa lately, I might be able to drop the proprietary driver soon.

@amonakov I’ve seen from your GitHub page that you’re still around. ;) Could you consider merging this and cleaning up the issue tracker a bit? Or, in the event you’re not interested in it anymore, which I can understand, could you consider transferring the repo to the Bumblebee Project organization? Thanks a lot!

ArchangeGabriel avatar May 05 '16 11:05 ArchangeGabriel