
sigkill

Open grammoboy2 opened this issue 2 years ago • 2 comments

Comment on the sigkill patch:

For some reason, several apps don't seem to get killed in nsm lately: Fluajho, Mamba, Zynaddsubfx. A wild guess points to JACK metadata / pretty names? It's a very uneducated guess, and it could just as well be my system. Anyway, those apps should quit cleanly without needing SIGKILL, which seems to be bad practice. I'm afraid this could lead to a bad nsm ecosystem in the end, where misbehaving applications are tolerated. See the bashguide, subtitle: "I'm trying to kill -9 my job but blah blah blah..."

https://mywiki.wooledge.org/ProcessManagement?highlight=%28sigkill%29

grammoboy2, Aug 01 '22 13:08

Also, the 60 seconds wait time might not be a wild guess. I can replicate a (not too practical, granted) session that takes 60 seconds to abort cleanly. (Why it takes so long is another question, I guess.)

grammoboy2, Aug 01 '22 15:08

I agree that finding out why clients hang has priority. I experienced that myself and went on a bug hunt in my own programs, only to find out that Carla did the same. I don't know how JACK metadata could have anything to do with it, though. JACK metadata does not need to be deleted manually by the clients; they can just quit.

And I admit that I still need to test whether this 30/60s wait time happens after the session is closed or during the close operation. I thought it was after all clients had already reported that they are done quitting, and THEN the process is still present.

The incentive for this is a FIXME comment in the code by Jonathan, btw, to check for hanging clients and purge them. There are two places in the code that are copy-pasted. My recent code change handles just one of them, the more common one.
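
To illustrate the open question above (is the process still present after a client has reported that it is done?), here is a minimal POSIX sketch of a non-blocking check on a child process. It is not taken from nsmd; `ChildState`, `poll_child` and `spawn_idle_child` are hypothetical names made up for this example, and it assumes the checking process is the parent of the client.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical names for illustration; none of this is taken from nsmd.
enum class ChildState { Running, Exited, NotOurChild };

// Non-blocking check: has the child with this pid terminated yet?
ChildState poll_child(pid_t pid)
{
    assert(pid > 0);   // pid <= 0 would make waitpid() match other children
    int status = 0;
    const pid_t r = waitpid(pid, &status, WNOHANG);
    if (r == 0)
        return ChildState::Running;      // child exists and has not terminated
    if (r == pid)
        return ChildState::Exited;       // child terminated and is now reaped
    return ChildState::NotOurChild;      // error, e.g. already reaped or never our child
}

// Demo helper: start a child that idles until it is signalled.
pid_t spawn_idle_child()
{
    const pid_t pid = fork();
    if (pid == 0)
        for (;;) pause();
    return pid;
}
```

Calling `poll_child` once right after a client reports that it has quit, and again in a timed loop, would show whether the remaining delay is spent waiting for the process itself to go away.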

diovudau, Aug 03 '22 20:08

"(Why it takes so long is another question, I guess.)"

Looks like clients (like Carla, drumkv1_jack, ...) just respond quite slowly to SIGTERM. Even when running killall outside nsm, they're slow to respond (at least when multiple instances are running).

grammoboy2, Nov 22 '22 23:11

SIGKILL as a last resort is probably an option in this scenario (people won't die), with the disadvantage that bugs in clients might not get noticed / reported / improved / fixed.

But FYI, some sessions seem to close cleanly, even after 60 seconds.
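
For reference, the "SIGKILL as a last resort" idea can be sketched roughly like this: send SIGTERM, poll for a grace period, and only then escalate. This is a generic POSIX illustration, not the actual nsmd patch; `Outcome`, `terminate_child` and `spawn_child` are hypothetical names, and the grace period is whatever the caller picks.

```cpp
#include <cassert>
#include <chrono>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <thread>
#include <unistd.h>

// Hypothetical names for illustration; this is not the nsmd patch.
enum class Outcome { ExitedInTime, Killed, Error };

// Ask a child to terminate, and escalate to SIGKILL only after `grace`.
Outcome terminate_child(pid_t pid, std::chrono::milliseconds grace)
{
    assert(grace.count() >= 0);
    if (pid <= 0)                        // never pass 0 or -1 on to kill()
        return Outcome::Error;
    if (kill(pid, SIGTERM) != 0)
        return Outcome::Error;

    const auto deadline = std::chrono::steady_clock::now() + grace;
    while (std::chrono::steady_clock::now() < deadline)
    {
        int status = 0;
        const pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return Outcome::ExitedInTime;   // terminated within the grace period
        if (r < 0)
            return Outcome::Error;
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }

    // Grace period is over: last resort.
    kill(pid, SIGKILL);
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return Outcome::Error;
    return Outcome::Killed;
}

// Demo helper: start an idle child, optionally one that ignores SIGTERM.
pid_t spawn_child(bool ignore_sigterm)
{
    // Set the disposition before fork() so the child inherits it without a race.
    if (ignore_sigterm)
        signal(SIGTERM, SIG_IGN);
    const pid_t pid = fork();
    if (pid == 0)
        for (;;) pause();
    if (ignore_sigterm)
        signal(SIGTERM, SIG_DFL);        // parent goes back to the default
    return pid;
}
```

Logging every `Outcome::Killed` case with the client name would at least keep misbehaving clients visible, which addresses part of the concern that such bugs never get reported.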

grammoboy2, Nov 24 '22 13:11

FYI: Messing around (quite literally) with the code, adding osc_server->wait( 100 ); in close_session() seems to improve closing a session significantly. I leave it to real C++/liblo coders to answer why, but there is at least an impact on a session with a lot of clients (tested with a bunch of Mamba clients). Maybe this is useful for getting it improved.

void
close_session ( )
{
    if ( ! session_path )
        return;

    for ( std::list<Client*>::iterator i = client.begin();
          i != client.end();
          ++i )
    {
        command_client_to_quit( *i );
        /* added line: let the OSC server wait (argument 100) before commanding the next client */
        osc_server->wait( 100 );
    }

    wait_for_killed_clients_to_die();
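
Why the extra wait helps is left open above. One untested guess is that replies pile up unhandled while the loop is still sending quit commands. The toy model below only illustrates that guess: `MockServer`, `Client` and `quit_all` are hypothetical stand-ins, not nsmd or liblo code, and the mock never actually blocks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Toy model only: hypothetical stand-ins, not nsmd's OSC endpoint or liblo.
struct Client
{
    std::string name;
    bool replied = false;
};

class MockServer
{
public:
    // Pretend the client answers the quit command immediately,
    // so its reply lands in the inbox.
    void send_quit(Client& c) { inbox_.push_back(&c); }

    // Handle every queued reply and return how many were handled.
    // A real server would block for up to timeout_ms; this mock never blocks.
    std::size_t wait(int timeout_ms)
    {
        assert(timeout_ms >= 0);
        std::size_t handled = 0;
        while (!inbox_.empty())
        {
            inbox_.front()->replied = true;
            inbox_.pop_front();
            ++handled;
        }
        return handled;
    }

    std::size_t backlog() const { return inbox_.size(); }

private:
    std::deque<Client*> inbox_;
};

// Command every client to quit and return the largest backlog of
// unhandled replies seen along the way.
std::size_t quit_all(MockServer& server, std::vector<Client>& clients, bool service_between)
{
    std::size_t peak = 0;
    for (Client& c : clients)
    {
        server.send_quit(c);
        peak = std::max(peak, server.backlog());
        if (service_between)
            server.wait(100);   // same position as the added line in the loop above
    }
    server.wait(100);           // final drain after the loop
    return peak;
}
```

In this model the backlog peaks at the number of clients without the in-loop wait and at one with it. Whether that is what really happens in nsmd would need checking by someone who knows the liblo side.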

grammoboy2, Nov 25 '22 15:11