auto-mcs
[bug] psutil doesn't always kill unix-based processes (killing Java when the server stops hangs)
Describe the bug Using Telepath on macOS, pressing Command + Q shuts down the server, but auto-mcs then reports that the server is in a deadlocked state
To Reproduce Steps to reproduce the behavior:
- Go to '...'
- Click on '....'
- Scroll down to '....'
- See error
auto-mcs Configuration
- Clarify if you're using Telepath YES
- Clarify if you're using the GUI, headless, or Docker GUI
Operating System and platform (please complete the following information):
- OS: host: Windows 11; Telepath client: macOS Sonoma
- Architecture: host: x64; Telepath client: arm64
| OS | Path |
|---|---|
| Windows | %appdata%\.auto-mcs\Logs |
| macOS | ~/Library/Application Support/auto-mcs/Logs |
| Linux | ~/.auto-mcs/Logs |
Expected behavior It properly shuts down the server
Screenshots
Additional context Sorry, if i'm creating too many issues
This isn't a bug with auto-mcs, it's a bug with Fabric and whatever mods you're using. Fabric has an odd shutdown process, and you can close the server by pressing the skull to kill it, or by pressing CMD-Q again. This does not happen in Vanilla. auto-mcs simply has added functionality that detects when the server should be closed but hasn't closed itself
@macarooni-man I tried to press the skull and Command + Q nothing happens.
@kokofixcomputers please provide detailed reproduction steps including how you made the server, your OS version, and I'll reopen if I can reproduce it
@macarooni-man Thanks: OS: Windows Machine Host, Mac as telepath client Using Telepath: YES Server type: Fabric Reproduce steps:
- Have telepath setup
- Create a fabric server with the version 1.21.4
- Have the server running for a bit
- Go into the telepath client and press Command + Q
- Sometimes it works, sometimes it doesn't (most times it doesn't)
@kokofixcomputers does this happen locally as well? And what about on Vanilla over Telepath?
I'm testing
@kokofixcomputers thank you!
Thanks for this awesome application!
Huh? Now it's stuck on Creating initial backup while the progress bar is green and says 100%
The problem is only with Fabric. @macarooni-man I will try a few more times to see if it still works. But I think the kill button never worked for me
9:01:26 AM [INIT] > 'Survival' has stopped successfully
9:01:28 AM [WARN] > 'Survival' is deadlocked, please kill it above to continue...
well it does say has stopped successfully for some reason
@macarooni-man The only way to resolve this is to force quit the host???
@kokofixcomputers, I guess what I'm asking is this:
- Does this happen in Vanilla over Telepath?
- Does this happen when running Fabric locally?
@kokofixcomputers, I guess what I'm asking is this:
- Does this happen in Vanilla over Telepath? NO
- Does this happen when running Fabric locally? YES
@macarooni-man I asked AI to see why killing is not working:
2. Process Name Matching Might Be Insufficient
Your code checks for java.exe (Windows) or java (Linux/macOS) by name. However, sometimes the Minecraft server may run as javaw.exe on Windows, which your code does not account for.
If the process name differs, your script will not find and kill the correct process.
Hopefully this helps you solve this problem!
Thanks for making and maintaining an awesome app!
thanks @kokofixcomputers!
actually, auto-mcs has complete control over the Java wrapper, and it launches specifically from an internally managed Java environment. It's only an issue with Fabric, and due to it only being an issue on macOS and Linux, my guess is that it's caused by the kill command sending the wrong termination code to the process. I need to force a SIGTERM, and I'm using psutil for that. It's possible there is a better solution but I'd have to look into this more
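For readers following along, the "ask politely, then force" pattern being discussed can be sketched with the standard library alone. This is not auto-mcs's code and it does not use psutil; `stop_process` and `grace_seconds` are hypothetical names chosen for illustration. It assumes the server was started with `subprocess.Popen`.

```python
import subprocess
import sys


def stop_process(proc: subprocess.Popen, grace_seconds: float = 5.0) -> str:
    """Ask a child process to exit, then force it if it ignores the request.

    Returns "already-exited", "terminated", or "killed".
    """
    if proc.poll() is not None:
        return "already-exited"

    # POSIX: sends SIGTERM. Windows: calls TerminateProcess().
    proc.terminate()
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        # POSIX: sends SIGKILL, which the process cannot catch or ignore.
        proc.kill()
        proc.wait()
        return "killed"


if __name__ == "__main__":
    # Hypothetical stand-in for a server process: a child that just sleeps.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(stop_process(child))
```

On macOS and Linux a process that hangs during shutdown can ignore or stall on SIGTERM, so the timeout followed by `kill()` is what guarantees the call eventually returns.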
Hmm... I think the issue also occurs on windows
@kokofixcomputers can you send an .amb backup of your server? I'm unable to reproduce this on a stock Fabric server
It takes a while. I think it happens after letting the server run for like 30 min. So, take your time. No rush. Can i send you the backup file through email? Github doesn't accept the file format.
It's too large for Github, not the file format. You can upload it to a file sharing service and send the link, but yeah if it only happens after 30 minutes that's not something I wish to spend my time troubleshooting, I hope you can understand lol. If you can find another way to make it happen immediately I'm happy to spend my time troubleshooting
@macarooni-man Hey! So I ran a bunch of other tests, and apparently the kill button itself is bugged. It won't kill Vanilla either. So maybe the issue with Fabric not being killable stems from how auto-mcs handles killing in general, and not just from its Fabric handling?
I tried digging into AutoMCS's code but I still don't see how menu.py attempts to kill it.
Thanks!
@kokofixcomputers can you provide any other information about the setup? What OS is this on, and more importantly, is it over Telepath?
I tried both with and without Telepath, and the kill button is broken in both cases. I only tried Windows as the host, with macOS as the Telepath client.
@macarooni-man And apparently, when using the kill button, Java is killed successfully. But it didn't kill the console window host and the other processes.
Some of these didn't exist before starting the server. Maybe it didn't register the server as killed just because other processes are still running?
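One way to avoid leftover child processes like this is to kill the whole process tree rather than only the Java process. The following is a minimal sketch under stated assumptions, not how auto-mcs does it: `launch_in_own_group` and `kill_tree` are hypothetical helpers, the POSIX branch assumes the server was started in its own session, and the Windows branch relies on `taskkill` with its `/T` (include children) and `/F` (force) switches.

```python
import os
import signal
import subprocess
import sys


def launch_in_own_group(cmd: list[str]) -> subprocess.Popen:
    """Start cmd so that it and its descendants can be signalled together."""
    if sys.platform == "win32":
        # taskkill /T walks the child tree itself, so no extra setup is needed.
        return subprocess.Popen(cmd)
    # start_new_session=True makes the child the leader of a new process group,
    # so its process-group id equals its pid.
    return subprocess.Popen(cmd, start_new_session=True)


def kill_tree(proc: subprocess.Popen) -> None:
    """Forcefully stop proc and everything it spawned, then reap proc."""
    if sys.platform == "win32":
        subprocess.run(
            ["taskkill", "/PID", str(proc.pid), "/T", "/F"],
            check=False,
            capture_output=True,
        )
    else:
        try:
            # Signal every process still in the child's process group.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # The group is already gone.
    proc.wait()


if __name__ == "__main__":
    # Hypothetical stand-in for the server wrapper process.
    server = launch_in_own_group([sys.executable, "-c", "import time; time.sleep(60)"])
    kill_tree(server)
    print("exit code:", server.returncode)
```

A descendant that moves itself into a different session or process group would escape the POSIX branch, so this is a starting point for debugging rather than a guaranteed fix.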
@kokofixcomputers It's impossible to narrow this down without detailed information or a server backup. I'm unable to reproduce this issue, and it leads me to believe it might be a conflict with how the subprocess module interacts on your system. Can you please try the following cases and additionally send me the .amb file of that fabric server?
Additionally, are you using the release binary, or the fork you created? Because disabling recursive processes like I suggested yesterday would lead to this exact behavior
- Windows Vanilla (no Telepath)
- Windows Fabric (no Telepath)
And try these cases on another system if possible
@macarooni-man This is what I got:
- Windows Vanilla (no Telepath): kill button doesn't work
- Windows Fabric (no Telepath): kill button doesn't work
- macOS Sonoma Vanilla (no Telepath): kill button doesn't work
If you want the file to the Fabric server: https://mega.nz/file/oxgiWJoT#gOou6byphwtckbM1Cwo4QE6biQeEUjJ0Yu43bIlEGbo
I caught something during testing:
The timer is still going up?!
and:
The ip is still there? But it thinks its off?
Well... it says deadlocked from a Telepath client, so at least this is getting somewhere
I am using the release binary. Not the fork
There are two issues in particular, and in this case @kokofixcomputers is experiencing both:
- Fabric, with certain mods, doesn't actually close the server when stopped. It gets hung on a background thread that doesn't seem to close on Windows utilizing the `taskkill /f` command
- On Telepath, sometimes the client's panel doesn't reset after the server stops. Consider looking into the `ConsolePanel.reset_panel()` method