godot icon indicating copy to clipboard operation
godot copied to clipboard

Implement a "Recovery Mode" for recovering crashing projects during initialization

Open rsubtil opened this issue 1 year ago • 29 comments

Implements https://github.com/godotengine/godot-proposals/issues/2628

[!note] This description is updated whenever any significant changes are made, and should always reflect the current state of this PR

Overview

This adds a new special "Recovery Mode" that, as described, disables some engine features that can make a project fail to launch during startup. When this mode is launched, the following features are completely disabled:

  • Tool scripts (GDScript and C#)
  • Editor plugins
  • GDExtension addons
  • Scene restoring

This is displayed to the user as soon as the project is opened in this mode, through both a popup, and a notification:

image image

Because some of these features may be necessary for a project to work properly, this mode may put the project in an unusable/unworkable state. Since this is a temporary mode meant to allow only for some basic troubleshooting, a few other changes were done as well to reflect this:

  • All launching functionality is disabled, with the run bar modified to reflect it image
  • The Asset Library and Game View plugins are disabled, since they can't really work during this mode image
  • Tool scripts are displayed differently to clarify that they're not being run during this mode image
  • The plugin configuration panel has a warning to clarify plugins do not run during this mode image

Usage

Automatic detection

When launching a project, Godot creates a lock file at user://.recovery_mode_lock. This file persists during the engine initialization, and is removed when the editor starts scanning the project files, after one second.

If any crash/hang happens during initialization, this file is not removed. So, if this file exists the next time Godot attempts to open the project, then the engine knows the project failed to open last time, thus prompting the user to open it in this mode.

CLI

This mode can be manually triggered by opening a project with the new --recovery-mode flag:

$ godot -e -path <...> --recovery-mode
$ godot --help

Godot Engine v4.4.dev.mono.custom_build.f7a0552d5 (2024-12-06 14:59:43 UTC) - https://godotengine.org
...
Run options:
...
  --recovery-mode                   E  Start the editor in recovery mode, which disables features that can typically cause startup crashes, such as tool scripts, editor plugins, GDExtension addons, and others.
...

UI

When a project is opened, the project manager searches for any existing lock file; if that's the case, it opens a popup to suggest using this mode, explaining what it does and its intended usage.

image

It is also possible to use this mode manually from the new options button next to the Edit button:

image

Test projects

To test this functionality, I've made a test suite of projects that crash the engine on initialization with typical user scenarios:

  • GDScript tool script: - Project with a bad tool script, which is attached to a node on the main scene.
  • C# tool script: - Same as above, but with a C# script, which requires rebuilding the project when the script is changed.
  • Editor plugin: - Project with a bad editor plugin, which internally has bad GDScript code.
  • GDExtension addon: - Project with an external error in the GDExtension addon.

All of these test cases fail to open on a normal Godot session and require manual intervention to be recovered. But under recovery mode, every test must open successfully.

GDScript tool script C# tool script Editor plugin GDExtension addon
Crash tool_gdscript_crash.zip tool_csharp_crash.zip plugin_crash.zip gdextension_crash.zip
Hang tool_gdscript_hang.zip tool_csharp_hang.zip plugin_hang.zip gdextension_hang.zip
  • Crash: This project has some bad code that immediately crashes the engine when opened.
  • Hang: This project has some bad code that loops infinitely, which hangs the engine and leaves it frozen until forcibly closed.

Future work

This is a basic implementation so far, and the usefulness of this mode should be improved with other additional features:

  • ~~An automatic mechanism to detect failing scenarios and automatically prompt users to enable this mode, as described in https://github.com/godotengine/godot-proposals/issues/2628~~ (edit: implemented in this PR too)
  • Being able to open/edit scenes when some dependencies are missing, since these can originate from temporarily disabled editor/GDExtension plugins.
  • Adding other problematic features for recovery mode that I didn't cover (e.g. importing assets)

rsubtil avatar May 30 '24 18:05 rsubtil

Is the yellow border for the safe mode dialog necessary? It won't even be seen by most since windows aren't embedded by default in the editor.

YeldhamDev avatar May 30 '24 18:05 YeldhamDev

Is the yellow border for the safe mode dialog necessary? It won't even be seen by most since windows aren't embedded by default in the editor.

Good point, I don't use the default setting of separated windows :sweat_smile:

It was more of an experiment, but I believe the yellow banner at the run bar is already noticeable enough :+1:

rsubtil avatar May 30 '24 19:05 rsubtil

Note that we could automatically prompt the user to edit a project in safe mode after an editor (not project) crash. To do so, we need to write a file in the project's .godot/ folder when the editor crashes. This can be accomplished in the crash handler when Engine::get_singleton()->is_editor_hint() is true.

When opening any project, the editor checks for the presence of this file. If the file is present, a dialog is spawned to prompt the user to open the project in safe mode. Regardless of the user's answer, the file is removed afterwards when the editor is done initializing and rescanning things.

When opening a project from the command line, the project opens as usual, but a warning message is printed to warn the user that they can use --safe-mode if desired (this is done to avoid breaking automation). The file is then removed after the editor is done initializing and rescanning things.

After doing this, we probably won't need to have the Edit (Safe Mode) button anymore in the project manager (which I feel is too prominent right now). We could make it manually accessible by shift-clicking, but I think the automatic prompt will suit the 90% use case.

A similar approach could be used to write a "lock file" in .godot/ to warn the user when opening the Godot editor multiple times at once on the same project. (This is sometimes desired, but it comes with some edge cases so it probably makes sense to warn about it.)

Calinou avatar Jun 01 '24 09:06 Calinou

@Calinou yes, I think this an important mechanism to complement this feature. I hadn't included it yet because, actually, I wanted to confirm this detail that you've mentioned, before proceeding:

This can be accomplished in the crash handler when Engine::get_singleton()->is_editor_hint() is true.

Do we really want this mode to be triggered on any editor crash? I think it would generate more false positives (e.g. internal engine crashes) and generate user confusion because this mode would implicitly be triggered for all types of crashes, when it can only help in troubleshooting a subset of them.

I'm more concerned in ensuring we avoid false positives, and limiting this mode to only trigger on initialization crashes helps to reduce that. The usefulness of this mode, in my opinion, is tied to situations where a user lost the ability to edit the project at all and requires some form of external intervention to recover them.

rsubtil avatar Jun 01 '24 11:06 rsubtil

Maybe we could create some "init" file when starting editor and delete it once the editor loads. If it doesn't load it means it crashed. Though I suppose there are some edge cases like exporting from CLI etc. so not sure if it can be done reliably.

KoBeWi avatar Jun 01 '24 12:06 KoBeWi

@KoBeWi should I add the auto-detect crash functionality we discussed already in this PR, or do it in another PR after this one is merged?

I think @Calinou brings a valid point in doing it right now; if this functionality is implemented, there isn't much need to add the new "Edit in Safe Mode" button to the project manager.

I don't expect this functionality to take much effort to implement, but considering my current availability, it might still take around one week to implement.

rsubtil avatar Jun 07 '24 09:06 rsubtil

Well, this is not going to be merged before 4.4, so you have some time to improve it.

KoBeWi avatar Jun 07 '24 10:06 KoBeWi

Fixed; I've removed the safe mode button's borders since it's implicitly styled already :+1:

image

rsubtil avatar Jun 09 '24 11:06 rsubtil

@bruvzg I agree with you, but there is actually no need for this button anymore once the crash detection mechanism is implemented, since it should be able to reliably detect and prompt for this mode only when it's relevant, which also makes for a better UX.

rsubtil avatar Jun 10 '24 13:06 rsubtil

Only doing it after the crash won't cover all uses cases, broken editor addon can mess up the editor to unusable state but not crash and allow closing it normally.

While I agree that having a lock file and auto asking for safe mode after crash is good (how would it tell apart multiple instances from the crash?). But keeping it in the UI is useful as well, maybe we should have a single custom button with two pressable areas, for normal edit and "down arrow" part for context menu with extra safe mode option.

bruvzg avatar Jun 10 '24 14:06 bruvzg

Only doing it after the crash won't cover all uses cases, broken editor addon can mess up the editor to unusable state but not crash and allow closing it normally.

There are plenty of ways for the engine to crash due to bad user code. But this feature is not meant as an effective tool to help in all cases. With plugins, for instance, a plugin could crash only when some specific behavior is triggered, or only after 3~4 seconds, or randomly due to a data race with multi-threaded work, etc...

This feature is focused on the scenario where users lost the ability to open a project at all, and it requires some form of external intervention to recover the project. So, more specifically, for any crashes/hangs that occur during the engine initialization, before any interaction is possible.

(how would it tell apart multiple instances from the crash?)

It currently cannot tell the difference. The current implementation I'm doing only detects the presence of the lock file to prompt the safe mode. So, in practice, it prompts it after only one initialization crash occurred.

maybe we should have a single custom button with two pressable areas, for normal edit and "down arrow" part for context menu with extra safe mode option.

I like this idea a lot! I made a mockup of it, something like this? image

rsubtil avatar Jun 10 '24 15:06 rsubtil

I've finish up the auto-detection mechanism and rework to the project manager interface. Will update the original PR description with these new changes.

EDIT: Done. The gist of the new changes are essentially:

  • The auto-detection menchanism was implemented with a lock file at each project's user://.safe_mode_lock. It is removed if the engine opens successfully, and if not, used in the next launch to detect a failure.
  • The project manager's Edit button was reworked to have an additional options button next to it, where it is possible to manually trigger the safe mode for any project.

rsubtil avatar Jul 21 '24 10:07 rsubtil

Fixed a merge conflict that appeared in the meantime. But, since master updated to 4.4, this revealed a bug with the auto-detection mechanism: if a warning existed when opening a project (e.g. "Do you wanna upgrade the project to Godot 4.4"), this would bypass the check for safe mode.

I've changed the code to divide this in two steps: the check_safe_mode exclusively to check the safe mode, and the check_warnings for the current warnings. And then modified the code to call the correct instances depending on the current scenario. Now, if a project has both warning types, it first displays the safe mode popup, then the warnings popup before launching.

rsubtil avatar Sep 07 '24 15:09 rsubtil

Remember headless mode has no prompts and needs to work with integration and deployment

fire avatar Sep 07 '24 15:09 fire

Remember headless mode has no prompts and needs to work with integration and deployment

The detection mechanism is implemented only on the project manager; to trigger this mode from CLI, one has to pass the --safe-mode flag explicitly.

rsubtil avatar Sep 07 '24 15:09 rsubtil

Just a quick update on the review status: I'd like to do a proper review myself, that's a feature I've wanted for a while so I want to make sure it works well.

Will also need a rebase before merge.

akien-mga avatar Sep 24 '24 10:09 akien-mga

Thanks, I didn't notice another merge conflict arose in the meantime; I'll fix it, possibly today

rsubtil avatar Sep 24 '24 10:09 rsubtil

Merge conflict has now been fixed, but it wasn't as simple as I hoped; it came from https://github.com/godotengine/godot/pull/96861, @KoBeWi please have a look to ensure I didn't break anything :sweat_smile:

rsubtil avatar Sep 26 '24 20:09 rsubtil

You renamed _open_selected_projects_with_migration method

KoBeWi avatar Sep 26 '24 20:09 KoBeWi

You renamed _open_selected_projects_with_migration method

Yes, that was intended; I renamed _open_selected_project_ask to _open_selected_project_check_warnings to not only make it's functionality clearer, but also because there's now multiple possible steps when opening a project (check warnings, check safe mode).

Since you're doing something similar, I saw fit to also rename it to _open_selected_project_check_migration.

rsubtil avatar Sep 26 '24 20:09 rsubtil

Since you're doing something similar, I saw fit to also rename it to _open_selected_project_check_migration.

But the function performs the migration. The checks are technically done before the info dialog appears.

KoBeWi avatar Sep 26 '24 21:09 KoBeWi

But the function performs the migration. The checks are technically done before the info dialog appears.

Ok yeah, you're right, I've reverted this name change.

rsubtil avatar Sep 27 '24 17:09 rsubtil

I rebased to solve merge conflicts. I'll try to find a bit of time to finally test the feature and approve. Also CC @bruvzg if you want to do another review pass.

akien-mga avatar Nov 13 '24 10:11 akien-mga

Tested briefly with the MRP of #99001 which causes a crash on start from a tool script, and with this PR it's properly flagged as requiring Safe Mode, which works well to open the project and then manually fix the tool script.

Also tested with a 4.1 project where I'd install godot-jolt 0.8.0 (known to crash on 4.2+), which crashes when opened with master / this PR, and Safe Mode properly works it around.

One minor nitpick, when you manually select Edit in Safe Mode from the Project Manager for any project, it shows the same dialog as if that project did crash previously, so it starts with: image

This isn't accurate if selecting Safe Mode manually on projects that didn't crash.

akien-mga avatar Nov 13 '24 14:11 akien-mga

One minor nitpick, when you manually select Edit in Safe Mode from the Project Manager for any project, it shows the same dialog as if that project did crash previously, so it starts with:

Good catch! I'll hide that part of the message when this is triggered manually then :+1:

rsubtil avatar Nov 13 '24 19:11 rsubtil

So, in order to remove the first part of the text when the dialog is manually triggered, I created two separate nodes to simply hide the first one depending on that.

And, since now I'm using nodes instead of popup's set_text, I experimented with using a RichTextLabel for the details part, in order to properly render the bullet points: image

What do you think?

rsubtil avatar Nov 16 '24 13:11 rsubtil

Sorry for being very late to the party, I think the term safe mode is very misleading, and gives a false sense of security.

I suggest to rename the option as recovery mode since the goal is to make "crashing projects" recoverable, and not to create a secure environment.

Faless avatar Nov 30 '24 13:11 Faless

I suggest to rename the option as recovery mode since the goal is to make "crashing projects" recoverable, and not to create a secure environment.

Huum I agree, it does a better job at conveying that this mode is mostly intended as a "rescue" mode, not as a slightly less functional but still usable mode for development. I'll change the naming and fix the merge conflict that appeared in the meantime :+1:

rsubtil avatar Dec 01 '24 19:12 rsubtil

I've made some significant changes this time, so here's a summary (the original description will be updated shortly too):

  • Renamed the feature from "safe mode" to "recovery mode" as suggested on https://github.com/godotengine/godot/pull/92563#issuecomment-2508968464
  • Changed mode description text to use proper bullet points (•) similarly to other cases in the project already
  • Extracted all logic for determining the user directory to a generic implementation on os.cpp (due to https://github.com/godotengine/godot/pull/92563#discussion_r1839976559).
    • Affected OSes (Windows, Linux, Web) were adapted, while others are not affected since they implement custom behavior (macOS, Android, iOS)

And some minor changes I took the liberty of doing as well:

  • Disabled the newly added "Game" tab as well during this mode, since it is not possible to run the project.
  • Changed the error codes for GDExtension and EditorNode from OK to invalid ones.

rsubtil avatar Dec 07 '24 12:12 rsubtil

I suggest ignoring --recovery-mode if the editor isn't running (i.e. ignore it in the game and project manager).

Thanks for testing @Calinou, I didn't remember to check for this flag to only be applied when the editor is run. I've added a check as suggested to prevent this.

rsubtil avatar Jan 03 '25 11:01 rsubtil