godot icon indicating copy to clipboard operation
godot copied to clipboard

Add user-facing log interception callback.

Open zorbathut opened this issue 1 year ago • 40 comments

I wanted a way to catch all Godot messages, including errors, so they can be displayed to the user and/or handled in code. (If you're saying "why do you want that", read the next comment!) Not only should this handle all messages after the callback is applied, I wanted to handle all messages before the callback is applied so that startup warnings or errors are reported as well.

This provides a LogManager singleton that can have a callback attached. It saves messages until Frame 1, meaning that if you apply your callback during Frame 0, it quickly replays its entire saved log for you.

This used to be https://github.com/godotengine/godot/pull/68169 but takes a completely different approach at solving the same underlying issue. For my purposes, this is far better, and it's also supported on all scripting languages.

Personal concerns with this solution:

  • ~~It imposes a bit of CPU overhead and memory churn on every log message. I don't think this is a big issue because there's already a ton of that, but this does impose further costs.~~ Essentially solved; it still imposes a function call, std::atomic lookup, and conditional, but I don't think that's a big problem.
  • If a project spews a ton of log messages out before the first frame, then never registers a log handler, all the work done to handle that is wasted. It's unclear how this can be handled; it's also probably not a big deal because projects generally shouldn't be outputting that much on the first frame.
  • ~~The callback must be threadsafe, but will generally work pretty much fine even if it's not threadsafe. This is a perfect recipe for introducing subtle threading issues in people's code. I'm not sure how to fix this - providing the message immediately is important to me, and I haven't come up with a way to coherently single-thread that process without introducing performance issues and long message delays.~~ Solved by splitting the callback into Nonthreadsafe and Buffered versions.
  • ~~There is currently no way to attach more than one callback, nor is there a way to remove a callback.~~ Solved.
  • ~~If multiple callbacks were supported, the current design would either mean that the second callback hook didn't get the full debug log replay, or it would require some further engineering work to support that.~~ Solved.
  • If the error is caused by a C# exception, and you're setting a callback in C#, then this doesn't provide access to the actual C# exception object. It could! I don't personally need this functionality but it's a pretty obvious thing to add later if there's demand.
  • The whole "pass data as a dictionary of variants" approach is pretty gnarly. It's something used in a lot of places already - see PhysicsDirectSpaceState3D.get_rest_info - but I still don't like it. Still, when in Rome.

Things that do in fact need to be fixed before commit:

  • ~~The documentation is lacking. (I'll fix this if the design is approved in general.)~~ Solved.

Edit: Related feature proposals:

  • https://github.com/godotengine/godot-proposals/issues/1896

  • https://github.com/godotengine/godot-proposals/issues/3062

  • https://github.com/godotengine/godot-proposals/issues/8964

  • Bugsquad edit, closes: https://github.com/godotengine/godot-proposals/issues/8964

zorbathut avatar Jan 25 '24 11:01 zorbathut

Why do you want to record log messages in-game, anyway?

A bunch of answers here.

First, my general experience is that in every engine, errors regularly get unnoticed by people. People don't look at the error log unless they already know there's something posted, and people never expect an error. I've seen this happen in Unity, Unreal, Godot, and custom engines; it's just a universal truth and I've given up convincing people to look at the error log on their own, they just don't do it.

So this is being used for a system that pops up an error dialog when an error shows up. It pops up in the game viewport, which makes it impossible to ignore (the other effective solution I've seen is a modal popup.) Example:

image

The funny part is that when I originally implemented this, I realized this particular error had been happening for a few days and I hadn't even noticed it while working on the error reporting system.

Second, testers need to be aware of errors also, and will often be testing on the fully-built game (otherwise you're just begging for release-only errors, which will happen.) They need some visible error log, and that means I need to be able to reliably record errors to show ingame.

Third, stuff like Sentry is very convenient for reporting errors to a central server. But you still need a way to get those errors, and right now there's no way besides modifying the engine code. This is a general solution for retrieving all messages so you can do whatever you like with them.

Fourth, I'm setting this system up for mod support. Modders don't have the full game engine tools available either, and they still need access to errors. In addition, their players need access to errors. Note that all of this is modeled off Rimworld's Developer Mode, which is considered a critical feature for mod developers:

image

The end result of all of this is that in-game error reporting is really really handy.

zorbathut avatar Jan 25 '24 11:01 zorbathut

Please open a feature proposal to track this feature and the demand/ideas 🙂

AThousandShips avatar Jan 25 '24 11:01 AThousandShips

@zorbathut thank you for the ping! I agree with everything you wrote. Can't speak for the code unfortunately. Will try to understand how to test this PR, it's a first for me, but i think this is very important, so I'll try to put in the effort

@AThousandShips

Please open a feature proposal to track this feature and the demand/ideas 🙂

There are already some existing proposals afaik: a possible candidate is https://github.com/godotengine/godot-proposals/issues/1896 or maybe https://github.com/godotengine/godot-proposals/issues/3062

farfalk avatar Jan 25 '24 11:01 farfalk

Would have noticed them if they were listed in the original PR 🙂, or if OP had linked to them instead

AThousandShips avatar Jan 25 '24 11:01 AThousandShips

Would have noticed them if they were listed in the original PR 🙂, or if OP had linked to them instead

No worries! Luckily we're a nice community, so we can compensate for each other's shortcomings 💪

farfalk avatar Jan 25 '24 11:01 farfalk

I did look through the proposals for anything that matched but didn't see it through skimming - unfortunately there aren't great keywords, "log" gives a huge number of results and I guess I didn't choose any better keywords. But thanks for the links, though, at least someone managed to find them :D

zorbathut avatar Jan 25 '24 11:01 zorbathut

I did look through the proposals for anything that matched but didn't see it through skimming - unfortunately there aren't great keywords, "log" gives a huge number of results and I guess I didn't choose any better keywords. But thanks for the links, though, at least someone managed to find them :D

Yeah well, i've been lurking around this specific logging issue for a while, so I remember the most promising proposals and PRs by heart 😅

Maybe could be useful to cross-reference this with some or all the proposals I linked above, with the old PR and with some other related famous ones, like https://github.com/godotengine/godot/pull/56014

farfalk avatar Jan 25 '24 12:01 farfalk

Proposals linked in the intro :)

As much as I'd love to be able to handle stuff like Breakpad, this kinda doesn't - it just handles errors, not crashes, and true crash handling is tough because the environment is - sort of by definition - in a broken state. Trying to enter the .NET runtime might just crash it further. I think practically that would have to be a C++ interface just to minimize the cross-contamination with the rest of the code.

This is also something I might do someday but it may take a while :)

zorbathut avatar Jan 25 '24 12:01 zorbathut

Just want to add to this, that having an in-game log pad / in-game message capture mechanism allows developers to early-report these errors under release builds, without delaying until they actually find some part of the functionality is broken and it is too late to pause for debugging/detail checking.

Delsin-Yu avatar Jan 25 '24 12:01 Delsin-Yu

Just to update on some brainstorming . . .

The callback must be threadsafe, but will generally work pretty much fine even if it's not threadsafe. This is a perfect recipe for introducing subtle threading issues in people's code. I'm not sure how to fix this - providing the message immediately is important to me, and I haven't come up with a way to coherently single-thread that process without introducing performance issues and long message delays.

. . . my current best solution is to provide two ways of registering a callback, one Buffered and one Nonthreadsafe. Either as two separate functions or as one function with a two-value enum as a parameter. If it's registered as Buffered, it just stores up messages until they're dispatched all together in the main thread, if it's registered as Nonthreadsafe it has the current behavior.

In theory, this means people will look at the two options and say "oh, a decision! I should understand this decision" instead of either finding themselves annoyed by buffered-by-default or ending up with bizarre crashes thanks to unintentionally choosing nonthreadsafe-by-default.

Better suggestions appreciated, but this seems like a solution that is at least neither actively dangerous nor actively insufficient.

zorbathut avatar Jan 25 '24 15:01 zorbathut

Are you sure OS::add_logger() is the most appropriate way? I'm not familiar with this area, just wondering if you knew there were other functions for registering print/error handlers:

https://github.com/godotengine/godot/blob/5034478611697358f4e135c69f364ef6d8cd54dc/core/string/print_string.h#L56-L57

https://github.com/godotengine/godot/blob/5034478611697358f4e135c69f364ef6d8cd54dc/core/error/error_macros.h#L60-L61

There is a separate mechanism for the debugger, although this is probably less relevant since it is only applicable in debug builds. Finally there is EditorToaster (there is currently no way to send a toast in scripting, nor to add a handler). It is not necessary to add all at once, but it is worth considering in terms of compatibility/consistency for follow-up PRs.

Also, I'm not sure if there are differences between the Output Log and the Debugger > Errors tab, and there may be other logging/error/notification systems that I'm not aware of. I can't judge the implementation, just trying to add useful input.

dalexeev avatar Jan 25 '24 16:01 dalexeev

It will not surprise me at all if I'm missing stuff :)

add_error_handler

This one seems to function as a fallback for printing errors if the OS singleton doesn't yet exist. Right now I'm starting right after the OS singleton is initialized, so it's not necessary, but by using this I might be able to buffer even earlier.

I'm tempted to look into this one, but it's also going to be a pain because with the current API I'll have to detach the error_handler event when I attach to the OS singleton in order to avoid getting double-warnings.

add_print_handler

This actually does route through the OS singleton and would crash if the OS singleton wasn't initialized, so I can be certain I'm not losing anything by not registering this one. However, it's also set up to silence itself if application/run/disable_stdout is enabled. It's unclear to me if it should - it's arguable that, with that enabled, we should be disabling the stdout but not the part that records logs for internal usage. This would all be a pretty big architectural change, plus require some user intent decisions, and is something I'd probably prefer to leave up to a later change (I'm going to solve this for myself by just not turning that flag on :V)

There is a separate mechanism for the debugger, although this is probably less relevant since it is only applicable in debug builds. Finally there is EditorToaster (there is currently no way to send a toast in scripting, nor to add a handler). It is not necessary to add all at once, but it is worth considering in terms of compatibility/consistency for follow-up PRs.

Yep, that sounds reasonable to me too. I kinda-sorta feel like some more unified Single Way To Do Logging might be the right approach, but that's a much heavier-weight refactor than I am currently up for.

That might be the appropriate time to solve the add_error_handler issue, though.

One nice part about the design I currently have is that all of this would work without API changes; type is just a string, so if we end up adding "debugger" and "editorToast" types, that's totally fine. And extending the buffer earlier won't cause problems, it's all just juggling stuff behind the scenes.

zorbathut avatar Jan 25 '24 16:01 zorbathut

I second the callback thing, was going to suggest this myself, at least to explore the option of adding support for registering a custom callback to the existing logging output, it catches everything that currently exists with various data provided

AThousandShips avatar Jan 25 '24 16:01 AThousandShips

Alright, so it looks like the TODO list is:

  • ~~Better documentation~~ Solved.
  • ~~Make the intercept system shut down more thoroughly if nothing is hooked and it's after frame 0~~ Solved.
  • ~~Split the callback into NonThreadSafe and Buffered, implement Buffered~~ Solved.

A lot of the other potential improvements (start recording earlier, record more types of message) wouldn't involve breaking backwards compatibility and can be done later.

Any way I can attract someone with the authority to say "yes, assuming you do this competently we would be happy to integrate it"? Or to provide a different set of changes that need to be made?

zorbathut avatar Jan 26 '24 11:01 zorbathut

Big update time.

I've made the proposed changes and fleshed out the documentation. I also squashed and rebased the entire thing because the history was frankly misleading; if anyone wants to see the original, it's in https://github.com/zorbathut/godot/commits/pr_loghook_old/ .

The threading work is finicky, because threading work is always finicky. I've done my best to document everything, especially the subtle things.

I've tested this a bunch. It is intrinsically fiddly, though, and it won't surprise me if there are things I didn't think of. I can't find any issues but more testing is always appreciated.

As far as I am concerned, this is ready to go (pending review). Note that I considered the last two revisions not ready to go. This revision is ready to go!

zorbathut avatar Feb 17 '24 14:02 zorbathut

If a project spews a ton of log messages out before the first frame, then never registers a log handler, all the work done to handle that is wasted. It's unclear how this can be handled; it's also probably not a big deal because projects generally shouldn't be outputting that much on the first frame.

How noticeable is this if at all? Will it impact --headless or Server apps?

Swarkin avatar Feb 17 '24 22:02 Swarkin

Will it impact --headless or Server apps?

Whoops! Good catch. It would have logged stuff forever and perpetually leaked memory. Fixed now.

How noticeable is this if at all?

I haven't benchmarked it, but for each log message it's basically creating a Dictionary wrapped in a Variant, with (at the worst case) six key/value pairs. And that's two log messages reported at startup, usually. So, maybe thirty small allocations?

I'd expect this to be in the single-microsecond range. And remember this is once at startup, not per-frame. Stuff like "parsing startup config" is going to be orders of magnitude larger.

zorbathut avatar Feb 19 '24 14:02 zorbathut

Changes made! Let me know if there's anything else needed.

We should test this on mobile/web platforms too to make sure there's nothing unexpected popping up, especially as the web platform can run without support for multiple threads now (to avoid the need for cross-origin isolation headers).

I'm not going to have access to mobile dev for a week, but I'll see if I can get web working.

zorbathut avatar Feb 19 '24 23:02 zorbathut

Works on web, at least!

If someone else can tackle mobile it'd be appreciated, otherwise, probably Sunday or Monday.

zorbathut avatar Feb 20 '24 00:02 zorbathut

Works on Android!

I don't have a good way to do iOS testing, unfortunately.

zorbathut avatar Feb 26 '24 11:02 zorbathut

Just for the record, I currently have no pending tasks I'm aware of on this; all requested changes and testing have been completed. I'm basically just waiting for reviewers to have the spare time to check it out.

If reviewers are waiting on me to do something, let me know what :)

zorbathut avatar Feb 29 '24 09:02 zorbathut

(should've done this update a little differently, sorry for the clutter, will have it solved shortly)

zorbathut avatar Feb 29 '24 10:02 zorbathut

If you like I can add as suggestions to make it easier :)

AThousandShips avatar Feb 29 '24 10:02 AThousandShips

Alright, think that's all the pending change requests again :) Not sure why the automated tests haven't kicked off but I'd guess that's either on Github or on your end.

zorbathut avatar Feb 29 '24 10:02 zorbathut

Probably because you made changes in quick succession, unsure, will look

AThousandShips avatar Feb 29 '24 10:02 AThousandShips

Status update: I again have no pending tasks and am waiting entirely on reviewers. Ball's in your court! Let me know if it ends up back in mine :)

zorbathut avatar Mar 05 '24 07:03 zorbathut

Would love to see this included in the upcoming release, looking forward to it, it plugs a major gap in Godot 4.

Thanks for the work @zorbathut

maxrun-games avatar Mar 19 '24 04:03 maxrun-games

Tested and works great in windows platform. Waiting for it to be in next release.

@zorbathut great work. Would be possible to share your code for the Debug Log window?

conde2 avatar Mar 20 '24 00:03 conde2

Is this going to be considered in time for 4.3 stable?

If not, what's blocking it? Are there other logging-related issues or PRs that need to be managed together with this one before it gets the green light?

Just trying to assess some kind of general timeline for the availability of the feature

farfalk avatar Apr 06 '24 07:04 farfalk

@zorbathut great work. Would be possible to share your code for the Debug Log window?

I looked into extracting my code cleanly and determined that it would unfortunately be really hard to do well.

So I did it badly! It's available over here. You will need to do some cleanup in order to make it work; if you do that, please send a pull request, thanks :D

zorbathut avatar Apr 06 '24 10:04 zorbathut