beets icon indicating copy to clipboard operation
beets copied to clipboard

Safely handle metadata plugin exceptions.

Open semohr opened this issue 4 months ago • 18 comments

Description

When a metadata plugin raises an exception during the auto-tagger process, the entire operation crashes. This behavior is not desirable, since metadata lookups can legitimately fail for various reasons (e.g., temporary API downtime, network issues, or offline usage).

This PR introduces a safeguard by adding general exception handling around metadata plugin calls. Instead of causing the whole process to fail, exceptions from individual plugins are now caught and logged. This ensures that the auto-tagger continues to function with the remaining available metadata sources. I used a proxy pattern here as this seems like an elegant solution to me.

This replaces the efforts from #5910

Decisions needed

How do we want to name the configuration option which controls if an error should be propagated and also where/how do we want this to be documented?

Todos:

  • [x] Changelog.
  • [ ] Documetnation.

semohr avatar Aug 26 '25 16:08 semohr

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

github-actions[bot] avatar Aug 26 '25 16:08 github-actions[bot]

Codecov Report

:x: Patch coverage is 89.36170% with 5 lines in your changes missing coverage. Please review. :white_check_mark: Project coverage is 67.42%. Comparing base (29b9958) to head (a69f575). :white_check_mark: All tests successful. No failed tests found.

Files with missing lines Patch % Lines
beets/metadata_plugins.py 89.36% 2 Missing and 3 partials :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5965      +/-   ##
==========================================
+ Coverage   67.34%   67.42%   +0.08%     
==========================================
  Files         136      136              
  Lines       18492    18534      +42     
  Branches     3134     3137       +3     
==========================================
+ Hits        12453    12497      +44     
+ Misses       5370     5365       -5     
- Partials      669      672       +3     
Files with missing lines Coverage Δ
beets/metadata_plugins.py 84.07% <89.36%> (+7.55%) :arrow_up:

... and 1 file with indirect coverage changes

:rocket: New features to boost your workflow:
  • :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

codecov[bot] avatar Aug 26 '25 17:08 codecov[bot]

Related:

  • #5553 will benefit from this
  • #4789 will not crash anymore but be skipped
  • #5903 Network errors will log warning but the import will continue

semohr avatar Aug 26 '25 19:08 semohr

Reviewer's Guide

This PR implements robust exception handling around metadata plugin calls by introducing safe-call and safe-yield helpers with logging, updating all plugin invocations to use these wrappers, refining related type signatures, and adding tests and a changelog entry to validate and document the new behavior.

Flow diagram for exception handling in plugin calls

flowchart TD
A["AutoTagger calls plugin method"] --> B{Exception raised?}
B -- Yes --> C["Log error"]
C --> D["Continue to next plugin"]
B -- No --> E["Process plugin result"]
E --> D

File-Level Changes

Change Details Files
Introduce safe-call and safe-yield wrappers for plugin invocations
  • Add global logger instantiation
  • Define _safe_call, _safe_yield_from, and _class_name_from_method helpers
  • Wrap all metadata plugin calls (candidates, item_candidates, album_for_id, track_for_id, track_distance, album_distance) with the new wrappers
beets/metadata_plugins.py
Refine function signatures and generics for metadata methods
  • Import ParamSpec and TypeVar and reorganize autorestag imports
  • Change candidates return type to Iterator
  • Switch batch lookup parameters from Sequence to Iterable
  • Rename generic type parameter from R to Res for IDResponse
beets/metadata_plugins.py
Update changelog
  • Add entry describing the new exception-handling behavior
docs/changelog.rst
Add tests for plugin exception handling
  • Create test_metadata_plugins.py with a mock plugin that raises errors
  • Verify that errors are caught and logged without crashing the process
test/test_metadata_plugins.py

Possibly linked issues

  • #1: PR introduces general exception handling for metadata plugins, directly resolving the issue of import process abortion due to network errors.

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an issue from a review comment by replying to it. You can also reply to a review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull request title to generate a title at any time. You can also comment @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in the pull request body to generate a PR summary at any time exactly where you want it. You can also comment @sourcery-ai summary on the pull request to (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the pull request to resolve all Sourcery comments. Useful if you've already addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull request to dismiss all existing Sourcery reviews. Especially useful if you want to start fresh with a new review - don't forget to comment @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

  • Contact our support team for questions or feedback.
  • Visit our documentation for detailed guides and information.
  • Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

sourcery-ai[bot] avatar Sep 09 '25 13:09 sourcery-ai[bot]

It does feel like we should pause the import until the user has acknowledged the issue (maybe providing options to skip importing the album or continue with existing metadata?), as otherwise you could end up with missing metadata that is awkward to fix later if for example you had network issues or the provider was down.

Not sure how this should interact with --quiet and --quiet-fallback though.

jackwilsdon avatar Sep 10 '25 09:09 jackwilsdon

A prompt makes sense in theory, but I'm really hesitant to add any user interaction because it breaks headless scripts, automation, and make monkey-patching way harder which is a big part of my beets use case. Instead, I'm leaning towards adding a simple config option that lets you tell the plugin to just fail hard if it runs into trouble, so your import would stop and you'd know right away.

On a practical level, I'm also not convinced that skipping these errors is a major issue in the first place. The plugin's job is to suggest candidates, and it's already robust in how it handles failure: if it can't find any candidates, it just doesn't add any, and the import continues with other plugins. Same goes for get_by_id, it's designed to return None, so the higher-level functions are already built to handle that. Imo the introduced behavior should be good enough for most users.

semohr avatar Sep 10 '25 10:09 semohr

Im not too happy with the duplicate _safe_call and _safe_yield functions. I would be happy to get another idea here on how to make this a bit more convenient and maintainable.

We also need a config option to re-enable to old behavior before I'm comfortable enough to merge this.

semohr avatar Nov 03 '25 13:11 semohr

@henry-oberholtzer I rethought my approach here a bit and opted to use a proxy pattern instead. Seems a bit more clear and maintainable imo.

semohr avatar Nov 03 '25 16:11 semohr

@snejus Would you mind having a look here too? I would be happy to know your thoughts on this. Im pretty happy with the general implementation now and can finalize this if we are happy with the approach.

I like the proxy approach (while a bit complex) it is very unintrusive, only hooking into our existing logic at one place.

semohr avatar Nov 06 '25 10:11 semohr

I'm not the most familiar with proxies, but this looks good to me, same logic, neater application.

henry-oberholtzer avatar Nov 07 '25 22:11 henry-oberholtzer

@semohr As far as I remember this was initially implemented using decorator pattern. Why did you switch to the proxy pattern? I'm most likely missing context, but, I think, the decorator pattern would allow this to be type-checked?

snejus avatar Nov 11 '25 14:11 snejus

Typechecking should work either way with the small mypy hack.

Here’s a brief recap of my reasoning:

One challenge with using a decorator is that it needs to be applied to each plugin individually. Simply wrapping a main function like def album_for_id(...) isn’t sufficient, because it would prevent other plugins from continuing the auto-tagging process if an error occurs. I initially tried something like:

if info := _safe_call(plugin.album_for_id)(_id):
    ...

…but I found this approach somewhat intrusive and not very Pythonic. It is needed in every plugin wrapper call.

Another complication arises from using yield form, which requires additional decorator logic to handle the generator (I used _safe_yield_from here). This approach would end up duplicating a lot of the decorator code. In theory, we could combine the “yield” and “call” decorators, but that would require overload typing and fairly convoluted inspection logic to determine whether the return value supports yield from syntax.

Given this, I decided to try a proxy pattern and am a bit more happy with it.

semohr avatar Nov 11 '25 15:11 semohr

One challenge with using a decorator is that it needs to be applied to each plugin individually

Can we not use the metadata_plugins.candidates, metadata_plugins.item_candidates etc for this?

snejus avatar Nov 12 '25 00:11 snejus

Im not too sure what you mean. The methods are overwritten by each plugins implementation as they are abstract. You cant apply the decorator directly, you always need to wrap the functions dynamically.

As mentioned in the comment above something like this should work:

def item_candidates(*args,**kwargs):
    For plugin in find_plugins():
         yield from _safe_yield(plugin.item_candidates)(*args, **kwargs)

semohr avatar Nov 12 '25 09:11 semohr

I'm thinking something in these lines indeed

def _safe_call(
    plugin: MetadataSourcePlugin,
    method_name: str,
    *args,
    **kwargs,
):
    """Safely call a plugin method, catching and logging exceptions."""
    try:
        method = getattr(plugin, method_name)
        return method(*args, **kwargs)
    except Exception as e:
        log.error(
            "Error in '{}.{}': {}",
            plugin.data_source,
            method_name,
            e,
        )
        log.debug("Exception details:", exc_info=True)
        return None


@notify_info_yielded("albuminfo_received")
def candidates(*args, **kwargs) -> Iterable[AlbumInfo]:
    """Return matching album candidates from all metadata source plugins."""
    for plugin in find_metadata_source_plugins():
        if result := _safe_call(plugin, "candidates", *args, **kwargs):
            yield from result

Another question: does raise_an_error configuration provide any value? I'd be happy to always handle these exceptions.

snejus avatar Nov 12 '25 09:11 snejus

This is mostly how i had it before. This does not catch the errors during the yield and handling this properly requires a bit more work. Also typing is stripped here.

I think retaining the initial behavior is wanted, sometimes one doesn't want to continue if an error occurs as some crucial lookup (for some) might be missing. See also https://github.com/beetbox/beets/pull/5965#issuecomment-3274063030

semohr avatar Nov 12 '25 10:11 semohr

This is mostly how i had it before. This does not catch the errors during the yield and handling this properly requires a bit more work. Also typing is stripped here.

A pure decorator-based approach may be the way to go forward, then:

@contextmanager
def handle_plugin_error(method_name: str):
    """Safely call a plugin method, catching and logging exceptions."""
    try:
        yield
    except Exception as e:
        log.error("Error in '{}': {}", method_name, e)
        log.debug("Exception details:", exc_info=True)


def safe_yield(func: IterF[P, Ret]) -> IterF[P, Ret]:
    @wraps(func)
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> Iterable[Ret]:
        with handle_plugin_error(func.__qualname__):
            yield from func(*args, **kwargs)

    return wrapper


@notify_info_yielded("albuminfo_received")
@safe_yield
def candidates(*args, **kwargs) -> Iterable[AlbumInfo]:
    """Return matching album candidates from all metadata source plugins."""
    for plugin in find_metadata_source_plugins():
        yield from plugin.candidates(*args, **kwargs)

To make sure this decorator is suitable for album_for_id and track_for_id, consider using plugin.albums_for_ids and plugin.tracks_for_ids inside those functions. This way, you can also wrap them up notify_info_yielded accordingly.

I think retaining the initial behavior is wanted, sometimes one doesn't want to continue if an error occurs as some crucial lookup (for some) might be missing. See also #5965 (comment)

Presumably, this depends on us handling errors somewhere? Otherwise, right now, an unhandled plugin error kills beets execution.

snejus avatar Nov 15 '25 14:11 snejus

Presumably, this depends on us handling errors somewhere? Otherwise, right now, an unhandled plugin error kills beets execution.

This seems like the desired behavior in that case for me. E.g. I want the process to exit completely if musicbrainz is currently down.

To make sure this decorator is suitable for album_for_id and track_for_id, consider using plugin.albums_for_ids and plugin.tracks_for_ids inside those functions. This way, you can also wrap them up notify_info_yielded accordingly.

The issue with that approach is that it does not log the plugin name correctly (here the method_name would be metada_plugins.candidates). Imo it is quite important for the error message to include the actual name of the plugin which has issues. I have really tried quite a few approaches here and this will even get more complex ;)

I can try to restore my initial decorator based approach but it is way more complex and more intrusive. I think the proxy based approach is more maintainable.

semohr avatar Nov 17 '25 11:11 semohr