beets icon indicating copy to clipboard operation
beets copied to clipboard

Support multiple pseudo-releases and reimport in mbpseudo

Open asardaes opened this issue 3 months ago • 1 comments

Description

Hi again, today I tried to reimport an album to see if I could get the data from its pseudo-release and I realized there were a couple of things that could be improved.

First, determination of the best ref in PseudoAlbumInfo was not triggered during reimport because the flow is slightly different. To fix that I added a plugin method before_album_info_emitted that is explicitly called for metadata plugins. ~figured I could use the albuminfo_received event if I modified it slightly to add the Iterable[Item] to the event. I feel like this is ok because those 2 parameters are closely related, but let me know what you think; it seems no other plugin was using the event so far.~ See also my comment below.

Second, multiple pseudo-releases (example) were not really supported since the plugin always took the first one from the list, so I adjusted the code to compute some priority depending on the import languages from the config file.

Example where jp was first in the language list:

Finding tags for album "YOASOBI - 夜に駆ける".
  Candidates:
  1. (97.6%) YOASOBI - Yoru ni Kakeru
             ≠ media
             MusicBrainz, Digital Media, 2019, XW, None, YOASOBI-001, None
  2. (97.6%) YOASOBI - Racing Into The Night
             ≠ media
             MusicBrainz, Digital Media, 2019, XW, None, YOASOBI-001, None
  3. (97.6%) YOASOBI - 夜に駆ける
             ≠ media
             MusicBrainz, Digital Media, 2019, XW, None, YOASOBI-001, None
  • [x] Documentation. (If you've added a new command-line flag, for example, find the appropriate page under docs/ to describe it.)
  • [x] Changelog. (Add an entry to docs/changelog.rst to the bottom of one of the lists near the top of the document.)
  • [x] Tests. (Very much encouraged but not strictly required.)

asardaes avatar Nov 15 '25 00:11 asardaes

Codecov Report

:x: Patch coverage is 66.23377% with 26 lines in your changes missing coverage. Please review. :white_check_mark: Project coverage is 67.92%. Comparing base (2bd77b9) to head (800eef8). :white_check_mark: All tests successful. No failed tests found.

Files with missing lines Patch % Lines
beetsplug/mbpseudo.py 70.14% 16 Missing and 4 partials :warning:
beets/metadata_plugins.py 42.85% 4 Missing :warning:
beets/autotag/match.py 50.00% 0 Missing and 1 partial :warning:
beetsplug/missing.py 0.00% 1 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6163      +/-   ##
==========================================
- Coverage   67.93%   67.92%   -0.01%     
==========================================
  Files         137      137              
  Lines       18677    18730      +53     
  Branches     3155     3166      +11     
==========================================
+ Hits        12688    12723      +35     
- Misses       5324     5338      +14     
- Partials      665      669       +4     
Files with missing lines Coverage Δ
beetsplug/mbsync.py 81.81% <ø> (ø)
beets/autotag/match.py 76.92% <50.00%> (ø)
beetsplug/missing.py 33.33% <0.00%> (ø)
beets/metadata_plugins.py 74.78% <42.85%> (-1.74%) :arrow_down:
beetsplug/mbpseudo.py 77.25% <70.14%> (-2.38%) :arrow_down:
:rocket: New features to boost your workflow:
  • :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

codecov[bot] avatar Nov 15 '25 00:11 codecov[bot]

@asardaes I'm finding that the way this plugin is designed is causing issues in

  1. https://github.com/beetbox/beets/pull/6052 where I'm struggling to rebase this work on master because this plugin is so tightly coupled to musicbrainz but is defined as a separate plugin
  2. The way it dynamically adjusts AlbumMatch prevents proper refactoring of tagging functionality, see https://github.com/beetbox/beets/pull/6184.

Therefore, it needs to be restructured. From what I've seen, it seems that this functionality should be part of the existing musicbrainz plugin (and toggled/enabled using the configuration). We should probably consider creating a musicbrainz directory under beetsplug to organise functionality related to MB more effectively:

beetsplug/
-├── musicbrainz.py
+└── musicbrainz/
+    ├── mbpseudo.py
+    └── musicbrainz.py

I already started working on this and will get on it next once I'm finished with #6052, so for now please do not make significant changes in this plugin, thanks!


If you're interested I asked AI to review this plugin architecture:

Key architectural concerns in mbpseudo
  • Mutually exclusive design: _on_plugins_loaded raises if the standard musicbrainz plugin is enabled, meaning mbpseudo reimplements large chunks of MusicBrainz logic instead of composing with, or extending, the existing plugin. This tight coupling makes maintenance difficult (any upstream change must be duplicated) and prevents users from mixing behaviors that might otherwise be complementary.

  • Deep reach into MusicBrainz internals: The plugin calls private helpers like _merge_pseudo_and_actual_album and _preferred_alias, taps musicbrainzngs.get_release_by_id directly, and overrides both candidates() and album_info(). Because it relies on internal, non-public functions and data structures, even minor upstream refactors can break mbpseudo. A more stable architecture would expose an explicit hook or shared service for "alternate releases" rather than patching the full candidate pipeline.

  • Complex stateful AlbumInfo wrapper: PseudoAlbumInfo overrides __getattr__, toggles self.__dict__['_pseudo_source'], mutates mappings after album_matched, and implements custom deepcopy logic. This "dual view" approach works but complicates reasoning: downstream code sees one object whose semantics change depending on hidden flags, which increases the risk of subtle bugs (e.g., caching or serialization issues). A cleaner design would keep pseudo and official releases as separate AlbumInfo instances with explicit metadata about preference rather than switching identity midstream.

  • Configuration/model mismatch: The plugin inserts custom MediaFields for every configured tag at runtime, even though they may already exist or conflict. The custom_tags_only switch branches behavior inside album_info, mutating album or track fields in place. This suggests the architecture should separate "metadata decoration" from "candidate selection" so each concern can evolve independently (e.g., use hook-based transformers to add tags, and a separate strategy object for pseudo-release selection).

Overall, the plugin achieves its goal but does so by tightly grafting onto MusicBrainz internals and using stateful wrappers. A more modular architecture, sharing infrastructural hooks with musicbrainz, exposing a dedicated pseudo-release provider interface, and avoiding mutable proxy objects, would make the feature easier to reason about, test, and maintain.

snejus avatar Dec 19 '25 02:12 snejus

Hi @snejus, what would need to change for #6052? The only call made directly with musicbrainzngs is somewhat abstracted by:

self._release_getter = musicbrainzngs.get_release_by_id

so I would have expected it's easy to swap that if necessary.

As for the other issue, I think it might be fine trying to just merge everything into the main musicbrainz plugin, but I don't know if that's enough to ease the issues with AlbumMatch. Let me know where the issue is, I can try to improve that too.

asardaes avatar Dec 19 '25 08:12 asardaes