beets icon indicating copy to clipboard operation
beets copied to clipboard

Spotify plugin overwrites the MB IDs even if we have the correct ones

Open arsaboo opened this issue 1 year ago • 6 comments

The Spotify plugin (this may be true for other plugins, but I have not checked them all) currently populates all the fields including the mb_album_id and mb_artist_id with spotify_album_id and spotify_artist_id. This makes sense during the initial import (or when we are only using Spotify as the tag source) when we don't have the MB ids and we need some IDs to be populated. However, if we initially imported the album with MusicBrainz plugin and the MB ids are correctly populated, Spotify (or any other plugin) should not overwrite the correct MB IDs.

Basically, any plugin should check if the MB ID fields are empty (i.e., initial import) and only then populate those fields. If MB IDs are already present, the plugin should only update the plugin-specific fields. Thus, if Spotify tagger is used after the album was already imported using MB, Spotify should only update the Spotify-specific fields and not incorrectly change the MB ids to Spotify IDs.

arsaboo avatar Sep 14 '22 12:09 arsaboo

Hello! I think this is roughly the same problem as #604… namely, we would like non-MB sources to not ever overwrite the MBID fields with non-MBID data. This just requires one trick to be solved: namely, fixing the way we collapse duplicate results. But given a solution there, we could just stop "polluting" this field altogether, as the original issue puts it.

sampsyo avatar Sep 14 '22 23:09 sampsyo

While the bigger issue should be resolved, I actually had something simpler in mind. It may be ok to write the mbid fields during the initial import if non-MB tag source is being used as we use mbid as identifier in beets. However, once we have a MBid (can be checked since they have a specific format), non-mbid taggers should not overwrite them. This will require least disruption. Can we check if mbid is populated and then decide whether to overwrite it or not?

arsaboo avatar Sep 15 '22 02:09 arsaboo

Hmm… that may be possible, but it could be surprisingly tricky. The reason is that backends supply AlbumInfo and TrackInfo objects containing the metadata to apply, and then the beets.autotag module takes care of blindly transferring that data into the actual database. There's not currently a way for a TrackInfo object to say "use this ID for disambiguation, but not actually for applying to the database." It sorta seems like we could resolve the general version of the problem just as easily as that…

sampsyo avatar Sep 15 '22 03:09 sampsyo

It may be fine for the plugins to send all the information in the AlbumInfo or TrackInfo objects if the beets.autotag is able to pick the right information to be applied to the database. So, for example, we could add a check here before updating mb_albumid

https://github.com/beetbox/beets/blob/44a7cc74bfe673bcf960c5dc18716b38e33d5ad0/beets/autotag/init.py#L170

arsaboo avatar Sep 15 '22 13:09 arsaboo

Yes, indeed, that would be the place to do it. But we'd need to plumb through some kind of information that says "this album_id is unreliable; please don't use it for real." At that point, I think the effort is about the same as just fixing the underlying issue—that is, separating the mb_albumid field from a different field that is used solely for disambiguation.

sampsyo avatar Sep 16 '22 00:09 sampsyo

I am not even sure how to handle this, so I am going to leave this issue open for now. But given the number of open issues/questions related to this topic, I sincerely hope that we address this issue sooner than later.

arsaboo avatar Sep 19 '22 15:09 arsaboo