beets icon indicating copy to clipboard operation
beets copied to clipboard

Discogs polluting mb_albumid

Open a9y opened this issue 10 years ago • 12 comments

The discogs plugin sets album_id of the AlbumInfo class to a discogs identifier, which beets.autotag.apply_metadata assigns to mb_albumid. The problem is that everywhere else it's assumed to be a MusicBrainz identifier.

a9y avatar Mar 11 '14 22:03 a9y

This is a great point; reusing the MBID fields is an abuse that can cause weird problems. We should instead set a flexible attribute. Alas, we currently don't have a mechanism for setting flexible attributes from metadata matches, but we should definitely add this. (In fact, doing so could simplify the AlbumInfo/TrackInfo types and related construction/application logic.)

sampsyo avatar Mar 11 '14 22:03 sampsyo

I played around with a few solutions, but none were ideal.

The first was to store data_source as a flexible attribute, and then rename the mb_XXX fields to something more abstract like ds_XXX. The user then knows to check the value of data_source before working with those fields. The downside here is that it would require a database migration, and any external plugins that used those fields would stop working.

The second idea was to change AlbumInfo and TrackInfo to look more like this (not tested):

class AlbumInfo(object):
    def __init__(self, **kwargs):
        values = {}
        values.update(kwargs)
        object.__setattr__(self, 'values', values)

    def __getattr__(self, item):
        return self.values.get(item, None)

    def __setattr__(self, key, value):
        self.values[key] = value

In this scenario the data-source plugins could set their own fields (to be stored as flexible attributes) when instantiating AlbumInfo. Discogs could set di_albumid, di_trackid, ... Beatport could set bp_albumid, bp_trackid, ...

The advantage here is that you're not limited to just the one data-source.

a9y avatar Mar 13 '14 00:03 a9y

Absolutely; the second approach is definitely the way to go! The AlbumInfo and TrackInfo objects should be able to carry along arbitrary field settings that are then applied in apply_metadata. The only trick is that they should also have some fields that are not just dict keys because some fields are required or need special handling—see the apply_metadata code for those fields.

sampsyo avatar Mar 13 '14 04:03 sampsyo

I just wanted to add that this is also true for the artist_id. If an album is identified via discogs, it will be tagged with discogs' artist-id. I stumbled upon this because when I import my library into Ampache, it will frequently have the same artist twice - one with the Musicbrainz id, and one with Discogs', and I have to manually fix that.

dengste avatar Aug 29 '15 06:08 dengste

It's also worth noting that the Dicogs data source does not supply mb_trackid information, so (by default) the Duplicates plugin will flag anything that shares null entries for this field as duplicate. Meaning, all Discogs-tagged tracks will be flagged as a duplicate.

The workaround is to additionally add a third parameter to the Duplicate plugin to lastly analyze Title.

skupjoe avatar Feb 23 '17 02:02 skupjoe

@sampsyo Asked me to leave my thoughts here. #2763

My suggestion is to create the following fields for Discogs (TXXX is customizable so that wouldn't be a problem).

Old New
TXXX:MusicBrainz Album Artist Id=1560798 TXXX:Discogs Album Artist Id=1560798
TXXX:MusicBrainz Album Id=3175577 TXXX:Discogs Album Id=3175577
TXXX:MusicBrainz Album Release Country=Netherlands TXXX:Discogs Album Release Country=Netherlands
TXXX:MusicBrainz Album Type=Album TXXX:Discogs Album Type=Album
TXXX:MusicBrainz Artist Id=1560798 TXXX:Discogs Artist Id=1560798

nicolahinssen avatar Dec 19 '17 15:12 nicolahinssen

Agreed with @nicolahinssen on adding special fields for Discogs. This is a feature I would love to see.

More generally, it will be great to store metadata from other sources (Bandcamp, Beatport, etc.) in the same way. The plugins retrieving the metadata should be able to define special fields themselves. Probably there won't be a consensus on naming of the fields, therefore this naming could be configurable by a user.

dbogdanov avatar Apr 04 '18 12:04 dbogdanov

The only trick is that they should also have some fields that are not just dict keys because some fields are required or need special handling—see the apply_metadata code for those fields.

Should then all AlbumInfo and TrackInfo fields be kept, while adding a dict too?

dbogdanov avatar Apr 29 '18 18:04 dbogdanov

I would actually be in favor of abandoning all the hand-coded attributes and just moving over complete to an unstructured dict-like object. It could simplify changes in the future quite a bit, and I don't think we get anything out of the current approach to the built-in set of fields.

sampsyo avatar Apr 29 '18 22:04 sampsyo

This is causing issues with other software (funkwhale) relying on the MB UUID format. They expect the fields to be valid UUIDs, not integers:

image

bastidest avatar May 11 '20 08:05 bastidest

Hello, It seems this bug is still here. It has weird side effects: I discovered it after digging why absubmit fails for almost my whole library. The explanation is absubmit needs an UUID to build the submission URL. I guess it affects a few other plug-ins.

Maybe the simplest solution would be setting flexible attributes, as previously suggested

Discogs could set di_albumid, di_trackid, ... Beatport could set bp_albumid, bp_trackid, ...

?

martinkirch avatar May 11 '21 21:05 martinkirch

Another use case that I'm running into is tagging an audiobook collection with metadata from Audible. Because of the hard-coded fields, I don't think its possible for me to store additional metadata that is not already defined in one of the existing fields.

Wondering if there are any plans to fix this in the near / mid term?

Edit: I realized my comment may not fit this issue, perhaps I should move it to #2866 instead.

Neurrone avatar Feb 03 '22 14:02 Neurrone