
Capture "primary" artist as a separate field

Open dicolanl opened this issue 5 years ago • 32 comments

In beets, when you have a CD credited as "artist featuring artist", there is no way to name the folder after the primary artist. This creates multiple folders, and those then don't get imported into tools like lidarr.

Example (Artist - Album - Album Artist):
As is: 2 Chainz Featuring Drake - Big Amount - 2 Chainz Featuring Drake
Proposed: 2 Chainz Featuring Drake - Big Amount - 2 Chainz

If you look at the tags, the albumartist MBID is for 2 Chainz. The MBID for the album is kept as is, and so is the MBID for the release group.

Beets takes the "Album Artist" for folder naming from the album artist property. But the album artist should be the primary artist if you want other tools like lidarr to manage this better.

It also creates a bunch of extra artist folders.
A possible option would be to use the FtinTitle plugin and mirror it to FtinAlbumArtist.

tag artist release

dicolanl avatar Mar 05 '19 16:03 dicolanl

Interesting! So would you want a new field called something like $primary_artist that only includes the name of the first artist on a track? Even better would be for $artist itself to be treated as a list so you can just index the first one if you want, but that's a pretty complicated change.

sampsyo avatar Mar 05 '19 16:03 sampsyo

I see two options.

  1. Add a primary artist field, which could then be used by your folder naming config. Pro: simple. Con: other tools like media monkey, lidarr, etc. won't understand this field (https://help.mp3tag.de/main_tags.html).

  2. Allow a config option that sets the albumartist field based on the MBID artist. Pro: other tools would understand this. Con: harder to implement. I think I saw another issue indicating that the artist MBID should support multiple artists, like {artistid}/{artistid}. I assume the code would pick the first.

dicolanl avatar Mar 05 '19 16:03 dicolanl

I think we would do the first option. You could then combine this with an orthogonal feature like #488 to get the behavior you want where this affects the "real" artist field.

However, I have to admit that I'm not wild about the idea of a $primary_artist field, because it implies that we'd also want a $secondary_artist and $tertiary_artist—where will the madness end?? Not to mention $albumartist versions, and then also $artist_credit and $albumartist_credit… finding some sort of a clearer way to access artists as "lists" would be much nicer if we can figure it out.

sampsyo avatar Mar 06 '19 01:03 sampsyo

For the 1st option, I think it would be better to implement #1824 (new multi-value artist tag) and use the inline plugin to get the first artist (it's my manual workflow, using the default artists tag)

GuilhermeHideki avatar Apr 30 '19 20:04 GuilhermeHideki

I've been asked to post this here.

There is a plugin called AlbumArtist Extension for musicbrainz picard that allows different formats for the albumartist field. In particular _aaeStdPrimaryAlbumArtist or _aaeCredPrimaryAlbumArtist so the first (primary) album artist would only be used instead of all credited artists. It would be very useful to avoid having a folder structure like this:

[Image: screenshot of the folder structure]

Ideally they would all fall under Xavier Rudd

tehniemer avatar May 16 '19 18:05 tehniemer

As I understand it the core of the question is about directory names.

I wanted to do something similar, mainly to keep the number of artist directories in check. As no out of the box solution was available I resorted back to the inline plugin.

As described above, the MBID for the album is correct

Here is part of my config:

...
paths:
    default: %the{$primary_album_artist}/$original_year-$original_month-$original_day - $album%aunique{} [%upper{$albumtype}] (%upper{$format})/$disc-$track - $artist - $title ($bitrate)

item_fields:
    primary_album_artist: |
        import json
        import requests
        query_url="https://musicbrainz.org/ws/2/artist/{0}?inc=aliases&fmt=json".format(mb_albumartistid)
        r = requests.get(query_url)
        return r.json()["sort-name"]
...

Example:

As an example take the artist Aiobahn

Almost all of his tracks are some kind of collaboration (e.g. Aiobahn feat. nayuta)

The inline function looks at the track MBID album artist and pulls the sort-name.

This results in all of his singles/albums/etc. being placed in a single folder, while keeping the metadata intact. 🎉🎉🎉

I hope this will help someone in the future 😸

ESaller avatar Oct 21 '19 18:10 ESaller

Hi @ESaller, I am very interested in your inline snippet for this!

I have tried pasting your item_fields: block in my config.yaml file as-is, and referencing $primary_album_artist in the following paths:

singleton: $primary_album_artist/00 - Loose Tracks/$artist - $title
default: $primary_album_artist/$original_year - $album [${format}]/$track $title

When I run a beet move to apply the change, I get a bunch of error output which concludes in the following:

  • beetsplug.inline.InlineError: error in inline path field code:
  • Here it quotes the whole "primary_album_artist" code block.
  • KeyError: 'sort-name'

Keeping in mind that I know only a little bit of Python, do you have any suggestions for where I should start looking to find out why this is not working?

scruloose avatar Oct 24 '19 22:10 scruloose

I've done a bit of playing around with this, and I encountered two difficulties with @ESaller's snippet as-is:

  • If the lookup fails, beets crashes.
  • Singletons don't have an associated mb_albumartistid, so they need different handling.

Since my use case includes trying to salvage a lot of mp3s with incorrect/missing tags, many of which are singletons, I have tried to address both of those difficulties by expanding his snippet to this:

  item_fields:
      primary_album_artist: |
          import requests
          # Ask MusicBrainz for the album artist and use its sort-name.
          query_url="https://musicbrainz.org/ws/2/artist/{0}?inc=aliases&fmt=json".format(mb_albumartistid)
          try:
              primary_albumartist= requests.get(query_url).json()["sort-name"]
          except Exception:
              # Lookup failed (network error, bad JSON, missing key): use local data.
              if albumartist_sort:
                  print("MusicBrainz lookup for primary albumartist failed! Falling back on local sort-name.")
                  return albumartist_sort
              else:
                  print("MusicBrainz lookup for primary albumartist failed, and local sort-name not found! Using local artist name as-is.")
                  return albumartist
          else:
              return primary_albumartist

      primary_artist: |
          import requests
          # Same idea for singletons, keyed on the track artist instead.
          query_url="https://musicbrainz.org/ws/2/artist/{0}?inc=aliases&fmt=json".format(mb_artistid)
          try:
              primary_artist= requests.get(query_url).json()["sort-name"]
          except Exception:
              # Lookup failed (network error, bad JSON, missing key): use local data.
              if artist_sort:
                  print("MusicBrainz lookup for primary artist failed! Falling back on local sort-name.")
                  return artist_sort
              else:
                  print("MusicBrainz lookup for primary artist failed, and local sort-name not found! Using local artist name as-is.")
                  return artist
          else:
              return primary_artist

It takes the same essential idea, but adds a non-album-dependent version for singletons, and if the lookup fails, it falls back on the local database's sort-name if available, and the local artist name if all else fails. Unless I've messed up (which is entirely possible) this should never cause beets to crash, always use the best available source, and at worst leave things in the same state it found them.

It's kind of a big mess, it's essentially untested, and I couldn't figure out a way to do it without repeating the whole function, but there it is in case it's helpful to anyone!
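
For what it's worth, here is a rough sketch of how the fallback chain could live in one place instead of being repeated per field. This is plain Python rather than something tested inside the inline plugin, and the names resolve_sort_name and fetch_sort_name are made up for illustration. The remote lookup is passed in as a function, so the sketch makes no claim about how the request itself should be done.

```python
from typing import Callable, Optional


def resolve_sort_name(
    mbid: Optional[str],
    local_sort: Optional[str],
    local_name: str,
    fetch_sort_name: Callable[[str], str],
) -> str:
    """Return the remote sort-name, else the local sort-name, else the local name.

    fetch_sort_name is a hypothetical callable that maps an artist MBID to a
    sort-name and may raise on any failure.
    """
    if mbid:
        try:
            remote = fetch_sort_name(mbid)
            if remote:
                return remote
        except Exception:
            # Any lookup failure (network error, missing key) falls through
            # to the local values below.
            pass
    return local_sort or local_name
```

The same helper could then serve both the album artist (mb_albumartistid, albumartist_sort, albumartist) and the track artist (mb_artistid, artist_sort, artist).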

scruloose avatar Oct 25 '19 16:10 scruloose

Cool! If anyone's interested, a fun next step might be to package this code up as a plugin.

sampsyo avatar Oct 25 '19 20:10 sampsyo

That would be cool! I suspect I'd be in over my head with that myself, but I'll do some reading and see if it seems at all realistic. In the meantime, I have a warning for anyone using this snippet, and an idea for anyone who wants to turn it into a plugin.

Warning: this snippet as written simply hammers the MB server, making a request for every single track, which will cause failed requests as their flood-protection kicks in. As a crude rate-limiting workaround, add import time to the top of both chunks, and time.sleep(1) in the try blocks right after the query.

Idea: While I don't think I can do this in the context of an 'Inline' function, I think ideally we would also want to prevent unnecessary repetitive requests by defining a dictionary that lives for the duration of one whole beet import run, and then:

  • just before returning the name retrieved from MusicBrainz, add the ID — name pair to the dictionary
  • before querying MusicBrainz, check the ID against the local dictionary and retrieve the name from there instead if we've already got it.
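
The two steps above, combined with the crude rate limiting from the warning, might look roughly like this. It is only a sketch: the class name CachedLookup and its parameters are invented for illustration, the actual MusicBrainz request is left to whatever function is passed in as fetch, and the one-second interval is just the value from the workaround above.

```python
import time
from typing import Callable, Dict, Optional


class CachedLookup:
    """Remember ID -> name pairs and space out the requests that do happen."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        min_interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._fetch = fetch              # hypothetical function: MBID -> name
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._cache: Dict[str, str] = {}
        self._last_request: Optional[float] = None

    def get(self, mbid: str) -> str:
        # Step 2: check the local dictionary before querying.
        if mbid in self._cache:
            return self._cache[mbid]
        # Wait if the previous request started too recently.
        if self._last_request is not None:
            wait = self._min_interval - (self._clock() - self._last_request)
            if wait > 0:
                self._sleep(wait)
        self._last_request = self._clock()
        name = self._fetch(mbid)
        # Step 1: store the ID/name pair before returning it.
        self._cache[mbid] = name
        return name
```

One instance would have to live for the whole import run, which is the part that seems hard to do from an inline field and easier from a plugin.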

scruloose avatar Oct 26 '19 01:10 scruloose

@scruloose Thanks for the updates. I did not think about singletons as I have none currently. I also noticed the timeouts for the API and added a sleep there.

The API "abuse" is something I am not really happy with. I will probably look into it more closely this weekend or next.

Idea: I am not familiar with the architecture just yet. Either beets is already requesting the data somewhere and we can extend that functionality to include additional field information as a path parameter, OR we package it in a plugin and add all other fields from the request.

ESaller avatar Oct 28 '19 08:10 ESaller

@ESaller And thank you for the initial snippet. I would never have thought to query MB with the artist_id.

I am playing around with trying to make a plugin out of this... so far all I've made is a mess, but whether it turns into something usable or not, I'm learning lots. ☺

Other architecture questions:

  • Can a plugin send queries through beets' main network thread? If so, that would seem like a way more elegant solution if we do need to get stuff from the MB database. (It would prevent different components hitting MB at unpredictable times and maybe tripping the rate-limiting, and I assume the main network thread already has some sort of backoff-retry implementation.)
  • If we refer to the beets value artist_sort (or albumartist_sort) in a plugin, do we get the automatic fallback to artist (or albumartist), and vice-versa, or would the plugin have to implement that itself?

I don't have nearly enough knowledge to figure these out from reading the beets source code.

scruloose avatar Oct 28 '19 18:10 scruloose

I have a very rough prototype of a plugin that adds path variables for primary_artist, primary_artist_sort, primary_albumartist, and primary_albumartist_sort. I'm pretty sure it's terrible Python, but I got it to load and run on my beets install without burning the house down, which I thought was pretty exciting.

I switched from using the requests and json libraries to musicbrainzngs (built-in rate limiting!), added some logic to get the 'Artist name' alias that corresponds to the user's language pref, and added caching of IDs that have already been looked up. I'm sure @sampsyo is right that ultimately, the best solution would be for beets to store the artists in a collaboration as a list, if that can be figured out, but hopefully this is better than nothing for now.
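
To illustrate the alias logic, here is a minimal sketch of picking a name for a given language. It works on a plain dict and makes no network call. The dict shape is an assumption about what musicbrainzngs.get_artist_by_id(mbid, includes=["aliases"]) returns under its "artist" key (an "alias-list" of dicts with "alias", "locale", "type" and "primary" entries); the function name pick_artist_name and the sample data in the usage note are made up.

```python
from typing import Dict, List, Optional


def pick_artist_name(artist: Dict, language: str) -> Optional[str]:
    """Prefer a primary 'Artist name' alias in the given locale.

    Falls back to the artist's sort-name, then to its plain name.
    The key names used here are an assumption about the response shape.
    """
    aliases: List[Dict] = artist.get("alias-list", [])
    for alias in aliases:
        if (
            alias.get("locale") == language
            and alias.get("type") == "Artist name"
            and alias.get("primary") == "primary"
        ):
            return alias.get("alias")
    return artist.get("sort-name") or artist.get("name")
```

With a placeholder record such as {"name": "Example Artist", "sort-name": "Artist, Example", "alias-list": [{"alias": "Example Artist (EN)", "locale": "en", "type": "Artist name", "primary": "primary"}]}, asking for "en" would return the alias and any other language would return the sort-name.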

One thing I know needs to be fixed is to use proper logging, instead of all the print statements scattered through the thing right now. I still haven't figured out whether there's some sort of beets-mbngs layer I should be calling instead of calling musicbrainzngs directly.

If anyone wants to pick that up and make something worthwhile out of it, please do! I'm also open to comments and pointers. I took a look at @GuilhermeHideki's WIP but only have a vague idea how it works, so my apologies if I'm just repeating what's already been done. I did totally copy the regex for sanity-checking MB ids from there, though!

scruloose avatar Nov 04 '19 00:11 scruloose

I solved this with:

item_fields:
    first_artist: albumartist.split(', ',1)[0:1][0]

because, almost every time, the first artist in the albumartist field is the "primary" artist.
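
For anyone unsure what that expression does, here is the same split written as a standalone Python function (the function name first_artist is only for illustration; in the config it is an inline expression). The [0:1][0] part simply takes the first element, so split(", ", 1)[0] gives the same result.

```python
def first_artist(albumartist: str) -> str:
    # Split once on ", " and keep whatever comes before the first comma.
    # If there is no comma, the whole string is returned unchanged.
    return albumartist.split(", ", 1)[0]
```

A value like "Aiobahn, nayuta" becomes "Aiobahn", and a single name is left alone. The obvious limit is an artist whose own name contains a comma: that name would be cut at its first comma as well.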

rubiodamian avatar Apr 14 '20 22:04 rubiodamian

@rubiodamian: Cool! I didn't use string splitting because I was anticipating a nightmare of having to match 'and', '&', 'featuring', 'feat.', 'ft.', 'with', etc, and then deal with edge cases like "Belle and Sebastian", which should not be truncated to "Belle", and all that sort of mess. If just splitting on the comma works consistently, I assume that means some function somewhere along the way has already done all that hard string-mangling work, which obviously would simplify the problem a whole lot.

scruloose avatar Apr 15 '20 00:04 scruloose

@rubiodamian did you just put that in the config file?

dicolanl avatar Apr 15 '20 11:04 dicolanl

@scruloose I think this split works almost every time, because I have seen that the albumartist is always a list of names separated by commas, but I don't know if that is always the case. By the way, this config just adds the "first_artist" field to the beets flow; it does not modify the album_artist or artist field in the library or in the audio files. I think that once "first_artist" is part of beets, you may be able to use it to modify or edit metadata, but I didn't try that.

@dicolanl Yes, you need the inline plugin for that to work in the config file.

rubiodamian avatar Apr 15 '20 16:04 rubiodamian

Thanks @rubiodamian for the suggestion, I am using your solution myself. However, it does not address the issue highlighted by @tehniemer, which is: how to organise the folders then? Any suggestions?

alfureu avatar May 21 '20 16:05 alfureu

Can be used with paths, like this:

item_fields:
    first_artist: albumartist.split(', ',1)[0:1][0]
paths:
    default: %lower{$first_artist}/$album%aunique{}/$track $title   

rubiodamian avatar May 21 '20 17:05 rubiodamian

Yes, as mentioned, I do use it in this way, however, I am still getting folder like:

[Image: screenshot of the resulting folders]

alfureu avatar May 21 '20 17:05 alfureu

Maybe that is because the albumartist of your files is not separated by commas and is just a single string with all the names together.

rubiodamian avatar May 21 '20 17:05 rubiodamian

Well, indeed, there should be a better way of getting this done. I am wondering whether a pattern that matches special characters like &, /, etc. only when they sit between two spaces (e.g. _#_) should be included in the item_fields listed above. This would split Sting / The Police but not AC/DC.
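
A quick sketch of that pattern in plain Python, assuming only & and / count as separators (the exact set of characters is a guess, and the function name is made up):

```python
import re

# A separator only counts when it has whitespace on both sides.
SEPARATOR = re.compile(r"\s+[&/]\s+")


def first_artist_by_separator(albumartist: str) -> str:
    # Split once at the first spaced separator and keep the left part.
    return SEPARATOR.split(albumartist, maxsplit=1)[0]
```

This keeps "AC/DC" intact and reduces "Sting / The Police" to "Sting". It is not a complete answer, though: any band name that itself contains a spaced "&" would still be cut at that point.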

alfureu avatar May 21 '20 18:05 alfureu

BTW, I found an exception for @rubiodamian's rule: Earth, Wind & Fire

alfureu avatar May 22 '20 19:05 alfureu

I'm starting my beets configuration file and came across the ftintitle plugin, but I face similar issues. I use albumartist to group by main artist. If I use the artist field, my albums get split into multiple folders. I'm not familiar enough with Python to start messing around, but I like @ESaller's solution, if we can prevent MB API abuse and avoid beets crashing when the field is empty, the API doesn't answer, etc.

The prototype from @scruloose would be great (or a real plugin) if someone has time eventually. Right now I'll just leave it as is and put up with the mess :laughing:

Extarys avatar Jun 06 '20 02:06 Extarys

Well, I am trying to get around this issue myself, and I recognise that the topic is a bit more complicated.

Firstly, there is indeed the issue of

  • when the first albumartist is not separated by comma
  • when the albumartist (even the first artist) has a comma, &, or other 'unusual' character in their names (e.g. Sting / The Police, AC/DC, Earth, Wind & Fire)

Secondly, there is an issue of personal database first_artist order. This depends on the user's own preferences, e.g.

  • somebody is having a large database of Sting and prefers all his work in that subdirectory
  • while others collect more the works of The Police and would like to have the album categorised under that. This is something that stems from the albumartist field Sting / The Police

For the first issue, I wonder whether a database of all potential artists/albumartists should be downloaded from the MusicBrainz servers. It would be implemented as a first step of some beets plugin, and should be doable, as one is able to run a self-hosted MusicBrainz server locally. Such a local database would make it possible to look up albumartist values when deriving first_artist fields and eventually correct any issues (e.g. Earth, Wind & Fire).
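
As a rough illustration of how such a lookup could work, the sketch below takes the longest known artist name that the albumartist string starts with. The known names are passed in as a plain collection; how they would be loaded from a local MusicBrainz copy is left open, and the function name is made up.

```python
from typing import Iterable


def first_known_artist(albumartist: str, known_artists: Iterable[str]) -> str:
    """Return the longest known artist name that prefixes albumartist.

    If nothing matches, the input is returned unchanged.
    """
    best = ""
    for name in known_artists:
        if not name or not albumartist.startswith(name):
            continue
        rest = albumartist[len(name):]
        # Require a boundary so "Sting" does not match "Stingray".
        if rest and rest[0].isalnum():
            continue
        if len(name) > len(best):
            best = name
    return best or albumartist
```

With "Earth, Wind & Fire" among the known names, that string is kept whole, while "Sting / The Police" resolves to "Sting".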

For the second issue, when running beet import, the plugin should offer the user an interactive choice of where to categorise a first_artist that is being extracted from albumartist. Following the example of Sting / The Police, it should offer a choice like:

Sting / The Police - The Very Best of Sting & The Police
URL: https://musicbrainz.org/release-group/27eed66f-c7ca-399b-a418-f8eebd1e0efa
Please select under which folder to move the album:

  1. Sting
  2. The Police
  3. Compilations
  4. Use as is

beet@beet:/#_
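
The menu logic behind such a prompt could be sketched like this. It only builds the option list and resolves a numeric selection; hooking it into beet import is not shown. The function names are invented, and splitting on " / " is an assumption based on the example above.

```python
import re
from typing import List


def folder_choices(albumartist: str) -> List[str]:
    # One option per credited artist, then the two generic options.
    parts = [p.strip() for p in re.split(r"\s+/\s+", albumartist) if p.strip()]
    return parts + ["Compilations", "Use as is"]


def resolve_choice(albumartist: str, selection: int) -> str:
    """Map a 1-based menu selection to the folder name to use."""
    choices = folder_choices(albumartist)
    if not 1 <= selection <= len(choices):
        raise ValueError("selection out of range")
    picked = choices[selection - 1]
    # "Use as is" keeps the original albumartist string as the folder name.
    return albumartist if picked == "Use as is" else picked
```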

alfureu avatar Jun 06 '20 10:06 alfureu

When I had the same @DOFfactory questions, I created a workaround for myself this way:

  1. Create a plugin using the musicbrainzngs library, so I could:
    • Fetch the artists/albumartists as an array
    • Override the way that beets joins them (Solves the case for Earth, Wind & Fire and a lot of other cases on my library, like sertanejo duos)
  2. When I have a problem like the Sting / The Police example, I usually sidestep with 2 things:
    • Use the rewrite plugin to move the file where I want
    • Use some kind of "smart playlist" to join them automatically
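
To give an idea of step 1, here is a small sketch (not the plugin itself) that pulls the individual artist names out of an artist credit and rejoins them with a separator of your choice. It assumes the credit is a list in which artists appear as dicts with a nested "artist" entry and join phrases appear as plain strings, which is how I understand musicbrainzngs to return it; the function names are made up.

```python
from typing import Dict, List, Optional, Union

CreditPart = Union[str, Dict]


def credited_artists(artist_credit: List[CreditPart]) -> List[str]:
    names = []
    for part in artist_credit:
        if isinstance(part, str):
            # Join phrase such as " & " or " feat. "; not an artist.
            continue
        names.append(part["artist"]["name"])
    return names


def primary_artist(artist_credit: List[CreditPart]) -> Optional[str]:
    names = credited_artists(artist_credit)
    return names[0] if names else None


def join_artists(artist_credit: List[CreditPart], separator: str = "; ") -> str:
    # Rejoin with a separator that is unlikely to appear inside a name.
    return separator.join(credited_artists(artist_credit))
```

Because each artist stays a separate list element, a name like Earth, Wind & Fire is never split, no matter which characters it contains.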

GuilhermeHideki avatar Jun 06 '20 20:06 GuilhermeHideki

@GuilhermeHideki how do you create a beets plugin? Could you please elaborate?

alfureu avatar Jun 07 '20 07:06 alfureu

@DOFfactory Sorry for the delay.

I created https://gist.github.com/GuilhermeHideki/baeb63919fd51b23d168b700f374d875 and put in beetsplugs/artists.py (not the best way to install, but for me, it's ok). It depends on python 3.6

PS: I opened https://github.com/beetbox/beets/pull/3415 but haven't updated it because I wanted to fix possible problems in the code before committing and running the CI.

GuilhermeHideki avatar Jun 15 '20 01:06 GuilhermeHideki

I found this: https://github.com/rafaelparente/beets-lidarr-fields. I have not tested it yet, but according to the README it's a beets plugin written by @rafaelparente that can capture the first of the main artists if there are multiple.

enjoy

jee-r avatar Jul 28 '20 08:07 jee-r

@jee-r good find.

dicolanl avatar Jul 28 '20 13:07 dicolanl