redlist
Song matching with wrong artists
Issue
Currently, redlist can match a similarly titled song from the beets library even when it is by a completely different artist.
Example
Let's say we're trying to match the following tune from Spotify:
- Vampire, performed by Solence
If this tune/artist is missing from the beets library but a similar title exists, redlist will match the closest title available, for example:
- Reggae/Peter Tosh/1987 - No Nuclear War/04 Vampire.mp3
Solution?
We should probably add another option to the redlist config, alongside restrict_album:
- restrict_artist
I believe those two options could take a value between 0 and 1 rather than true/false. Some artists can be spelled differently, especially when there is a featured artist. With a value between 0 and 1, some fuzzy search could maybe be implemented, and the value would be the required degree of confidence.
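To make the 0-to-1 idea concrete, here is a rough sketch of how such a value might be applied. It is not redlist code: `normalize_artist`, `artist_similarity` and `artist_allowed` are hypothetical helpers, and Python's `difflib.SequenceMatcher` stands in for whatever fuzzy matcher would really be used.

```python
import re
from difflib import SequenceMatcher

# Hypothetical pattern: drops a trailing "feat. X", "ft. X" or "featuring X" tag.
_FEAT = re.compile(r"\s*[\(\[]?\s*\b(?:feat\.?|ft\.?|featuring)\s+.*$", re.IGNORECASE)


def normalize_artist(name: str) -> str:
    """Strip featuring tags and a leading 'The', then lowercase."""
    name = _FEAT.sub("", name).strip()
    name = re.sub(r"^the\s+", "", name, flags=re.IGNORECASE)
    return name.lower().strip()


def artist_similarity(a: str, b: str) -> float:
    """Similarity between 0 (nothing in common) and 1 (identical after normalizing)."""
    return SequenceMatcher(None, normalize_artist(a), normalize_artist(b)).ratio()


def artist_allowed(a: str, b: str, restrict_artist: float) -> bool:
    """Accept the pair only if similarity reaches the configured confidence."""
    return artist_similarity(a, b) >= restrict_artist


print(artist_allowed("Solence feat. Someone", "Solence", 0.8))
print(artist_allowed("Solence", "Peter Tosh", 0.8))
```

With a setting like 0.8, the featuring variant of Solence is accepted while Solence vs. Peter Tosh is rejected. Plain string similarity like this is only a starting point and will still miss plenty of real-world artist spellings.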
I've come across this issue before. Redlist uses beets' track distance logic to determine matches. In order to allow matches that come from different albums/sources (e.g. your Spotify playlist has "Tiny Dancer" from "Madman Across the Water" but you have the radio edit of "Tiny Dancer" from a "Best of Elton John" album), I had to tune the maximum accepted distance to something fairly liberal, using fairly few fields, to minimize unnecessary torrent downloading. I found 0.3 to be a happy medium, but beets uses its own weights to score how important certain differences are. In beets, track title and length are more strongly weighted than artist or album.
Altogether, that means that if you have a track with the same title and a very similar length, it can outweigh the fact that the artist and album are fairly different, pushing the difference score to just below 0.3. Unfortunately, there's a lot of nuance in artist matching that a simple fuzzy/Levenshtein distance won't cover (e.g. stripping "feat" tags, scoring differences further down the string lower, "the" being scored lower so that "The Doors" and "Doors" are closer than "Hat Doors", etc.).
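As a rough illustration of why that happens, here is a simplified weighted-average model of the score. It is a sketch, not beets' actual implementation: `weighted_distance` is a hypothetical helper, the `track_artist` weight of 2.0 is the default mentioned in this thread, and the `track_title` and `track_length` weights are assumed values.

```python
# Assumed weights: track_artist is the default quoted in this thread;
# track_title and track_length are assumptions for illustration only.
WEIGHTS = {"track_title": 3.0, "track_length": 2.0, "track_artist": 2.0}

MATCH_THRESHOLD = 0.3  # the maximum accepted distance discussed above


def weighted_distance(penalties: dict, weights: dict) -> float:
    """Weighted mean of per-field penalties, each between 0.0 and 1.0."""
    total_weight = sum(weights[field] for field in penalties)
    weighted_sum = sum(weights[field] * p for field, p in penalties.items())
    return weighted_sum / total_weight


# Same title, same length, completely different artist.
penalties = {"track_title": 0.0, "track_length": 0.0, "track_artist": 1.0}

score = weighted_distance(penalties, WEIGHTS)  # 2.0 / 7.0
print(round(score, 3), score < MATCH_THRESHOLD)
```

Under these assumptions a total artist mismatch only contributes 2 out of 7 weight units, so the score lands just under 0.3 and the wrong track is accepted.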
So there are really 2 ways to limit this:
1. Adjusting the match threshold
I've added a beets_match_threshold option to the config so that you can override the 0.3 value I've tuned it to. If you set it lower, to 0.2 or 0.25, matching will be stricter and you'll see fewer false matches, but you're also likely to miss more valid matches.
2. Upping the Artist weight in beets
You can also adjust how heavily beets weights the artist value. Adding this to your beets config.yaml file will raise it from the default of 2.0 to 3.0:
match:
    distance_weights:
        track_artist: 3.0
You can see all the default weights here for reference, but you should know that redlist only uses track_length, track_title, and track_artist to calculate matching distance, and only adds album if the --restrict-album option is set.
I'll leave this issue open in case you or anyone else has other ideas on how to limit the problem, since I agree it is annoying. I suppose I could add an additional penalty for artist mismatch; I'll have to think some more about it.
After thinking about it some more and running some tests, I think the best thing to do is simply to weight track_artist more heavily in the distance calculation. I've configured redlist to temporarily patch the weight to a higher number while it does its matching.
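For anyone curious what "temporarily patch" can look like in general, here is a minimal sketch of the pattern using a context manager over a plain dict. It is not the actual redlist change: `patched_weight` is a hypothetical helper, the dict stands in for wherever the weights really live, and 4.0 is an arbitrary example value rather than the weight redlist uses.

```python
from contextlib import contextmanager


@contextmanager
def patched_weight(weights: dict, key: str, value: float):
    """Temporarily override one weight and restore it afterwards."""
    original = weights[key]
    weights[key] = value
    try:
        yield weights
    finally:
        # Runs even if matching raises, so the user's setting is never lost.
        weights[key] = original


# Stand-in for the user's configured weights (track_artist default per this thread).
weights = {"track_artist": 2.0}

with patched_weight(weights, "track_artist", 4.0):
    inside = weights["track_artist"]  # matching would run here

after = weights["track_artist"]
print(inside, after)
```

Restoring the value in a `finally` block means the user's own config is left as it was once matching is done, even when an error interrupts the run.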
I think that since there are on average 1-2 missed matches on a 100 song playlist, the trade of an extra 2-3 is worth it to reduce some of the more blatant false positive matches.
If you try out the new build, let me know how the new weights do on your playlists; I'm still tuning them.
Sorry for the slow reply, was away for the last week! Cheers for looking into this. Your explanations make heaps more sense, and yeah, a fzf approach is probably not enough. I'm trying the code with your latest changes and it seems to be way better tuned. I have yet to come across a death metal tune in my chillout playlist :D! I'll keep on testing this, especially with some Spotify playlists that are always adding very recent tunes, since these will most likely not be in my beets library.