Album match percentage
When importing with beet import --quiet downloads, this album got skipped. When I run it without --quiet, the only thing it complains about is ≠ tracks, id, year, but it doesn't show where exactly the problem is. How can I make these imports match automatically, without my interaction?
Problem
Running this command in verbose (-vv) mode:
beet -vv import downloads
no user configuration found at ...
data directory: ...
plugin paths: []
Loading plugins: artistcountry, badfiles, bandcamp, deezer, discogs, duplicates, fetchart, inline, lastgenre, loadext, lyrics, plexupdate, scrub, spotify
lastgenre: Loading whitelist ...
fetchart: lastfm: Disabling art source due to missing key
fetchart: google: Disabling art source due to missing key
inline: adding item field initial
Sending event: pluginload
library database: /mnt/the_path/beets.db
library directory: /mnt/the_path/Music Library
Sending event: library_opened
loadext: loading extension ...
Sending event: import_begin
Import of the directory:
/mnt/the_path/downloads
was interrupted. Resume (Y/n)?
Resuming interrupted import of /mnt/the_path/downloads
Sending event: import_task_created
Sending event: import_task_start
Looking up: /mnt/the_path/downloads/[2017] The Circle
Tagging Heretoir - The Circle
Searching for discovered album ID: ab7f7076-8b41-45a3-aaee-45e61638a954
discogs: Searching for release ab7f7076-8b41-45a3-aaee-45e61638a954
bandcamp: Not a bandcamp URL, skipping
Search terms: Heretoir - The Circle
Album might be VA: False
discogs: Getting master release 245053
Sending event: albuminfo_received
Candidate: The Perfect Circle - The Perfect Circle (7408302)
Computing track assignment...
...done.
Success. Distance: 0.67
discogs: Getting master release 120029
Sending event: albuminfo_received
Candidate: Nitty Gritty Dirt Band Featuring Maybelle Carter , Earl Scruggs , Doc Watson , Roy Acuff , Merle Travis , Jimmy Martin Also Featuring Vassar Clements , Junior Huskey , Norman Blake , Pete 'Oswald' Kirby - Will The Circle Be Unbroken (1600516)
Computing track assignment...
...done.
Success. Distance: 0.82
discogs: Getting master release 54377
Sending event: albuminfo_received
Candidate: Wipers - The Circle (2238474)
Computing track assignment...
...done.
Success. Distance: 0.67
Sending event: albuminfo_received
Candidate: Nitty Gritty Dirt Band - Will The Circle Be Unbroken (8398653)
Computing track assignment...
...done.
Success. Distance: 0.81
discogs: Getting master release 94275
Sending event: albuminfo_received
Candidate: Tomcraft - The Circle (39668)
Computing track assignment...
...done.
Success. Distance: 0.70
bandcamp: Searching releases of type 'a' for query 'Heretoir - The Circle' using '{'query': 'Heretoir - The Circle', 'artist': 'Heretoir', 'label': '', 'search_type': 'a'}'
Sending event: albuminfo_received
Candidate: Heretoir - The Circle (https://heretoir.bandcamp.com/album/the-circle)
Computing track assignment...
...done.
Success. Distance: 0.54
Sending event: albuminfo_received
Candidate: Heretoir - the circle (https://yehonalatapes.bandcamp.com/album/the-circle)
Computing track assignment...
...done.
Success. Distance: 0.54
Sending event: albuminfo_received
Candidate: Heretoir - the circle (https://yehonalatapes.bandcamp.com/album/the-circle#p1495220632)
Computing track assignment...
...done.
Success. Distance: 0.57
spotify: Searching Spotify for 'album:"The Circle" artist:"Heretoir"'
spotify: Found 1 result(s) from Spotify for 'album:"The Circle" artist:"Heretoir"'
Sending event: albuminfo_received
Candidate: Heretoir - The Circle (6HG0JqHWi5l2PFFMHBXaTN)
Computing track assignment...
...done.
Success. Distance: 0.40
deezer: Searching Deezer for 'album:"The Circle" artist:"Heretoir"'
deezer: Found 1 result(s) from Deezer for 'album:"The Circle" artist:"Heretoir"'
Sending event: albuminfo_received
Candidate: Heretoir - The Circle (15277785)
Computing track assignment...
...done.
Success. Distance: 0.40
Evaluating 10 candidates.
/mnt/the_path/downloads/[2017] The Circle (11 items)
Sending event: import_task_before_choice
Sending event: before_choose_candidate
Finding tags for album"Heretoir - The Circle".
Candidates:
1. (60.0%) Heretoir - The Circle
≠ tracks, id, year
Spotify, None, 2017, None, Northern Silence Productions, None, None
2. (60.0%) Heretoir - The Circle
≠ tracks, id, year
Deezer, None, 2017, None, Northern Silence Productions, None, None
3. (45.9%) Heretoir - The Circle
≠ tracks, id, source, ...
bandcamp, Digital Media, 2017, DE, Heretoir, , None
4. (45.9%) Heretoir - the circle
≠ tracks, id, source, ...
bandcamp, Digital Media, 2017, DE, yehonala tapes, , None
5. (43.2%) Heretoir - the circle
≠ tracks, id, source, ...
bandcamp, Cassette, 2017, DE, yehonala tapes, , None
6. (33.1%) Wipers - The Circle
≠ tracks, id, artist, ...
Discogs, Vinyl, 1988, US, Restless Records, 7 72339-1, None
7. (32.7%) The Perfect Circle - The Perfect Circle
≠ tracks, id, artist, ...
Discogs, Vinyl, 1977, US, Inner City Records (3), ICR 114272, None
8. (30.4%) Tomcraft - The Circle
≠ unmatched tracks, id, artist, ...
Discogs, 2xVinyl, 1997, Germany, Kosmo Records, KOS 013, None
9. (18.5%) Nitty Gritty Dirt Band - Will The Circle Be Unbroken
≠ missing tracks, tracks, id, ...
Discogs, 3xVinyl, 1972, US, United Artists Records, UAS-9801, None
10. (18.4%) Nitty Gritty Dirt Band Featuring Maybelle Carter , Earl Scruggs , Doc Watson , Roy Acuff , Merle Travis , Jimmy Martin Also Featuring Vassar Clements , Junior Huskey , Norman Blake , Pete 'Oswald' Kirby - Will The Circle Be Unbroken
≠ missing tracks, tracks, id, ...
Discogs, 3xVinyl, 1975, US, United Artists Records, UAS 9801, None
➜ # selection (default 1), Skip, Use as-is, as Tracks, Group albums,
Enter search, enter Id, aBort?
Led to this problem:
exiftool 01\ Alpha.mp3
ExifTool Version Number : 12.76
File Name : 01 Alpha.mp3
Directory : .
File Size : 4.3 MB
File Modification Date/Time : 2025:10:13 09:34:16-04:00
File Access Date/Time : 2025:10:13 09:42:27-04:00
File Inode Change Date/Time : 2025:10:13 09:34:16-04:00
File Permissions : -rw-r--r--
File Type : MP3
File Type Extension : mp3
MIME Type : audio/mpeg
MPEG Audio Version : 1
Audio Layer : 3
Audio Bitrate : 320 kbps
Sample Rate : 44100
Channel Mode : Joint Stereo
MS Stereo : On
Intensity Stereo : Off
Copyright Flag : False
Original Media : True
Emphasis : None
Encoder : LAME3.99r
Lame VBR Quality : 4
Lame Quality : 3
Lame Method : CBR
Lame Low Pass Filter : 20.5 kHz
Lame Bitrate : 255 kbps
Lame Stereo Mode : Joint Stereo
ID3 Size : 904912
Media : Digital Media
Album : The Circle
Artist : Heretoir
Encoder Settings : Lame3.99
Genre : Blackgaze/Post-Metal/Atmospheric Black Metal/DSBM
Title : Alpha
Track : 1/11
Year : 2017
Band : Heretoir
Original Release Year : 2017
User Defined Text : (replaygain_album_peak) 1.023807
Picture MIME Type : image/jpeg
Picture Type : Front Cover
Picture Description :
Picture : (Binary data 902236 bytes, use -b option to extract)
Comment :
Date/Time Original : 2017
Duration : 0:01:24 (approx)
id3v2 -l 01\ Alpha.mp3
id3v1 tag info for 01 Alpha.mp3:
Title : Alpha Artist: Heretoir
Album : The Circle Year: 2017, Genre: Other (12)
Comment: Track: 0
id3v2 tag info for 01 Alpha.mp3:
UFID (Unique file identifier): http://musicbrainz.org, 36 bytes
TXXX (User defined text information): (MusicBrainz Album Release Country): XW
TXXX (User defined text information): (MusicBrainz Album Status): official
TXXX (User defined text information): (MusicBrainz Album Type): album
TXXX (User defined text information): (MusicBrainz Album Id): ab7f7076-8b41-45a3-aaee-45e61638a954
TXXX (User defined text information): (MusicBrainz Artist Id): bdc04329-8622-41e1-9aef-7e398905466a
TXXX (User defined text information): (MusicBrainz Album Artist Id): bdc04329-8622-41e1-9aef-7e398905466a
TXXX (User defined text information): (MusicBrainz Release Group Id): 2f7dc164-facf-4536-af92-0352c257e0bc
TMED (Media type): Digital Media
TXXX (User defined text information): (MusicBrainz Release Track Id): 3ee02690-520f-40a8-adf5-fb659beba499
TALB (Album/Movie/Show title): The Circle
TPE1 (Lead performer(s)/Soloist(s)): Heretoir
TSSE (Software/Hardware and settings used for encoding): Lame3.99
TCON (Content type): Blackgaze (255)
TIT2 (Title/songname/content description): Alpha
TRCK (Track number/Position in set): 1/11
TYER (Year): 2017
TPE2 (Band/orchestra/accompaniment): Heretoir
TORY (Original release year): 2017
TXXX (User defined text information): (replaygain_track_gain): -3.99 dB
TXXX (User defined text information): (replaygain_track_peak): 0.977073
TXXX (User defined text information): (replaygain_album_gain): -8.94 dB
TXXX (User defined text information): (replaygain_album_peak): 1.023807
APIC (Attached picture): ()[, 3]: image/jpeg, 902236 bytes
Here's a link to the music files that trigger the bug (if relevant):
Setup
- OS: Ubuntu
- Python version: 3.12.3
- beets version: 2.4.0
- Turning off plugins made problem go away (yes/no):
My configuration (output of beet config) is:
beet config
disabled_plugins: []
lyrics:
auto: yes
translate:
api_key: REDACTED
from_languages: []
to_language:
dist_thresh: 0.11
google_API_key: REDACTED
google_engine_ID: REDACTED
genius_api_key: REDACTED
fallback:
force: no
local: no
print: no
synced: no
sources:
- lrclib
- google
- genius
- tekstowo
lastgenre:
whitelist: yes
min_weight: 10
count: 1
fallback:
canonical: no
source: album
force: no
keep_existing: no
auto: yes
separator: ', '
prefer_specific: no
title_case: yes
extended_debug: no
fetchart:
auto: yes
minwidth: 0
maxwidth: 0
quality: 0
max_filesize: 0
enforce_ratio: no
cautious: no
cover_names:
- cover
- front
- art
- album
- folder
sources:
- filesystem
- coverart
- itunes
- amazon
- albumart
- cover_art_url
store_source: no
high_resolution: no
deinterlace: no
cover_format:
google_key: REDACTED
google_engine: REDACTED
fanarttv_key: REDACTED
lastfm_key: REDACTED
discogs:
search_limit: 5
source_weight: 0.5
apikey: REDACTED
apisecret: REDACTED
tokenfile: discogs_token.json
user_token: REDACTED
separator: ', '
index_tracks: no
append_style_genre: no
bandcamp:
include_digital_only_tracks: yes
search_max: 2
art: no
exclude_extra_fields: []
genre:
capitalize: no
maximum: 0
mode: progressive
always_include: []
comments_separator: '
---
'
truncate_comments: no
spotify:
search_limit: 5
source_weight: 0.5
search_query_ascii: no
mode: list
tiebreak: popularity
show_failures: no
artist_field: albumartist
album_field: album
track_field: title
region_filter:
regex: []
client_id: REDACTED
client_secret: REDACTED
tokenfile: spotify_token.json
deezer:
search_limit: 5
source_weight: 0.5
search_query_ascii: no
duplicates:
album: no
checksum: ''
copy: ''
count: no
delete: no
format: ''
full: no
keys: []
merge: no
move: ''
path: no
tiebreak: {}
strict: no
tag: ''
remove: no
scrub:
auto: yes
plex:
host: localhost
port: 32400
token: REDACTED
library_name: Music
secure: no
ignore_cert_errors: no
pathfields: {}
item_fields: {}
album_fields: {}
Could you give the current 2.5.0 version a try?
I can confirm 2.5.0 still won't match albums even when I just tagged them using Picard with a perfect match. Beets computes a match score around 93% and won't autotag.
I don't have logs at hand, but here is my config. Nothing about matching has been changed, I think:
directory: /srv/sync/Audio/Musique
library: /srv/sync/Audio/musiclibrary.db
plugins: musicbrainz replaygain chroma lyrics fetchart embedart thumbnails missing spotify permissions web lastgenre scrub info discogs convert autobpm
incremental: yes
resume: ask
import:
    languages: fr en
    duplicate_verbose_prompt: yes
replaygain:
    backend: ffmpeg
    auto: no
chroma:
    auto: no
lyrics:
    auto: no
    synced: yes
    sources: [lrclib]
web:
    host: 0.0.0.0
permissions:
    file: 664
    dir: 775
lastgenre:
    force: yes
    keep_existing: no
    source: track
    fallback: ''
paths:
    default: $albumartist/$original_year - $album%aunique{albumartist album original_year albumtype,label catalognum albumdisambig releasegroupdisambig}/$track $title
    singleton: Non-Album/$artist/$title
    comp: Compilations/$original_year - $album%aunique{album original_year,label catalognum albumdisambig releasegroupdisambig}/$track $artist - $title
    ^albumtype:album: $albumartist/$original_year - $album%aunique{albumartist album original_year albumtype,label catalognum albumdisambig releasegroupdisambig} [$albumtype]/$track $title
convert:
    command: ffmpeg -i $source -y -vn -acodec libopus -ab 320k $dest
    extension: opus
autobpm:
    auto: no
Can you try to disable the penalization?
https://beets.readthedocs.io/en/stable/plugins/index.html#using-metadata-source-plugins
Adding
match:
    distance_weights:
        data_source: 0.0 # Disable data source matching
to the config indeed makes the match 100% in the case I tested (an album already matched against MusicBrainz and tagged by beets in my library). Shouldn't that be the default, though?
I agree. The whole data_source_mismatch_penalty is confusing for users.
@snejus Do you think we can change the default to 0?
I don't think it's really the issue, as it's seemingly triggering the penalty even with albums that were previously matched by beets or Picard, so they should match the data source at this point? Or am I understanding this wrong?
As far as I understand, it triggers under the following conditions (which is the issue here, imo):
- The penalty is applied if
  - you have two metadata plugins enabled during any import (it uses the candidate's source)
  - you reimport/update with a different source
- It is not applied if
  - only one metadata plugin is enabled during import
Furthermore, it is defined as an additive distance in the calculation, so you will never see 100% matches if the penalty triggers (which is a strange choice imo, but might be useful for some).
The data source which is used for this penalty is only stored in the beets database and not in the file metadata. Beets does not know the metadata came from a specific source during initial imports.
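To illustrate why an additive penalty rules out a 100% match, here is a minimal sketch. It assumes the overall distance is a weighted average of per-field penalties; this is a simplified model, not beets' actual code, and the function name, field names, weights and penalty values are all made up for illustration.

```python
def weighted_distance(penalties, weights):
    """Combine per-field penalties (each in [0, 1]) into one distance in [0, 1]."""
    raw = sum(weights[field] * penalty for field, penalty in penalties.items())
    maximum = sum(weights[field] for field in penalties)
    return raw / maximum if maximum else 0.0


# Hypothetical values: every field matches perfectly except the data source.
penalties = {"artist": 0.0, "album": 0.0, "tracks": 0.0, "data_source": 1.0}
default_weights = {"artist": 3.0, "album": 3.0, "tracks": 2.0, "data_source": 0.5}
# Same weights, but with the data source weight set to 0.0 as in the config snippet.
disabled_weights = {**default_weights, "data_source": 0.0}

with_penalty = weighted_distance(penalties, default_weights)
without_penalty = weighted_distance(penalties, disabled_weights)

print(f"penalty on:  {(1 - with_penalty) * 100:.1f}% similarity")
print(f"penalty off: {(1 - without_penalty) * 100:.1f}% similarity")
```

In this toy model, any non-zero data source weight keeps the similarity below 100% even when every tag matches, which is the behaviour described above.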
Here's the example I tried, without disabling the data_source mismatch penalty:
$ beet info -a album:="Jam Session" -i data_source,path
/srv/sync/Audio/Musique/La Jam/1999 - Jam Session [ep]
data_source: MusicBrainz
$ beet import /srv/sync/Audio/Musique/La\ Jam/1999\ -\ Jam\ Session\ \[ep\]
/srv/sync/Audio/Musique/La Jam/1999 - Jam Session [ep] (6 items)
Match (93.5%):
La Jam - Jam Session
≠ data source, tracks
MusicBrainz, CD, 1999, FR, None, None, None
https://musicbrainz.org/release/38cad1d3-accc-4347-9ce2-661841d87f1c
* Artist: La Jam
* Album: Jam Session
➜ [A]pply, More candidates, Skip, Use as-is, as Tracks, Group albums,
Enter search, enter Id, aBort? B
The album is already in beets' database and data_source is already MusicBrainz, so shouldn't it match even with the penalty enabled in that case?
For other cases, if data_source is considered different when it doesn't exist (as when importing new items, since this property doesn't exist outside beets), this literally kills the autotagger feature.
data_source is already MusicBrainz
What do you mean? I'm pretty sure data_source is not stored in the files. Import only considers metadata from the files.
Yeah, I mistakenly thought beets was detecting items already in the library, since it does some magic about that according to https://beets.readthedocs.io/en/stable/reference/cli.html#reimporting
Doing beet import -L album:="Jam Session" indeed matched the album correctly, since it's importing from the database and not from the files.
Thanks for the insights, I don't know if this will be considered a bug as such, but at least I got a solution for my everyday use.
Yeah, that's what I meant about the reimport: if you use -L it should work.
Still quite confusing if you ask me, since you can't get perfect matches in the default setting, i.e. if you enable more than one metadata source. This is also what happened in this issue.
My guess is that this penalty should be discarded entirely when importing new items (it makes no sense to mismatch against nothing) and should trigger only on re-import, when data_source actually mismatches.
I'm using 2.5.1 and still getting these matches.
Is there a config option for the case sensitivity penalty, and maybe one for a list of symbols? I would like to give them a lower weight.
@talski I think you could get above the autotag threshold by disabling the data source mismatch penalty too, using the config snippet in https://github.com/beetbox/beets/issues/6096#issuecomment-3405913337
using it right now, it has improved a lot, thank you
This option primarily exists to provide users with an ability to 'prefer' certain data sources over others. The default 0.5 value has the following effect:
- New imports: all data sources are slightly penalised because the original data source is unknown.
- Reimports: it expresses a bias towards the current/original data source, penalising others.
Setting the default to 0.0 means the data source is ignored altogether:
- New imports: 100% matches are possible
- Reimports: no bias towards the current data source, so matches from other sources are not penalised
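The two cases above can be sketched as a small function. This is only a model of the behaviour described in this comment, not beets' implementation; the function name and the `mismatch_penalty` parameter are hypothetical.

```python
from typing import Optional


def data_source_penalty(
    original: Optional[str], candidate: str, mismatch_penalty: float = 0.5
) -> float:
    """Return the penalty for a candidate, given the item's original data source.

    `original` is None for new imports, because the source is only stored in
    the beets database and is therefore unknown for files seen the first time.
    """
    if original is not None and original == candidate:
        # Reimport from the same source: no penalty.
        return 0.0
    # New import (unknown source) or reimport from a different source.
    return mismatch_penalty


# New import: every candidate is penalised, whatever its source.
print(data_source_penalty(None, "MusicBrainz"))
# Reimport: only candidates from other sources are penalised.
print(data_source_penalty("MusicBrainz", "MusicBrainz"))
print(data_source_penalty("MusicBrainz", "Discogs"))
# With the value set to 0.0 the data source is ignored altogether.
print(data_source_penalty(None, "Spotify", mismatch_penalty=0.0))
```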
I considered hardcoding this configuration to 0.0 in the code for new imports only. However, this breaks configurations for those users that do have preferences configured - those who expect to see candidates from their favourite data source at the top of the list for new music imports.
The ideal solution might be a "preferred data source" setting which would take a list of data sources in order of preference, like the other "preferred" options (https://beets.readthedocs.io/en/stable/reference/config.html#preferred).
Then, on import:
- If this setting is not set, no penalty is applied
- If it's set, apply no penalty for the first source, a 1% penalty for the second source, an n+1% penalty for source n, etc. All enabled sources not in the "preferred" list would get the same [number of items in the preferred list]+1% penalty.
A boolean "prefer existing source" setting could be added specifically to apply a fixed penalty on re-imports to every source except the existing one. EDIT: Or it could just put the current data source at the top of the list (no penalty), and work like first imports for the other sources.
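A rough sketch of how such a setting could behave, under one reading of the proposal: the first listed source costs nothing, each later position costs one more percent, and unlisted sources cost [number of items in the list]+1 percent. No such option exists in beets as far as this thread shows; the function name, its parameters and the exact percentages are hypothetical.

```python
from typing import Optional, Sequence


def preferred_source_penalty(
    source: str, preferred: Sequence[str], current: Optional[str] = None
) -> float:
    """Return a penalty (as a fraction, 0.01 == 1%) based on source preference."""
    if not preferred:
        # Setting not configured: no penalty at all.
        return 0.0
    order = list(preferred)
    if current is not None:
        # "Prefer existing source": on reimport, move the current source to the top.
        order = [current] + [s for s in order if s != current]
    if source in order:
        # 0% for the first entry, 1% for the second, and so on.
        return order.index(source) / 100
    # Sources missing from the list all share the same, largest penalty.
    return (len(order) + 1) / 100


prefs = ["MusicBrainz", "Discogs", "Spotify"]
for name in ["MusicBrainz", "Discogs", "Deezer"]:
    print(name, preferred_source_penalty(name, prefs))
```

Passing `current="Discogs"` models the EDIT variant, where the existing source is treated as the most preferred one on reimport.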
That's not playing nice with existing configs either, but it would seem much clearer to me this way. I clearly prefer MusicBrainz as source whenever possible, but I never used those settings because they look confusing to me. There could be a "deprecated config options" alert on each beet invocation which would refer to the relevant doc for users with the current config options.