
lyrics: tekstowo backend does not check if the found song actually matches

Open valpackett opened this issue 3 years ago • 1 comments

Problem

Running this command in verbose (-vv) mode:

$ beet -vv lyrics -p 'kelly bailey'

Led to this problem:

lyrics: failed to fetch: https://www.musixmatch.com/lyrics/Kelly-Bailey/Black-Mesa-Inbound (404)
lyrics: Genius failed to find a matching artist for 'Kelly Bailey'
lyrics: got lyrics from backend: Tekstowo
lyrics: fetched lyrics: Kelly Bailey - Half-Life 2 - Black Mesa Inbound
Sending event: write
Sending event: after_write
Sending event: database_change
Black eyed Susan
Sun shines in your veins
If the clouds are moving
Never hear her complain
Yeah, black eyed Susan
Just waiting on a drop of rain
…

…but this track has no lyrics. It's an instrumental.

The backend seems to blindly trust that the first row of the search results is the expected song:

https://github.com/beetbox/beets/blob/7c670711aeb2e2b3c9e1c02845c4ae96deaea82b/beetsplug/lyrics.py#L486-L494

But the site can absolutely return a result whose name is only vaguely similar!


Setup

  • OS: FreeBSD -CURRENT
  • Python version: 3.9.4
  • beets version: current git
  • Turning off plugins made problem go away (yes/no): no

valpackett avatar Jul 08 '22 22:07 valpackett

Ack, that's pretty bad! We should absolutely confirm that we have a match… these backends aren't really supposed to be "fuzzy" in this way.

sampsyo avatar Jul 09 '22 18:07 sampsyo

I have been looking into this issue and plan on contributing (first time). Would we want to only return lyrics on an exact match of the title and artist between the top search result and the search query? Or would we prefer a type of similarity score (like the similarity ratio used in the code for finding lyrics through Google search) rather than an exact match?

luharder avatar Nov 02 '22 07:11 luharder

Thank you for your interest! Yeah, I think it should be a bit fuzzy; using difflib.SequenceMatcher, like the Google backend does, seems fine.
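A minimal sketch of what such a fuzzy check could look like, using only the standard library. This is not the actual beets code: the function names, the argument layout, and the 0.8 threshold are all assumptions made for illustration, and the threshold would need tuning against real search results.

```python
from difflib import SequenceMatcher

# Hypothetical cutoff, not a value taken from beets.
SIMILARITY_THRESHOLD = 0.8


def similarity(a: str, b: str) -> float:
    """Return a case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def is_plausible_match(query_artist, query_title,
                       result_artist, result_title,
                       threshold=SIMILARITY_THRESHOLD):
    """Accept a search result only if both artist and title are close enough.

    Hypothetical helper: a backend would call this on the top search
    result before fetching and returning its lyrics.
    """
    return (similarity(query_artist, result_artist) >= threshold
            and similarity(query_title, result_title) >= threshold)
```

With a check like this, a top result titled "Black Eyed Susan" by a different artist would be rejected for the query "Kelly Bailey - Black Mesa Inbound", and the backend could report that it found no lyrics instead of returning the wrong ones.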

valpackett avatar Nov 02 '22 13:11 valpackett

We also have an existing utility, string_dist, for comparing the similarity of two strings: https://github.com/beetbox/beets/blob/e201dd4fe57b0aa2e80890dc3939b0a803e3448d/beets/autotag/hooks.py#L249

sampsyo avatar Nov 03 '22 14:11 sampsyo