Lyrics appear to contain a bit of garbage data?

Open Gazoo101 opened this issue 2 years ago • 4 comments

Returned lyrics contain some garbage data, I'd assume due to a change in the formatting of the https://genius.com webpage?

All lyrics (or at least the 7 I tested) appear to lead with the following:

"<song name> Lyrics", e.g. in the case of FreeBird, it'd be "FreeBird Lyrics"

and end with "<number>Embed" or "Embed".

I'd say these pieces aren't supposed to be part of the lyrics, yes?

Version info

  • Package version [3.0.1]
  • OS: ubuntu 18.04

Gazoo101 avatar Apr 23 '22 03:04 Gazoo101

I've found the cause: in the lyrics() method in genius.py, the div that is searched for is the one with class_=re.compile("^lyrics$|Lyrics__Root"). However, this also returns the number of "Pyongs" and the Embed button from the Lyrics_Footer div, whose text content is included in Lyrics__Root.

EDIT: I've seen this solved in https://github.com/johnwmillr/LyricsGenius/pull/215#issuecomment-1083670536
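
The cause described above can be reproduced in miniature with only the standard library. This is a rough sketch, not the library's code or the PR's fix: the markup is invented and heavily simplified, and the "Lyrics__Container" class name is an assumption used purely for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page is far larger.
# "Lyrics__Container" is an assumed class name for the lyric text itself.
PAGE = """
<div class="Lyrics__Root">
  <h2>FreeBird Lyrics</h2>
  <div class="Lyrics__Container">If I leave here tomorrow</div>
  <div class="Lyrics_Footer"><span>12</span><button>Embed</button></div>
</div>
"""

pattern = re.compile("^lyrics$|Lyrics__Root")
root = ET.fromstring(PAGE)

# The outer div matches the pattern, so the text of every nested element
# (header, lyric lines and footer) is collected together.
assert pattern.search(root.get("class"))
all_text = " ".join(t.strip() for t in root.itertext() if t.strip())

# Restricting the search to the inner lyric container leaves the header
# and footer out.
lyrics_only = " ".join(
    "".join(div.itertext()).strip()
    for div in root.iter("div")
    if "Lyrics__Container" in div.get("class", "")
)

print(all_text)
print(lyrics_only)
```

Taking the text of the outer element pulls in the header and footer, while selecting a narrower element does not.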

Acervans avatar Sep 11 '22 14:09 Acervans

Hi. Your link seems to point nowhere. Do you have the fix for this bug? I am a bit too new to Python to start fiddling with the code myself.

roaldandresen avatar Jan 18 '23 21:01 roaldandresen

The PR is available at https://github.com/johnwmillr/LyricsGenius/pull/215. Until that PR is merged and the library updated, you could fork the repository and merge this PR into your own fork.
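
If forking is not an option, the stray header and footer could also be stripped after the lyrics are fetched. The sketch below is only a stopgap and is not taken from the PR: `clean_lyrics` is a hypothetical helper, and it assumes the text starts with "<song name> Lyrics" and ends with "<number>Embed" or "Embed", as reported in this issue.

```python
import re


def clean_lyrics(text: str, title: str) -> str:
    """Remove the '<title> Lyrics' header and '<number>Embed' footer.

    Hypothetical helper: it relies on the artifact format reported in this
    issue and does nothing useful if the page layout changes again.
    """
    # Leading "<title> Lyrics", with or without a line break after it.
    header = re.compile(r"\A\s*" + re.escape(title) + r"\s+Lyrics\s*")
    text = header.sub("", text, count=1)
    # Trailing "Embed", optionally preceded by a number.
    text = re.sub(r"\d*Embed\s*\Z", "", text)
    return text.strip()


raw = "FreeBird Lyrics\nIf I leave here tomorrow\nWould you still remember me?12Embed"
print(clean_lyrics(raw, "FreeBird"))
```

One caveat: if the last lyric line itself ends in digits, those digits are removed along with the footer number, so this is best treated as a temporary measure.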

allerter avatar Jan 19 '23 15:01 allerter

Thank you!

roaldandresen avatar Jan 19 '23 16:01 roaldandresen