
File "/usr/lib/python2.7/dist-packages/mutagen/flac.py", line 597, in write

Open WinEunuuchs2Unix opened this issue 5 years ago • 11 comments

To correct this error I had to change line 597 (line 618 in this master I believe) from:

desc = self.desc.encode('UTF-8')

To:

try:                                # 2020-10-18 UnicodeDecodeError
    desc = self.desc.encode('UTF-8')
except UnicodeDecodeError:          # Filename: 06 Surf’s Up.oga
    desc = self.desc                # self.desc already in UTF-8

Note the song filename contained a right closing quote. Not the exaggerated right closing quote and not the conventional ASCII single quote.

WinEunuuchs2Unix avatar Oct 18 '20 14:10 WinEunuuchs2Unix

self.desc is always unicode, so I don't see how this can work.

(btw. Python 2 is no longer supported)

lazka avatar Oct 18 '20 14:10 lazka

I just started learning Python, so I cannot explain it that well. I think it is because it's encoding a UTF-8 character that is already encoded.

On Sun, Oct 18, 2020, 8:53 AM Christoph Reiter, [email protected] wrote:

self.desc is always unicode, so I don't see how this can work.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/quodlibet/mutagen/issues/499#issuecomment-711196123, or unsubscribe https://github.com/notifications/unsubscribe-auth/AICIBH55WVW4OVGXS3JTZA3SLL6QJANCNFSM4SVD5MZA .

WinEunuuchs2Unix avatar Oct 20 '20 12:10 WinEunuuchs2Unix

If you are new to using Python I really suggest you go with Python 3. There is really no good reason to start learning Python with version 2, you won't do yourself a favor.

Also to use the latest mutagen version you will need to use Python 3 as Python 2 is no longer supported.

phw avatar Oct 20 '20 12:10 phw

I just started learning Python so cannot explain it that well

No problem. Can you provide the file you are working with and some code that we can run to get to the error you see?

lazka avatar Oct 20 '20 14:10 lazka

Yes, most people would say the same thing, especially since 2.7.12 is EOL. In Ubuntu 16.04.6 you can set the environment to Python 3 and take the easy road. There is probably lots of legacy Python 2 code out there (just as there is still COBOL out there), so the harder route seemed like a good learning tool that could come in handy if I ever have to maintain legacy code.

That said, it doesn't help unravel why a song name containing a UTF-8 right single quote (the possessive apostrophe) crashes the UTF-8 encoding method when passed to it, while removing the double encoding allows it to succeed. The same code appears to be in the Python 3 version on master: flac.py.

Today's project is grabbing cover art from X11 clipboard for encoding onto songs (rather than just Musicbrainz's link to Cover Art Archive) but I'll try to revisit this bug later and provide more information.

Thanks for the tips though!

On Tue, Oct 20, 2020 at 6:52 AM Philipp Wolfer [email protected] wrote:

If you are new to using Python I really suggest you go with Python 3. There is really no good reason to start learning Python with version 2, you won't do yourself a favor.

Also to use the latest mutagen version you will need to use Python 3 as Python 2 is no longer supported.


WinEunuuchs2Unix avatar Oct 21 '20 00:10 WinEunuuchs2Unix

The file itself is from Jim Steinman's CD "Bad for Good", but you don't need to purchase the CD; you can just pass song #6:

Surf's Up.oga

Note the other mutagen functions have no problem opening the file name:

    if fmt == 'wav':
        encoding = 'wavenc'
    elif fmt == 'flac':
        encoding = 'flacenc'
        from mutagen.flac import FLAC as audiofile
    elif fmt == 'oga':
        quality = 'quality=' + str(float(self.qual_var.get()) / 100.0)
        encoding = 'vorbisenc {} ! oggmux'.format(quality)
        from mutagen.oggvorbis import OggVorbis as audiofile

Then:

    fmt = self.fmt_var.get()

    if fmt == 'flac':
        from mutagen.flac import FLAC as audiofile
    elif fmt == 'oga':
        from mutagen.oggvorbis import OggVorbis as audiofile
    else:
        print('Programmer ERROR: add_metadata_to_song() bad fmt=',fmt)
        return False

    try:
        # Maybe Kid3 has batch tagging functionality?
        audio = audiofile(self.os_full_name)
    except UnicodeDecodeError as err:
        print(err)
        print('add_metadata_to_song() ERROR mutagen.oggvorbis on file:')
        print(self.os_full_name)
        return False

    # Doesn't fix the error encoding with UTF-8
    # /home/rick/Music/Jim Steinman/Bad for Good/06 Surf’s Up.oga
    # 'ascii' codec can't decode byte 0xe2 in position 50: ordinal not
    # in range(128)

    #print('Tagging track {}'.format(self.os_song_name))
    audio['TRACKNUMBER'] = str(self.rip_current_track)
    # 'ARTIST' goes to 'ALBUMARTIST' in Kid3 and iTunes
    audio['ARTIST'] = self.selected_artist
    audio['ALBUMARTIST'] = self.selected_artist
    audio['ALBUM'] = self.selected_album
    audio['TITLE'] = self.song_name     # '99 -' and .ext stripped
    if self.selected_date is not None:
        audio['DATE'] = self.selected_date
    # What about Musicbrainz ID? It is auto added along with discid
    # Add comment "Encoded 2020-10-16 12:15, format: x, quality: y
    # Already has comment in 'file' command header

    #audio.save(v2_version=3)    # Version 4 causing problems?
    audio.save()                 # v2_version flag unknown

This all works perfectly! At first I thought it was suspect, but I just had too much code in the try: section.

Anyway this fails after I split the code separate from the working code:

    ''' FLAC and OGG have different methods. OGG has diff read/write
        too. See: https://mutagen.readthedocs.io/en/latest/user/vcomment.html
    '''
    import base64
    from mutagen.oggvorbis import OggVorbis
    from mutagen.flac import Picture

    try:
        file_ = OggVorbis(self.os_full_name)
    except UnicodeDecodeError as err:
        print(err)
        print('add_image_to_oga() ERROR mutagen.oggvorbis on file:')
        print(self.os_full_name)
        return False

    picture = Picture()

    picture.data = self.image_data
    picture.type = 17               # Hex 11 - A bright colorful fish
    picture.type = 3                # Hex 03 - Cover (Front)
    # picture.desc = u"mserve - add_image_to_oga()"
    picture.desc = self.song_name
    picture.mime = u"image/jpeg"
    picture.width = 100             # Seems to have no effect?
    picture.height = 100
    picture.depth = 24

    try:
        picture_data = picture.write()
    except UnicodeDecodeError as err:
        #   File "/usr/lib/python2.7/dist-packages/mutagen/flac.py",
        # line 597, in write:  desc = self.desc.encode('UTF-8')

        print(err)
        print('add_image_to_oga() ERROR mutagen.flac.Picture.write on file:')
        print(self.os_full_name)
        return False

    encoded_data = base64.b64encode(picture_data)
    vcomment_value = encoded_data.decode("ascii")

    file_["metadata_block_picture"] = [vcomment_value]
    file_.save()

The picture.write() call fails with the error, but not on ASCII filenames, only on the one song filename that contains a non-ASCII (UTF-8 encoded) character. I've ripped three CDs so far without a problem. In my own code I found that encoding UTF-8 twice led to errors, which is how I quickly fixed the problem in flac.py.

I'd be happy to reverse my fix to reproduce the error messages if you like.

Thanks for taking the time to look at this.

On Tue, Oct 20, 2020 at 8:53 AM Christoph Reiter [email protected] wrote:

I just started learning Python so cannot explain it that well

No problem. Can you provide the file you are working with and some code that we can run to get to the error you see?


WinEunuuchs2Unix avatar Oct 21 '20 01:10 WinEunuuchs2Unix

picture.desc = self.song_name

I don't know where it's coming from, but self.song_name needs to be unicode. In fact, every text you pass to mutagen needs to be unicode.
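One way to follow this advice is to normalise every value to text before handing it to mutagen. The sketch below is only an illustration: `to_text` is a hypothetical helper, not part of mutagen, and it assumes that any byte strings are UTF-8 encoded.

```python
def to_text(value, encoding="utf-8"):
    """Return `value` as text, decoding it first if it is a byte string.

    Hypothetical helper for illustration; not part of mutagen.
    Assumes byte strings are encoded with `encoding` (UTF-8 by default).
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# A song name that arrived as UTF-8 bytes, e.g. taken from a filename.
raw_name = b"Surf\xe2\x80\x99s Up"
desc = to_text(raw_name)

# A value that is already text passes through unchanged.
same = to_text("Surf\u2019s Up")
```

The result could then be assigned with something like `picture.desc = to_text(self.song_name)`. On Python 2, `bytes` is an alias for `str`, so the same check would turn a byte string into a `unicode` object.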

lazka avatar Oct 21 '20 06:10 lazka

It already was encoded UTF-8. If I try to double encode I get the same error flac.py gets:

    Traceback (most recent call last):
      File "/usr/lib/python2.7/lib-tk/Tkinter.py", line 1540, in __call__
        return self.func(*args)
      File "/usr/lib/python2.7/lib-tk/Tkinter.py", line 590, in callit
        func(*args)
      File "/home/rick/python/encoding.py", line 616, in cd_run_to_close
        self.rip_next_track()
      File "/home/rick/python/encoding.py", line 735, in rip_next_track
        self.add_image_to_oga()
      File "/home/rick/python/encoding.py", line 898, in add_image_to_oga
        picture.desc = self.song_name.encode('utf8') # Must be UTF-8
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 4: ordinal not in range(128)

Position 4 is where the Unicode apostrophe is (U+2019, if I remember correctly). So .encode('utf8') seems to have a problem encoding UTF-8 but no problem encoding ASCII.
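For readers wondering why an encode call reports a decode error: in Python 2, calling .encode() on a byte string first decodes it implicitly with the default ASCII codec and only then encodes the result. The sketch below reproduces that two-step behaviour explicitly in Python 3. The song name is taken from the filename in this thread, and the assumption is that `self.song_name` held its UTF-8 bytes.

```python
# Python 3 emulation of what Python 2 does for byte_string.encode('utf-8'):
# step 1 decodes the bytes as ASCII, step 2 encodes the text as UTF-8.
song_name = "Surf\u2019s Up"            # U+2019 sits at index 4
utf8_bytes = song_name.encode("utf-8")  # U+2019 becomes the bytes e2 80 99

try:
    utf8_bytes.decode("ascii").encode("utf-8")
    error = None
except UnicodeDecodeError as err:
    # Step 1 fails at the first non-ASCII byte, before any encoding happens.
    error = err

print(error)
```

The failing byte is 0xe2 at position 4, which matches the traceback above: the error comes from the hidden ASCII decode, not from the UTF-8 encode itself.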

On Wed, Oct 21, 2020 at 12:03 AM Christoph Reiter [email protected] wrote:

picture.desc = self.song_name

I don't know where it's coming from, but self.song_name needs to be unicode. In fact, every text you pass to mutagen needs to be unicode.


WinEunuuchs2Unix avatar Oct 21 '20 10:10 WinEunuuchs2Unix

A simpler demonstration of what I think is happening in flac.py:

    $ python
    Python 2.7.12 (default, Oct 5 2020, 13:56:01)
    [GCC 5.4.0 20160609] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> test=u"é"
    >>> test2=test.encode('utf-8')
    >>> test3=test2.encode('utf-8')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

Of course from Python 2 to Python 3 they changed string to byte format:

    $ python3
    Python 3.5.2 (default, Oct 7 2020, 17:19:02)
    [GCC 5.4.0 20160609] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> test=u"é"
    >>> test2=test.encode('utf-8')
    >>> test3=test2.encode('utf-8')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'bytes' object has no attribute 'encode'
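The Python 3 session above can be read as a consequence of the strict split between text and bytes: text (`str`) is encoded to `bytes`, and `bytes` can only be decoded back. A minimal sketch of that round trip, using the same character as the session:

```python
text = "\u00e9"                 # é as text (str)
encoded = text.encode("utf-8")  # bytes: b'\xc3\xa9'

# bytes objects have no encode() method in Python 3, so a second
# encode fails immediately instead of attempting a hidden ASCII decode.
has_encode = hasattr(encoded, "encode")

# The only way back to text is an explicit decode.
round_trip = encoded.decode("utf-8")
```

Here `has_encode` is `False` and `round_trip` equals the original text, so a value that has already been encoded needs to be decoded, not encoded again, before it is used as text.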

On Wed, Oct 21, 2020 at 12:03 AM Christoph Reiter [email protected] wrote:

picture.desc = self.song_name

I don't know where it's coming from, but self.song_name needs to be unicode. In fact, every text you pass to mutagen needs to be unicode.


WinEunuuchs2Unix avatar Oct 21 '20 11:10 WinEunuuchs2Unix

It already was encoded UTF-8

as I said, every string you pass to mutagen needs to be of type unicode.

lazka avatar Oct 21 '20 15:10 lazka

I just found this stack overflow Q&A that highlights the same problem of trying to encode as 'utf-8' twice. https://stackoverflow.com/questions/31393315/how-to-allow-encodeutf-8-twice-without-getting-error-in-python

On Wed, Oct 21, 2020, 9:37 AM Christoph Reiter, [email protected] wrote:

It already was encoded UTF-8

as I said, every string you pass to mutagen needs to be of type unicode.


WinEunuuchs2Unix avatar Oct 22 '20 21:10 WinEunuuchs2Unix

I am going to close this. As far as I can see the problems were twofold:

  • With Python 2, attempting to pass a non-Unicode string as the description ("" instead of u""). Support for Python 2 was officially dropped with the 1.44.0 release.
  • With Python 3, passing a bytes object instead of a string.

The correct way to handle this in Python 3 is to pass a string:

picture.desc = "Some description"

In Python 2 it would need to be a unicode string: u"Some description"

phw avatar Feb 15 '23 07:02 phw