[FEATURE] set MD5 checksum of Qobuz sourced FLAC files post-rip

Open • 999wqe9q9ewq9 opened this issue 1 year ago • 1 comment

Is the feature request related to a problem? Please describe it.

Qobuz sourced FLAC files do not have their MD5 signature set in STREAMINFO. This is not a fault of streamrip but of Qobuz.

Testing FLAC files sourced from Qobuz with flac -t reports "WARNING, cannot check MD5 signature since it was unset in the STREAMINFO" for each file.

This does not affect playback. The files are only missing the MD5, which can be obtained by decoding the file to raw audio data (or by re-encoding it).
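
For reference, the unset state can be detected without decoding anything. In the FLAC format the STREAMINFO block is the first metadata block, is 34 bytes long, and ends with the 16-byte MD5 field, which is all zeros when unset. The following is a minimal sketch using only the Python standard library. The function names are illustrative, and it assumes the stream starts directly with the fLaC marker (no prepended ID3v2 tag).

```python
from typing import BinaryIO

STREAMINFO_TYPE = 0   # metadata block type 0 is STREAMINFO
STREAMINFO_LEN = 34   # STREAMINFO body is always 34 bytes
MD5_LEN = 16          # the MD5 field is the last 16 bytes of that body


def read_streaminfo_md5(stream: BinaryIO) -> bytes:
    """Return the raw 16-byte MD5 field from a FLAC stream's STREAMINFO."""
    if stream.read(4) != b"fLaC":
        raise ValueError("not a FLAC stream (missing fLaC marker)")
    header = stream.read(4)
    if len(header) != 4:
        raise ValueError("truncated metadata block header")
    block_type = header[0] & 0x7F          # top bit is the last-block flag
    length = int.from_bytes(header[1:4], "big")
    if block_type != STREAMINFO_TYPE or length != STREAMINFO_LEN:
        raise ValueError("first metadata block is not STREAMINFO")
    body = stream.read(STREAMINFO_LEN)
    if len(body) != STREAMINFO_LEN:
        raise ValueError("truncated STREAMINFO block")
    return body[-MD5_LEN:]


def md5_is_unset(stream: BinaryIO) -> bool:
    """True when the MD5 field is all zeros, the case flac -t warns about."""
    return read_streaminfo_md5(stream) == bytes(MD5_LEN)


# Usage (hypothetical file name):
# with open("track.flac", "rb") as f:
#     print(md5_is_unset(f))
```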

Describe the solution you would like.

After a Qobuz rip finishes, streamrip would either re-encode each FLAC file with a command such as flac -f8 *.flac, which sets the MD5 checksum as a side effect, or only decode each file, calculate the MD5 signature from the raw audio, and set it.
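
The re-encode option could be sketched roughly as follows. This is not streamrip code: the helper names are hypothetical, it assumes a flac executable is on PATH, and it simply runs the same flac -f -8 invocation once per file under a directory.

```python
import shutil
import subprocess
from pathlib import Path


def reencode_command(flac_prog: str, path: Path) -> list[str]:
    """Build the flac invocation that re-encodes one file in place."""
    # -f overwrites the existing file, -8 selects compression level 8,
    # -s suppresses flac's progress output.
    return [flac_prog, "-f", "-8", "-s", str(path)]


def reencode_tree(root: Path) -> int:
    """Re-encode every .flac file under root; return how many were processed."""
    flac_prog = shutil.which("flac")
    if flac_prog is None:
        raise FileNotFoundError("flac executable not found on PATH")
    count = 0
    for path in sorted(root.rglob("*.flac")):
        # check=True raises CalledProcessError if flac exits with an error.
        subprocess.run(reencode_command(flac_prog, path), check=True)
        count += 1
    return count
```

Re-encoding rewrites the audio frames as well as the header, which is why the decode-only route can be cheaper.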

An existing Python script, which requires flac and mutagen, can serve as a reference for decoding the raw audio and setting the MD5 without re-encoding. This is slightly faster than re-encoding the files in parallel, but either approach works as long as the MD5 checksum is set once a Qobuz rip finishes.

import sys
import os
import logging
import subprocess as sp
import argparse
from multiprocessing import pool
from hashlib import md5
from mutagen import flac

# edit this:
FLAC_PROG = "P:\\Path\\to\\flac.exe"  # every backslash must be escaped ("\t" would be a tab)
# --------------------

logger = logging.getLogger(__name__)
CHUNK_SIZE = 512 * 1024


def scantree(path: str, recursive=False):
    for entry in os.scandir(path):
        if entry.is_dir():
            if recursive:
                yield from scantree(entry.path, recursive)
        else:
            yield entry


def get_flac(path: str):
    try:
        return flac.FLAC(path)
    except flac.FLACNoHeaderError:  # file is not flac
        return
    except flac.error as e:  # file < 4 bytes
        if str(e).startswith('file said 4 bytes'):
            return
        else:
            raise e


def get_flacs_no_md5(path: str, recursive=False):
    for entry in scantree(path, recursive):
        flac_thing = get_flac(entry.path)
        if flac_thing is not None and flac_thing.info.md5_signature == 0:
            yield flac_thing


def get_md5(flac_path: str) -> str:
    md_five = md5()
    with sp.Popen(
            [FLAC_PROG, '-ds', '--stdout', '--force-raw-format', '--endian=little', '--sign=signed', flac_path],
            stdout=sp.PIPE,
            stderr=sp.DEVNULL) as decoding:
        for chunk in iter(lambda: decoding.stdout.read(CHUNK_SIZE), b''):
            md_five.update(chunk)

    return md_five.hexdigest()


def set_md5(flac_thing: flac.FLAC):
    md5_hex = get_md5(flac_thing.filename)
    flac_thing.info.md5_signature = int(md5_hex, 16)
    flac_thing.tags.vendor = 'MD5 added'
    flac_thing.save()
    return flac_thing


def main(path: str, recursive=False, check_only=False):
    found = False
    if check_only:
        for flac_thing in get_flacs_no_md5(path, recursive=recursive):
            logger.info(flac_thing.filename)
            found = True
    else:
        with pool.ThreadPool() as tpool:
            for flac_thing in tpool.imap(set_md5, get_flacs_no_md5(path, recursive=recursive)):
                logger.info(f'MD5 added: {flac_thing.filename}')
                found = True
    if not found:
        logger.info('No flacs without MD5 found')


def parse_args():
    parser = argparse.ArgumentParser(prog='Add MD5')
    parser.add_argument('dirpath')
    parser.add_argument('-r', '--recursive', help='Include subdirs', action='store_true')
    parser.add_argument('-c', '--check_only', help='don\'t add MD5s, just print the flacs that don\'t have them.',
                        action='store_true')
    args = parser.parse_args()

    return args.dirpath, args.recursive, args.check_only


if __name__ == '__main__':
    logger.setLevel(logging.DEBUG)
    logger.addHandler(logging.StreamHandler(stream=sys.stdout))
    main(*parse_args())

usage: script_name [-h] [-r] [-c] dirpath

positional arguments:
  dirpath

options:
  -h, --help        show this help message and exit
  -r, --recursive   Include subdirs
  -c, --check_only  don't add MD5s, just print the flacs that don't have them.
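
The decoder flags in the script (--force-raw-format, --endian=little, --sign=signed) matter because the FLAC format defines the STREAMINFO MD5 over the decoded samples, interleaved across channels and stored as signed little-endian integers. The sketch below illustrates that for the 16-bit case only, along with the hex-to-integer conversion the script uses before assigning md5_signature. The helper names are hypothetical, and other bit depths would need a different sample width.

```python
import hashlib
import struct


def pcm16_md5(channels: list[list[int]]) -> str:
    """MD5 hex digest of 16-bit samples, interleaved, signed, little-endian."""
    md_five = hashlib.md5()
    # zip(*channels) yields one frame at a time: (left[0], right[0]), ...
    for frame in zip(*channels):
        md_five.update(struct.pack(f"<{len(frame)}h", *frame))
    return md_five.hexdigest()


def md5_hex_to_int(md5_hex: str) -> int:
    """The integer form used by the script: the digest read as base 16."""
    return int(md5_hex, 16)


def md5_int_to_bytes(value: int) -> bytes:
    """Back to the 16 raw digest bytes (big-endian, zero-padded)."""
    return value.to_bytes(16, "big")
```

Passing two equal-length lists, for example pcm16_md5([left, right]), hashes the samples in the order left[0], right[0], left[1], right[1], and so on.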

Describe alternatives you've considered.

You can write a wrapper to do this instead, but having it as a flag, or perhaps as the default behavior when sourcing from Qobuz, may be beneficial to others.

999wqe9q9ewq9, Jun 18 '24 20:06

I use the same script posted above, but it would be nice to have it integrated so there is one less command I need to run (or one less function in my alias).

JFizDaWiz, Jul 25 '24 20:07