audiobookshelf icon indicating copy to clipboard operation
audiobookshelf copied to clipboard

[Bug]: Published year not displaying on some scraped books if date is encoded as yyyy-mm-dd

Open tehguitarist opened this issue 1 year ago • 5 comments

Describe the issue

This is something I noticed sometime around v2.4, but I noticed some books didn't have published years, and when they were quick matched, the year didn't populate. When doing a manual match, the year shows up as the year, with some text underneath saying "Currently: yyyy-mm-dd"

It would be good to either have a method to migrate these dates to years, or for ABS to ignore the trailing numbers, given all years should be 4 characters. I suppose an edge case might be mm-dd-yyyy or dd-mm-yyyy, so a test could be used to check if the year is at the front or back, looking for the dashes, or if the 4 characters fall between -5000 to 3000 (to cover first recorded books, into the next thousand years).

Steps to reproduce the issue

  1. Encode the date of a book as yyyy-mm-dd e.g 2013-1-1
  2. Observe that there is no date in the Published Date metadata field under Edit Metadata on an audiobook.
  3. Observe in manual matching that it is listed as "Currently 2013.1.1

Audiobookshelf version

2.5.0

How are you running audiobookshelf?

Docker

tehguitarist avatar Nov 01 '23 04:11 tehguitarist

Upon some investigation, it looks like many tags have Published Year as the published date. I'll see if I can drum up a python script to fix this, if it's not in scope for ABS

tehguitarist avatar Nov 01 '23 05:11 tehguitarist

Is this similar? https://github.com/advplyr/audiobookshelf/issues/2258#issuecomment-1783913722

nichwall avatar Nov 01 '23 06:11 nichwall

I don't think so. Looking at one of the newly created metadata.json files from the recent update I'm seeing this:

"publishedYear": "2023-01-01", "publishedDate": null,

So the fields are just back to front. This script can be used to fix it - I'll probably combine it with a bash find command like the below (linux/mac)

Just pass the path of the file as an argument Here is an example script to run it if the below code is saved somewhere as datefixer.py

for i in "$(find ~+ -name 'metadata.json' -type f -print0 | xargs -0 echo)"; 
do
    python /path/to/datefixer.py $i
done
import json
import os
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('filePath')
args = parser.parse_args()
filename = args.filePath

with open(filename, 'r') as f:
    data = json.load(f)
    if (data['publishedDate'] is None and data['publishedYear'] is not None):
        data['publishedDate'] = data['publishedYear']
    if ('-' not in data['publishedDate'][:4]):
        print("true!!")
        data['publishedYear'] = data['publishedDate'][:4]
    elif ('-' not in data['publishedYear'][-4:]):
        data['publishedYear'] = data['publishedDate'][-4:]

os.remove(filename)
with open(filename, 'w') as f:
    json.dump(data, f, indent=4)

Though now I test it, I realise modifying the metadata.json doesn't change the metadata in ABS (without rescanning).

tehguitarist avatar Nov 01 '23 06:11 tehguitarist

Currently published date isn't used anywhere in the UI

advplyr avatar Nov 10 '23 22:11 advplyr

https://github.com/id3/ID3v2.4/blob/master/id3v2.4.0-changes.txt id3v2.4 uses new frames for dates (<=v2.3 ones are deprecated), with following formats allowed: yyyy yyyy-MM yyyy-MM-dd yyyy-MM-ddTHH yyyy-MM-ddTHH:mm yyyy-MM-ddTHH:mm:ss same issue happens with mp4/m4a. some tools enforce strict format e.g. exiftool response to trying to set "Content Create Date" which is presumably used for "Publishing Year" in ABS

Warning: Invalid date/time (use YYYY:mm:dd HH:MM:SS[.ss][+/-HH:MM|Z]) in ItemList:ContentCreateDate (PrintConvInv)
Nothing to do.

would it be possible to strip everything after "year"?

scapix avatar May 04 '24 09:05 scapix