audiobookshelf
[Bug] Audiobook files are listed twice or more
What happened?
Some books (not all) have their library audio files listed twice or more. The audio tracks listing shows the same problem: the same file appears multiple times. The duplicates also inflate the stats cumulatively.
Example 1
ABS
OS
Example 2
When downloading the audiobook using the app (Android 14, v0.9.74-beta), it actually tries to download each listed audio file (in example 1 above: 2 files). After the file is downloaded the first time, the process never completes.
I tried both rescanning the library and the individual items, to no avail.
As a side note, I think both audiobooks in the examples are ones whose file path and/or name I changed after they had already been added to the library, and I then updated the library.
What did you expect to happen?
The audio file should be recognized only once.
Steps to reproduce the issue
- Add audiobooks to library
- (Possibly related) rename and/or move files on OS
Audiobookshelf version
v2.10.1
How are you running audiobookshelf?
Docker
What OS is your Audiobookshelf server hosted from?
Linux
If the issue is being seen in the UI, what browsers are you seeing the problem on?
Other (list in "Additional Notes" box)
Logs
[2024-06-01 08:46:59.489] DEBUG: [Scan] "Howey, Hugh/Silo Saga #2 - Shift Omnibus Edition": Library file "metadata.json" for library item "/audiobooks/Howey, Hugh/Silo Saga #2 - Shift Omnibus Edition" key "birthtimeMs" changed from "1717184994186" to "1717210836321" (ScanLogger.js:65)
[2024-06-01 08:46:59.491] DEBUG: [Scan] "Howey, Hugh/Silo Saga #2 - Shift Omnibus Edition": Library item "Howey, Hugh/Silo Saga #2 - Shift Omnibus Edition" changed: [size,lastScan] (ScanLogger.js:65)
[2024-06-01 08:47:00.251] DEBUG: [ApiCacheManager] libraryItem.afterUpdate: Clearing cache (ApiCacheManager.js:21)
[2024-06-01 08:47:00.310] DEBUG: [AudioFileScanner] Smart track order for "Howey, Hugh/Silo Saga #2 - Shift Omnibus Edition" using track key trackNumFromFilename (AudioFileScanner.js:94)
[2024-06-01 08:47:00.311] DEBUG: [Scan] "Howey, Hugh/Silo Saga #2 - Shift Omnibus Edition": "Silo Saga #2 - Shift Omnibus Edition" Getting metadata with precedence [folderStructure, txtFiles, opfFile, absMetadata, audioMetatags] (ScanLogger.js:65)
[2024-06-01 08:47:00.343] DEBUG: [Scan] "Howey, Hugh/Silo Saga #2 - Shift Omnibus Edition": Found metadata file "/audiobooks/Howey, Hugh/Silo Saga #2 - Shift Omnibus Edition/metadata.json" (ScanLogger.js:65)
[2024-06-01 08:47:00.350] DEBUG: [Scan] "Howey, Hugh/Silo Saga #2 - Shift Omnibus Edition": setChapters: Using embedded chapters in first audio file /audiobooks/Howey, Hugh/Silo Saga #2 - Shift Omnibus Edition/Hugh Howey - The Silo Saga #2 - 2013 - Shift Omnibus Edition.m4b (ScanLogger.js:65)
[2024-06-01 08:47:00.859] DEBUG: [ApiCacheManager] book.afterUpdate: Clearing cache (ApiCacheManager.js:21)
[2024-06-01 08:47:00.863] DEBUG: [Scan] "Howey, Hugh/Silo Saga #2 - Shift Omnibus Edition": Success saving abmetadata to "/audiobooks/Howey, Hugh/Silo Saga #2 - Shift Omnibus Edition/metadata.json" (ScanLogger.js:65)
[2024-06-01 08:47:01.201] DEBUG: [ApiCacheManager] libraryItem.afterUpdate: Clearing cache (ApiCacheManager.js:21)
Additional Notes
- OS: Synology DSM 7.1.1-42962 Update 6
- Browser: Brave v1.66.118
Are you able to reproduce this again?
Not on first try (renaming both file path and file name). As I mentioned in my initial post, I'm not sure whether it only affects renamed files, so I can't be sure that renaming is the actual cause.
I tried removing and re-adding one of the books to the library to solve the problem, which it did (audio file is only read once). But I'd obviously lose any progress/history for the affected books. No biggie, but rather unfortunate.
I think for me this is happening when my network drive has some issues and disconnects. At least I'm noticing the disconnects in other services and assumed it might also be the source of the duplicated files in audiobookshelf.
Maybe after reconnection it rescans, detects the old files as new ones, and adds them? For me, only one of the duplicated audio files shows a file size; the others are just "NaN". Unfortunately, I cannot reproduce the issue reliably. I only notice the duplicate files after a while.
Currently I'm fixing it by running this script directly against the SQLite database file:
-- Step 1: Extract and filter JSON objects from the `audiofiles` column
WITH ExtractedEntries AS (
    SELECT
        books.id AS book_id,
        json_each.value AS json_object,
        json_extract(json_each.value, '$.metadata.path') AS path,
        json_extract(json_each.value, '$.addedAt') AS addedAt,
        json_extract(json_each.value, '$.metadata.size') AS size
    FROM books,
         json_each(books.audiofiles)
),
-- Step 2: Filter out objects where `$.metadata.size` is NULL
FilteredEntries AS (
    SELECT
        book_id,
        json_object,
        path,
        addedAt
    FROM ExtractedEntries
    WHERE size IS NOT NULL
),
-- Step 3: Deduplicate by keeping only the entry with the smallest `addedAt`
UniqueEntries AS (
    SELECT
        f1.book_id,
        f1.json_object
    FROM FilteredEntries f1
    WHERE f1.addedAt = (
        SELECT MIN(f2.addedAt)
        FROM FilteredEntries f2
        WHERE f2.book_id = f1.book_id
          AND f2.path = f1.path
    )
)
-- Step 4: Reconstruct the JSON array and update the `audiofiles` column
UPDATE books
SET audiofiles = (
    SELECT '[' || GROUP_CONCAT(json_object) || ']'
    FROM UniqueEntries
    WHERE UniqueEntries.book_id = books.id
)
WHERE id IN (
    SELECT DISTINCT book_id FROM UniqueEntries
)
AND id IN (
    SELECT libraryItems.mediaId
    FROM libraryItems
    INNER JOIN libraries ON libraryItems.libraryId = libraries.id
    WHERE libraries.name = 'Hörbücher' -- NOTE: replace with your library name or change the query to match other libraries
);
sqlite3 path/to/my/config/absdatabase.sqlite < path/to/above/sql/query/in/a/file.sql
Maybe this helps someone to at least work around the issue until a solution might be found.
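Before running an UPDATE like the one above, it may be worth checking which books are actually affected. The following is a minimal read-only sketch in Python, assuming the same `books` table and JSON `audiofiles` column that the script uses; the function name `find_duplicate_audio_files` and the result column names are made up for illustration and are not part of audiobookshelf.

```python
import json
import sqlite3


def find_duplicate_audio_files(conn):
    """List audio file paths that appear more than once within a book.

    Returns rows of (book_id, file_path, copies, copies_without_size).
    Assumes a `books` table whose `audiofiles` column holds a JSON array,
    as in the workaround script above. Nothing is modified.
    """
    query = """
        SELECT
            books.id AS book_id,
            json_extract(entry.value, '$.metadata.path') AS file_path,
            COUNT(*) AS copies,
            -- entries whose size is missing or JSON null (shown as "NaN" in the UI)
            SUM(json_extract(entry.value, '$.metadata.size') IS NULL) AS copies_without_size
        FROM books, json_each(books.audiofiles) AS entry
        GROUP BY books.id, file_path
        HAVING COUNT(*) > 1
        ORDER BY books.id, file_path
    """
    return conn.execute(query).fetchall()


# Example usage (path is a placeholder; mode=ro opens the file read-only):
# conn = sqlite3.connect("file:path/to/my/config/absdatabase.sqlite?mode=ro", uri=True)
# for row in find_duplicate_audio_files(conn):
#     print(row)
```

If this returns no rows, the UPDATE has nothing to deduplicate. Taking a copy of the database file before modifying it is a sensible precaution either way.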
Is this still an issue on the latest version v2.23.0? If so, can someone provide steps to reproduce it?
I haven't encountered this in a long time, so I'd be okay with closing this one as not reproducible.