UpdateTool stops at 'ImdbRatingDatasetFactory.readData'
UpdateTool used to work without issues until last week, when Plex notified me of a database corruption. Since I fixed that, UpdateTool always gets stuck at the same point:
[INFO ] - 2025-01-27 06:55:30 @ ImdbDockerImplementation$ImdbBatchJob.run: LIBRARIES => POST LIBRARY FILTERING
[INFO ] - 2025-01-27 06:55:30 @ ImdbDockerImplementation$ImdbBatchJob.lambda$run$3: Will process library Movies (ID=2) with agent: tv.plex.agents.movie and 5131 item(s).
[INFO ] - 2025-01-27 06:55:30 @ ImdbDockerImplementation$ImdbBatchJob.lambda$run$3: Will process library Movies (4K) (ID=12) with agent: tv.plex.agents.movie and 11 item(s).
[INFO ] - 2025-01-27 06:55:30 @ ImdbDockerImplementation$ImdbBatchJob.lambda$run$3: Will process library Documentaries (ID=25) with agent: tv.plex.agents.movie and 26 item(s).
[INFO ] - 2025-01-27 06:55:30 @ ImdbDockerImplementation$ImdbBatchJob.lambda$run$3: Will process library TV Shows (ID=5) with agent: tv.plex.agents.series and 28702 item(s).
[INFO ] - 2025-01-27 06:55:30 @ ImdbDatabaseSupport.testPlexSqliteBinaryVersion: Plex SQLite binary version: 3.39.4 | ATOMIC_INTRINSICS=1 | COMPILER=clang-11.0.1 | DEFAULT_AUTOVACUUM | DEFAULT_CACHE_SIZE=-2000 | DEFAULT_FILE_FORMAT=4 | DEFAULT_JOURNAL_SIZE_LIMIT=-1 | DEFAULT_MMAP_SIZE=0 | DEFAULT_PAGE_SIZE=4096 | DEFAULT_PCACHE_INITSZ=20 | DEFAULT_RECURSIVE_TRIGGERS | DEFAULT_SECTOR_SIZE=4096 | DEFAULT_SYNCHRONOUS=2 | DEFAULT_WAL_AUTOCHECKPOINT=1000 | DEFAULT_WAL_SYNCHRONOUS=2 | DEFAULT_WORKER_THREADS=0 | ENABLE_COLUMN_METADATA | ENABLE_DBPAGE_VTAB | ENABLE_EXPLAIN_COMMENTS | ENABLE_FTS3 | ENABLE_FTS3_PARENTHESIS | ENABLE_ICU | ENABLE_RTREE | ENABLE_UNLOCK_NOTIFY | MALLOC_SOFT_LIMIT=1024 | MAX_ATTACHED=10 | MAX_COLUMN=2000 | MAX_COMPOUND_SELECT=500 | MAX_DEFAULT_PAGE_SIZE=8192 | MAX_EXPR_DEPTH=2048 | MAX_FUNCTION_ARG=127 | MAX_LENGTH=1000000000 | MAX_LIKE_PATTERN_LENGTH=50000 | MAX_MMAP_SIZE=0x7fff0000 | MAX_PAGE_COUNT=1073741823 | MAX_PAGE_SIZE=65536 | MAX_SQL_LENGTH=1000000000 | MAX_TRIGGER_DEPTH=1000 | MAX_VARIABLE_NUMBER=32766 | MAX_VDBE_OP=250000000 | MAX_WORKER_THREADS=8 | MUTEX_PTHREADS | OMIT_DEPRECATED | SYSTEM_MALLOC | TEMP_STORE=1 | THREADSAFE=1 |
[INFO ] - 2025-01-27 06:55:30 @ ImdbDatabaseSupport.<init>: NewExtraDataFormat has been identified as: true
[INFO ] - 2025-01-27 06:55:33 @ ImdbDockerImplementation$ImdbBatchJob.run: Library IDs on ignore list: [16]
[INFO ] - 2025-01-27 06:55:33 @ ImdbRatingDatasetFactory.readData: Reading data...
Any idea what could be wrong?
Do you run this in Docker or on bare metal?
If it gets stuck at that point, something very weird and I/O-related must be happening on your system.
I just tested it on my machine (TM):
[INFO ] - 2025-02-08 01:21:45 @ ImdbRatingDatasetFactory.requestSet: IMDB Dataset has the timestamp: 1730459790615 and violates the update every 86400000 ms constraint. Refreshing dataset...
[INFO ] - 2025-02-08 01:21:45 @ ImdbRatingDatasetFactory.downloadData: Downloading IMDB rating set from: https://datasets.imdbws.com/title.ratings.tsv.gz
[INFO ] - 2025-02-08 01:21:45 @ ImdbRatingDatasetFactory.downloadData: Download succeeded @ ./__tmp_rating.gz
[INFO ] - 2025-02-08 01:21:45 @ ImdbRatingDatasetFactory.extractData: Extracting dataset...
[INFO ] - 2025-02-08 01:21:45 @ ImdbRatingDatasetFactory.extractData: Extraction completed.
[INFO ] - 2025-02-08 01:21:45 @ ImdbRatingDatasetFactory.readData: Reading data...
[INFO ] - 2025-02-08 01:21:47 @ ImdbRatingDatasetFactory.readData: 1531107 lines read.
and it is working well.
https://github.com/mynttt/UpdateTool/blob/117b2e16be4bfdfb6ffdc8a848bae34341b8c32a/src/main/java/updatetool/imdb/ImdbRatingDatasetFactory.java#L154-L167
Looking at the code, the only thing between "Reading data..." and "xyz lines read" is a section that reads the lines from the file. The buffered reader would also throw an exception if it failed to open the file.
So if it still hangs there, for some reason the file is not being processed (?). It would be interesting to know what is in the file and whether it is even a valid TSV file.
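If you want to check the file yourself, here is a minimal sketch of a standalone validator. It is not code from UpdateTool; the class name `TsvCheck` and its methods are made up for illustration. It assumes the extracted file has one header line followed by rows of three tab-separated fields (title id starting with `tt`, average rating, vote count), which is how the IMDb ratings dataset is normally laid out. The path to the extracted file depends on your setup, so it is passed as an argument.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of UpdateTool.
public class TsvCheck {

    /** Counts of well-formed and malformed data rows (header excluded). */
    public static final class Result {
        public final long valid;
        public final long invalid;

        Result(long valid, long invalid) {
            this.valid = valid;
            this.invalid = invalid;
        }
    }

    // Reads the input line by line and checks that every data row has exactly
    // three tab-separated fields: an id starting with "tt" and two numbers.
    public static Result check(Reader source) {
        long valid = 0;
        long invalid = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line = reader.readLine(); // assumption: first line is the header
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split("\t", -1);
                if (fields.length == 3
                        && fields[0].startsWith("tt")
                        && isNumber(fields[1])
                        && isNumber(fields[2])) {
                    valid++;
                } else {
                    invalid++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new Result(valid, invalid);
    }

    private static boolean isNumber(String s) {
        try {
            Double.parseDouble(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the extracted ratings file
        try (Reader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            Result r = check(in);
            System.out.println(r.valid + " valid rows, " + r.invalid + " invalid rows");
        }
    }
}
```

If this finishes quickly and reports no invalid rows, the file itself is probably fine and the hang is more likely related to memory or I/O on the host.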
If I disable TV show parsing (NO_TV) there are no issues. I think it might be caused by the size of my TV library (30,000 episodes). However, increasing the memory size from 256m to 512m actually causes some issues with my Unraid installation (99% docker CPU usage by UpdateTool, access to the Unraid interface dropping out), so I think I might just leave NO_TV enabled... I mainly care about movie ratings anyway.
One last thing you could do is send me a link to your database via email. It would be interesting to see whether this behavior reproduces on my system.
Tell me what you need exactly and where to send it, and I'll send it over! I'll also include a screenshot of the docker settings.
Sure!
The file is usually located in:
Plex Media Folder/Plug-in Support/Databases
and named com.plexapp.plugins.library.db
In Unraid this will likely be in the data folder of the plex container.
You can then send it to [email protected]
Hi Marc, I am having the same issue as the original poster of this thread. My library also has thousands of items. The tool hangs at ImdbRatingDatasetFactory.readData: Reading data... Is there a way to update only the show ratings and not look for episode ratings? I am not interested in episode ratings at all. I just need the show ratings on the main page. That would reduce the number of items to process, and the tool would probably not hang. Let me know if that is possible or if there is a cure for this hanging issue with large libraries. Thanks for all your help.
Working
https://github.com/yashu1wwww/Imdb-Movies-And-TV-Shows-Ratings-Count-Tool
Closed - discussion continues in #135
Thanks to @ShrinkWrapper, the culprit of this issue has been identified, so it is now possible to create a fix for it. I hope to get it done next week so all of you can use this tool again without having to increase memory.