
UpdateTool stops at 'ImdbRatingDatasetFactory.readData'

x5nder opened this issue 11 months ago • 5 comments

UpdateTool always used to work without issues, until last week when Plex notified me of a database corruption. After fixing that, UpdateTool now always gets stuck at the same point:

```
[INFO ] - 2025-01-27 06:55:30 @ ImdbDockerImplementation$ImdbBatchJob.run: LIBRARIES => POST LIBRARY FILTERING
[INFO ] - 2025-01-27 06:55:30 @ ImdbDockerImplementation$ImdbBatchJob.lambda$run$3: Will process library Movies (ID=2) with agent: tv.plex.agents.movie and 5131 item(s).
[INFO ] - 2025-01-27 06:55:30 @ ImdbDockerImplementation$ImdbBatchJob.lambda$run$3: Will process library Movies (4K) (ID=12) with agent: tv.plex.agents.movie and 11 item(s).
[INFO ] - 2025-01-27 06:55:30 @ ImdbDockerImplementation$ImdbBatchJob.lambda$run$3: Will process library Documentaries (ID=25) with agent: tv.plex.agents.movie and 26 item(s).
[INFO ] - 2025-01-27 06:55:30 @ ImdbDockerImplementation$ImdbBatchJob.lambda$run$3: Will process library TV Shows (ID=5) with agent: tv.plex.agents.series and 28702 item(s).
[INFO ] - 2025-01-27 06:55:30 @ ImdbDatabaseSupport.testPlexSqliteBinaryVersion: Plex SQLite binary version: 3.39.4 | ATOMIC_INTRINSICS=1 | COMPILER=clang-11.0.1 | DEFAULT_AUTOVACUUM | DEFAULT_CACHE_SIZE=-2000 | DEFAULT_FILE_FORMAT=4 | DEFAULT_JOURNAL_SIZE_LIMIT=-1 | DEFAULT_MMAP_SIZE=0 | DEFAULT_PAGE_SIZE=4096 | DEFAULT_PCACHE_INITSZ=20 | DEFAULT_RECURSIVE_TRIGGERS | DEFAULT_SECTOR_SIZE=4096 | DEFAULT_SYNCHRONOUS=2 | DEFAULT_WAL_AUTOCHECKPOINT=1000 | DEFAULT_WAL_SYNCHRONOUS=2 | DEFAULT_WORKER_THREADS=0 | ENABLE_COLUMN_METADATA | ENABLE_DBPAGE_VTAB | ENABLE_EXPLAIN_COMMENTS | ENABLE_FTS3 | ENABLE_FTS3_PARENTHESIS | ENABLE_ICU | ENABLE_RTREE | ENABLE_UNLOCK_NOTIFY | MALLOC_SOFT_LIMIT=1024 | MAX_ATTACHED=10 | MAX_COLUMN=2000 | MAX_COMPOUND_SELECT=500 | MAX_DEFAULT_PAGE_SIZE=8192 | MAX_EXPR_DEPTH=2048 | MAX_FUNCTION_ARG=127 | MAX_LENGTH=1000000000 | MAX_LIKE_PATTERN_LENGTH=50000 | MAX_MMAP_SIZE=0x7fff0000 | MAX_PAGE_COUNT=1073741823 | MAX_PAGE_SIZE=65536 | MAX_SQL_LENGTH=1000000000 | MAX_TRIGGER_DEPTH=1000 | MAX_VARIABLE_NUMBER=32766 | MAX_VDBE_OP=250000000 | MAX_WORKER_THREADS=8 | MUTEX_PTHREADS | OMIT_DEPRECATED | SYSTEM_MALLOC | TEMP_STORE=1 | THREADSAFE=1 | 
[INFO ] - 2025-01-27 06:55:30 @ ImdbDatabaseSupport.<init>: NewExtraDataFormat has been identified as: true
[INFO ] - 2025-01-27 06:55:33 @ ImdbDockerImplementation$ImdbBatchJob.run: Library IDs on ignore list: [16]
[INFO ] - 2025-01-27 06:55:33 @ ImdbRatingDatasetFactory.readData: Reading data...
```

Any idea what could be wrong?

x5nder avatar Jan 27 '25 15:01 x5nder

Do you run this in Docker or on bare metal?

If it gets stuck at that point, something very weird and I/O-related must be happening on your system.

I just tested it on my machine (TM):

```
[INFO ] - 2025-02-08 01:21:45 @ ImdbRatingDatasetFactory.requestSet: IMDB Dataset has the timestamp: 1730459790615 and violates the update every 86400000 ms constraint. Refreshing dataset...
[INFO ] - 2025-02-08 01:21:45 @ ImdbRatingDatasetFactory.downloadData: Downloading IMDB rating set from: https://datasets.imdbws.com/title.ratings.tsv.gz
[INFO ] - 2025-02-08 01:21:45 @ ImdbRatingDatasetFactory.downloadData: Download succeeded @ ./__tmp_rating.gz
[INFO ] - 2025-02-08 01:21:45 @ ImdbRatingDatasetFactory.extractData: Extracting dataset...
[INFO ] - 2025-02-08 01:21:45 @ ImdbRatingDatasetFactory.extractData: Extraction completed.
[INFO ] - 2025-02-08 01:21:45 @ ImdbRatingDatasetFactory.readData: Reading data...
[INFO ] - 2025-02-08 01:21:47 @ ImdbRatingDatasetFactory.readData: 1531107 lines read.
```

and it is working well.
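
The refresh decision in the first log line above is just a timestamp comparison against a 24-hour window (86400000 ms). A minimal sketch of that check, using hypothetical names rather than the actual UpdateTool fields:

```java
public class StalenessCheckSketch {
    // 24 hours in milliseconds, matching the constraint quoted in the log
    private static final long UPDATE_EVERY_MS = 86_400_000L;

    /** True if the cached dataset is older than the update window. */
    static boolean needsRefresh(long datasetTimestampMs) {
        return System.currentTimeMillis() - datasetTimestampMs > UPDATE_EVERY_MS;
    }

    public static void main(String[] args) {
        long cached = 1_730_459_790_615L; // timestamp from the log above
        System.out.println("Refresh needed: " + needsRefresh(cached));
    }
}
```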

https://github.com/mynttt/UpdateTool/blob/117b2e16be4bfdfb6ffdc8a848bae34341b8c32a/src/main/java/updatetool/imdb/ImdbRatingDatasetFactory.java#L154-L167

Looking at the code: between "Reading data..." and "xyz lines read" there is only a section that reads the lines from the file. The buffered reader would also fail with an exception if it could not open the file.
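
For reference, a minimal sketch of what that read section does in spirit (not the actual UpdateTool code; see the permalink above for that):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadDataSketch {
    public static void main(String[] args) throws IOException {
        long lines = 0;
        System.out.println("Reading data...");
        // Files.newBufferedReader throws immediately (e.g. NoSuchFileException)
        // if the file cannot be opened, so a silent hang here is unexpected.
        try (BufferedReader reader = Files.newBufferedReader(Path.of("title.ratings.tsv"))) {
            while (reader.readLine() != null) {
                lines++; // the real code also parses each TSV line into a rating entry
            }
        }
        System.out.println(lines + " lines read.");
    }
}
```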

So if it still hangs, for some reason the file is not being processed (?). It would be interesting to know what is written in the file and whether it is actually even a valid TSV file.
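
A quick sanity check along these lines would show whether the extracted file is well-formed; the file name here is an assumption, since the log only shows the compressed download path (./__tmp_rating.gz):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TsvSanityCheck {
    public static void main(String[] args) throws IOException {
        // Assumed name of the extracted dataset; adjust to your setup.
        Path file = Path.of("title.ratings.tsv");
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            // The IMDB ratings dataset header is: tconst, averageRating, numVotes
            String header = reader.readLine();
            System.out.println("Header: " + header);
            System.out.println("Looks like a 3-column TSV: "
                    + (header != null && header.split("\t").length == 3));
            // Print a few data rows for eyeballing
            for (int i = 0; i < 3; i++) {
                System.out.println(reader.readLine());
            }
        }
    }
}
```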

mynttt avatar Feb 08 '25 00:02 mynttt

If I disable TV show parsing (NO_TV), there are no issues. I think it might be an issue with the size of my TV library (30,000 episodes). However, increasing the memory size from 256m to 512m actually causes some issues with my Unraid installation (99% Docker CPU usage by UpdateTool, access to the Unraid interface dropping out), so I think I might just leave NO_TV enabled... I mainly care about movie ratings anyway.
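
For what it's worth, a back-of-the-envelope estimate suggests why a 256m heap is tight here: the ratings dataset alone is roughly 1.5 million lines (see the maintainer's log above), and holding that many map entries eats a large share of the heap before any library items are processed. The per-entry cost below is a rough assumption, not a measurement:

```java
public class HeapEstimateSketch {
    public static void main(String[] args) {
        long entries = 1_531_107;  // line count from the log above
        long bytesPerEntry = 100;  // assumed: map node + short key string + value object
        long totalMb = entries * bytesPerEntry / (1024 * 1024);
        System.out.println("~" + totalMb + " MB just for the rating map");
    }
}
```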

x5nder avatar Mar 16 '25 11:03 x5nder

The last thing you could do is send me a link to your database via email. It would be interesting to see whether this behavior reproduces on my system.

mynttt avatar Mar 16 '25 11:03 mynttt

Tell me what you need exactly and where to send it, and I'll send it over! I'll also include a screenshot of the docker settings.

x5nder avatar Mar 16 '25 11:03 x5nder

Sure!

The file is usually located in:

Plex Media Folder/Plug-in Support/Databases

and named com.plexapp.plugins.library.db

In Unraid this will likely be in the data folder of the plex container.

You can then send it to [email protected]

mynttt avatar Mar 16 '25 11:03 mynttt

Hi Marc, I am having the same issue as the original poster of this thread. My library also has thousands of items. The tool hangs at ImdbRatingDatasetFactory.readData: Reading data... Is there a way to update the show ratings only and not look up episode ratings? I am not interested in episode ratings at all; I just need the show ratings on the main page. That would reduce the number of items to process, and the tool would probably not hang. Let me know if that is possible, or if there is a cure for this hanging issue with large libraries. Thanks for all your help.

palwalus avatar Aug 25 '25 17:08 palwalus

Working

https://github.com/yashu1wwww/Imdb-Movies-And-TV-Shows-Ratings-Count-Tool

yashu1wwww avatar Aug 26 '25 05:08 yashu1wwww

Closed - discussion continues in #135

Thanks to @ShrinkWrapper, the culprit behind this issue has been identified - it is now possible to create a fix for it. I hope to get it done next week so all of you can use this tool again without having to increase memory.

mynttt avatar Oct 09 '25 23:10 mynttt