
Download (for video segments) randomly stops and does not resume/retry

Open fireattack opened this issue 1 year ago • 13 comments

I've noticed this happening a lot lately.

The video segment download will stop at some point and never seems to recover. Restarting usually fixes it.

>ytarchive -w --verbose https://www.youtube.com/live/UbrJIO3rKRU best
ytarchive 0.4.0-68e1bf9
2024/09/05 20:47:46 Channel: KADOKAWAanime
2024/09/05 20:47:46 Video Title: ファンタジア文庫 オンラインフェスティバル2024

2024/09/05 20:47:46 Stream starts at 2024-09-06T10:00:00+00:00 in 76334 seconds.
2024/09/05 20:47:46 Waiting for this time to elapse...
2024/09/06 18:00:17 Stream is 15 seconds late...
2024/09/06 18:00:32 Stream is 30 seconds late...
2024/09/06 18:00:48 Stream is 45 seconds late...
2024/09/06 18:01:03 Stream is 60 seconds late...
2024/09/06 18:01:19 Stream is 75 seconds late...
2024/09/06 18:01:34 Stream is 90 seconds late...

2024/09/06 18:01:35 Selected quality: 1080p (h264)
2024/09/06 18:01:35 Stream started at time 2024-09-06T10:01:27+00:00
2024/09/06 18:01:35 INFO: Starting download to D:\UbrJIO3rKRU__1105781344\ファンタジア文庫 オンラインフェスティバル2024-UbrJIO3rKRU.f140.ts
2024/09/06 18:01:35 INFO: Starting download to D:\UbrJIO3rKRU__1105781344\ファンタジア文庫 オンラインフェスティバル2024-UbrJIO3rKRU.f137.ts
Video Fragments: 261; Audio Fragments: 5660; Max Fragments: 5659; Max Sequence: 5659; Total Downloaded: 212.50MiB
2024/09/06 19:37:49 WARNING: User Interrupt, Stopping download...

fireattack avatar Sep 06 '24 11:09 fireattack

Can you please use the --debug option in the future?

Kethsar avatar Sep 07 '24 08:09 Kethsar

I've tried --trace, which I assume prints even more than --debug, but didn't find anything interesting. Anyway, I will in the future.

fireattack avatar Sep 07 '24 10:09 fireattack

[screenshot: log showing "attempting to retrieve a new download URL" repeating every second]

After months I finally reproduced this. Sorry for the screenshot; for some reason I cannot select text while ytarchive is running.

Just to make it clear: re-creating the task downloads past this fragment easily.

fireattack avatar Nov 27 '24 10:11 fireattack

That's definitely odd to say the least, but not sure I can do anything about it. If it were a deadlock, at least I would know there's an issue in the code. Youtube causing a permanent 403 on a fragment for that specific attempt is not something I can control.

Kethsar avatar Nov 27 '24 14:11 Kethsar

I assume the reason it works fine when I create a new task is that it has entirely different auth info / query parameters / session, etc.

I was thinking maybe we could do something similar within the same process if a fragment has been failing for too long? No idea how difficult it would be, though; just an idea.

fireattack avatar Nov 27 '24 14:11 fireattack

I'm not sure it would, actually. The main factor that would change anything would be creating a new http client, I think, since all other info used is constant. There is no different auth info or query params in a newly started instance. But I just grab a copy of the default http client in Go and make some slight changes to its parameters. I think creating a new client while the program is running won't change anything because of that, since it will presumably use the same base instance and thus have the same fingerprint as far as Youtube can see.

Thinking about it, maybe re-loading the cookies from file and replacing the cookie jar might do something. But it also might not. I'll consider it though.

Kethsar avatar Nov 27 '24 15:11 Kethsar

Thanks. For what it's worth, I usually download anonymously.

fireattack avatar Nov 27 '24 15:11 fireattack

Another idea: maybe ytarchive retrying too frequently is what's causing the constant 403 (and the reason restarting the process fixes it is the manually introduced cooldown in between). From the log, it's retrying at least once per second, which definitely sounds excessive.

We could probably add some exponential backoff on these segment retries.

fireattack avatar Nov 29 '24 03:11 fireattack

Nah, it already waits something like 15 seconds minimum. It doesn't grab new URLs unless you see the "Retrieving URLS..." message.

Kethsar avatar Nov 29 '24 04:11 Kethsar

It did say "attempting to retrieve a new download URL" every second in the screenshot I posted above. Does this not count?

fireattack avatar Nov 29 '24 05:11 fireattack

Nope, since the actual function that tries checks how long it has been since it last grabbed them.

Kethsar avatar Nov 29 '24 06:11 Kethsar

Then I'm confused.

It says it retried 10 times from 18:30:47 to 10:30:58. If it's not making new HTTP requests, what exactly was it retrying?

I assume you mean the actual function responsible for making the HTTP request won't do it if the attempt is too close to the previous one. I.e., it doesn't actually retry 10 times; it's just a misleading print?

fireattack avatar Nov 29 '24 06:11 fireattack

Nothing. It's just a debug message that fires whenever it thinks it should try, before it calls the actual function that tries. That's why it appears even if no full attempt ends up being made. It's not really a wrong print; it's there because there are some errors where it won't immediately try to grab a new set of URLs, and some where it does. This helps me know which case is which when a user posts their logs.

Kethsar avatar Nov 29 '24 06:11 Kethsar