[BUG]: Thumbnails not downloaded, curl error 404, when scraping subscriptions from a file
Describe the bug
When running ytfzf -t -f -c SI thumbnails are not downloaded and some curl 404 errors go to stderr:
] > ytfzf --thumbnail-log=log.txt -t -f -c SI
Scraping subscriptions with instance: https://invidious.esmailelbob.xyz
DL% UL% Dled Uled Xfers Live Total Current Left Speed
-- -- 1150k 0 18 0 --:--:-- 0:00:05 --:--:-- 212k
Fetching thumbnails...
DL% UL% Dled Uled Xfers Live Total Current Left Speed
-- -- 0 0 36 36 --:--:-- 0:00:03 --:--:-- 0 curl: (22) The requested URL returned error: 404
-- -- 0 0 36 35 --:--:-- 0:00:06 --:--:-- 0 curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
-- -- 0 0 36 32 --:--:-- 0:00:07 --:--:-- 0 curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
-- -- 0 0 36 17 --:--:-- 0:00:09 --:--:-- 0 curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
-- -- 0 0 36 15 --:--:-- 0:00:11 --:--:-- 0 curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
-- -- 0 0 36 11 --:--:-- 0:00:11 --:--:-- 0 curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
-- -- 0 0 36 6 --:--:-- 0:00:12 --:--:-- 0 curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
-- -- 0 0 36 4 --:--:-- 0:00:13 --:--:-- 0 curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
curl: (22) The requested URL returned error: 404
0 -- 0 0 36 0 --:--:-- 0:00:14 --:--:-- 0
To Reproduce
run ytfzf -t -f -c SI with the following subscriptions file:
https://www.youtube.com/channel/UC4PIO2pZaFKzI97uumFTNSg/videos # OficineRobotica
https://www.youtube.com/channel/UCeKpbMimEGgLM_0tnghfoVw/videos # Clough42
https://www.youtube.com/channel/UC7pokUsRb6q2B0FOzSqQLlw/videos # Adventures in creation
https://www.youtube.com/channel/UCw3UZn1tcVe7pH3R6C3Gcng/videos # Abom79
https://www.youtube.com/channel/UCCkSr3M8GXbS4txqPY7OMxQ/videos # Edge Precision
https://www.youtube.com/channel/UC7Jf7t6BL4e74O53dL6arSw/videos # Blondihacks
https://www.youtube.com/channel/UCY8gSLTqvs38bR9X061jFWw/videos # Stefan Gotteswinter
https://www.youtube.com/channel/UC-CubOaooNwC-3RBKUoAOQQ/videos # Joko Engineeringhelp
https://www.youtube.com/channel/UC7aAyIrjeH2RKciAXzdOaJA/videos # Artisan Makes
https://www.youtube.com/channel/UCKLIIdKEpjAnn8E76KP7sQg/videos # mrpete222
https://www.youtube.com/channel/UCyjwQ6oz4cqqtEcWGboSU3g/videos # Keith Rucker - VintageMachinery.org
https://www.youtube.com/channel/UChIs72whgZI9w6d6FhwGGHA/videos # Gamers Nexus
https://www.youtube.com/channel/UCVI8Mfisni3GaobL1e2JOIQ/videos # Inheritance Machining
https://www.youtube.com/channel/UC2wdo5vU7bPBNzyC2nnwmNQ/videos # Cutting Edge Engineering Australia
https://www.youtube.com/channel/UCworsKCR-Sx6R6-BnIjS2MA/videos # Clickspring
https://www.youtube.com/channel/UC9UjDtkpr2I-5G51vMJZvnA/videos # ClickspringClips
https://www.youtube.com/channel/UCiDJtJKMICpb9B1qf7qjEOA/videos # Adam Savage’s Tested
https://www.youtube.com/channel/UCB0wPMJJ2FKqdB-gx7YVsDg/videos # Matty’s Workshop
Expected behavior
Thumbnails similar to those displayed when using the invidious-channel feature
Screenshots

Information
- OS: Archlinux
- Terminal: alacritty
- Ytfzf version: ytfzf: 2.5.5 (from aur ytfzf-git r1963.ac4cc79-1)
- Output of ls -l "$(which sh)" (if you're using fish: ls -l (which sh)): lrwxrwxrwx 1 root root 4 Jan 8 2022 /usr/sbin/sh -> bash*
- (if is a thumbnail issue) run ytfzf --thumbnail-log=log.txt and post the file: The file is empty
Additional context
I did some testing using bash -x to get debug output. Here's download log output from a working invidious-channel scrape:
+ printf 'url="%s"\noutput="/tmp/ytfzf-1000/https:__www.youtube.com_channel_UCB0wPMJJ2FKqdB-gx7YVsDg_videos # Matty’s Workshop-1102122/thumbnails/%s.jpg"\n' https://iv.melmac.space/vi/i6WIRWdGUPg/hqdefault.jpg i6WIRWdGUPg
+ for line in "$@"
+ printf 'url="%s"\noutput="/tmp/ytfzf-1000/https:__www.youtube.com_channel_UCB0wPMJJ2FKqdB-gx7YVsDg_videos # Matty’s Workshop-1102122/thumbnails/%s.jpg"\n' https://iv.melmac.space/vi/J2zZhThFurg/hqdefault.jpg J2zZhThFurg
+ for line in "$@"
+ printf 'url="%s"\noutput="/tmp/ytfzf-1000/https:__www.youtube.com_channel_UCB0wPMJJ2FKqdB-gx7YVsDg_videos # Matty’s Workshop-1102122/thumbnails/%s.jpg"\n' https://iv.melmac.space/vi/B4u8MpH9db8/hqdefault.jpg B4u8MpH9db8
+ for line in "$@"
+ printf 'url="%s"\noutput="/tmp/ytfzf-1000/https:__www.youtube.com_channel_UCB0wPMJJ2FKqdB-gx7YVsDg_videos # Matty’s Workshop-1102122/thumbnails/%s.jpg"\n' https://iv.melmac.space/vi/nIDQzpBLqFo/hqdefault.jpg nIDQzpBLqFo
+ for line in "$@"
+ printf 'url="%s"\noutput="/tmp/ytfzf-1000/https:__www.youtube.com_channel_UCB0wPMJJ2FKqdB-gx7YVsDg_videos # Matty’s Workshop-1102122/thumbnails/%s.jpg"\n' https://iv.melmac.space/vi/0LbDxvvA8Ww/hqdefault.jpg 0LbDxvvA8Ww
+ for line in "$@"
+ printf 'url="%s"\noutput="/tmp/ytfzf-1000/https:__www.youtube.com_channel_UCB0wPMJJ2FKqdB-gx7YVsDg_videos # Matty’s Workshop-1102122/thumbnails/%s.jpg"\n' https://iv.melmac.space/vi/kQOF9cB7Gjw/hqdefault.jpg kQOF9cB7Gjw
+ for line in "$@"
+ printf 'url="%s"\noutput="/tmp/ytfzf-1000/https:__www.youtube.com_channel_UCB0wPMJJ2FKqdB-gx7YVsDg_videos # Matty’s Workshop-1102122/thumbnails/%s.jpg"\n' https://iv.melmac.space/vi/RlHteM78lDo/hqdefault.jpg RlHteM78lDo
And here's one from a non-working -c SI scrape against my subscriptions file:
+ printf 'url="%s"\noutput="/tmp/ytfzf-1000/SCRAPE-SI-1100859/thumbnails/%s.jpg"\n' https://invidious.baczek.me/vi/eCDW3Xm_voE/high.jpg eCDW3Xm_voE
+ for line in "$@"
+ printf 'url="%s"\noutput="/tmp/ytfzf-1000/SCRAPE-SI-1100859/thumbnails/%s.jpg"\n' https://invidious.baczek.me/vi/x6LUpi6W3YA/high.jpg x6LUpi6W3YA
+ for line in "$@"
+ printf 'url="%s"\noutput="/tmp/ytfzf-1000/SCRAPE-SI-1100859/thumbnails/%s.jpg"\n' https://invidious.baczek.me/vi/jX9jzSfVrUA/high.jpg jX9jzSfVrUA
+ for line in "$@"
+ printf 'url="%s"\noutput="/tmp/ytfzf-1000/SCRAPE-SI-1100859/thumbnails/%s.jpg"\n' https://invidious.baczek.me/vi/1qtg1z5V1ss/high.jpg 1qtg1z5V1ss
+ for line in "$@"
+ printf 'url="%s"\noutput="/tmp/ytfzf-1000/SCRAPE-SI-1100859/thumbnails/%s.jpg"\n' https://invidious.baczek.me/vi/AXdFQga0i88/high.jpg AXdFQga0i88
+ curl -fLZ -K /tmp/ytfzf-1000/SCRAPE-SI-1100859/tmp/curl_config
I tried downloading the image from both. https://invidious.baczek.me/vi/1qtg1z5V1ss/hqdefault.jpg contains an image, while https://invidious.baczek.me/vi/1qtg1z5V1ss/high.jpg does not. I can't figure out why the two types of scrapes request different quality images, though. This may be unrelated, since neither URL returns a 404, but I hope it is helpful.
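For anyone who wants to repeat this comparison from a terminal, here is a minimal sketch. The helpers thumb_url and check_qualities are hypothetical names made up for this example and are not part of ytfzf; the URL layout is taken from the trace above. The check itself needs network access, so the call is left commented out.

```shell
#!/bin/sh
# Hypothetical helper (not part of ytfzf): build an Invidious thumbnail URL.
# $1 = instance base URL, $2 = video id, $3 = quality name
thumb_url() {
    printf '%s/vi/%s/%s.jpg\n' "$1" "$2" "$3"
}

# Hypothetical helper: print the HTTP status code for each quality name.
# $1 = instance base URL, $2 = video id, remaining args = quality names
check_qualities() {
    instance=$1
    id=$2
    shift 2
    for quality in "$@"; do
        url=$(thumb_url "$instance" "$id" "$quality")
        # -s: silent, -o /dev/null: discard the body, -w: print only the status code
        code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
        printf '%s %s\n' "$code" "$url"
    done
}

# Example (needs network access):
# check_qualities https://invidious.baczek.me 1qtg1z5V1ss high hqdefault
```

Printing the status code rather than relying on curl -f makes it easier to see exactly what each instance returns for each name.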
This is the only place thumbnails are broken for me. They work with all other searches and scrapes.
Thanks!
You could try using --thumbnail-quality=hqdefault; however, high works for me.
Could it be that this was a misunderstanding of the supported thumbnail types in invidious? There is no high url, but the name of the high thumbnail is hqdefault here: https://github.com/iv-org/invidious/blob/6837e4292829ee0891c73108096b806b63ab1506/src/invidious/videos.cr#L425
I've tried every instance I can find and none of them return anything for https://<instance_url>/vi/AKZRuNZDkGU/high.jpg but they all return an image for https://<instance_url>/vi/AKZRuNZDkGU/hqdefault.jpg
This makes me think the default quality should be hqdefault instead of high. But I'll gladly admit that I don't have a complete understanding of this codebase and could be completely wrong =] .
Also I can reproduce this very consistently with ytfzf --thumbnail-quality=high -t -f -c SI
Tbh, high works for me 99% of the time. If this becomes a bigger issue, I'll change the default to hqdefault. In the meantime, I'd suggest adding thumbnail_quality=hqdefault to your config file.
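As a sketch of that workaround: the config file is a shell file that ytfzf sources, so the setting is a plain variable assignment. The file location is an assumption here (commonly conf.sh under ~/.config/ytfzf/); check your own setup.

```shell
# Fragment for the ytfzf config file (location assumed, see above).
# Use the Invidious thumbnail name directly instead of the "high" alias.
thumbnail_quality=hqdefault
```

The same value can be passed for a single run with --thumbnail-quality=hqdefault, as mentioned earlier in the thread.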
Edit:
I'm kinda dumb, I didn't realize this only really affects subscriptions for some reason, and when scraping SI this bug appears a lot more often for me.
I think it's because scrape_SI... or maybe scrape_subscriptions doesn't call _get_invidious_thumb_quality_name but the other functions like scrape_invidious_playlist do?
You are converting high to hqdefault in that function but without it the thumbnail_quality variable is just high which is what's being passed to invidious in the url as far as I can tell.
_get_invidious_thumb_quality_name () {
    case "$thumbnail_quality" in
        high) thumbnail_quality="hqdefault" ;;
        medium) thumbnail_quality="mqdefault" ;;
        start) thumbnail_quality="1" ;;
        middle) thumbnail_quality="2" ;;
        end) thumbnail_quality="3" ;;
    esac
}
PS. I have no idea how you stay organized in a 3565 line long file.... And also sorry if I'm way off here.
EDIT: I tried adding _get_invidious_thumb_quality_name to the scrape_SI function and it seems to have fixed it. Though not sure if that's the best solution.
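To illustrate the effect of the missing call, here is a small standalone sketch. The mapping function is the one quoted above; build_thumb_url is a hypothetical stand-in for the URL-building step and is not the actual ytfzf code.

```shell
#!/bin/sh
# Mapping function as quoted in this thread.
_get_invidious_thumb_quality_name () {
    case "$thumbnail_quality" in
        high) thumbnail_quality="hqdefault" ;;
        medium) thumbnail_quality="mqdefault" ;;
        start) thumbnail_quality="1" ;;
        middle) thumbnail_quality="2" ;;
        end) thumbnail_quality="3" ;;
    esac
}

# Hypothetical stand-in for the URL-building step of a scraper.
# $1 = instance base URL, $2 = video id
build_thumb_url () {
    printf '%s/vi/%s/%s.jpg\n' "$1" "$2" "$thumbnail_quality"
}

thumbnail_quality=high

# Without the translation, the alias "high" ends up in the URL.
before=$(build_thumb_url https://invidious.baczek.me eCDW3Xm_voE)

# With the translation (the call that seemed to be missing on the SI path).
_get_invidious_thumb_quality_name
after=$(build_thumb_url https://invidious.baczek.me eCDW3Xm_voE)

printf 'without call: %s\n' "$before"
printf 'with call:    %s\n' "$after"
```

The first URL has the same shape as the failing requests in the -c SI trace, and the second matches the working invidious-channel trace.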
> I have no idea how you stay organized in a 3565 line long file
It's hard lol.
> I think it's because scrape_SI... or maybe scrape_subscriptions doesn't call _get_invidious_thumb_quality_name
I think you're right. I will add this patch when I get home. I believe adding the function call is the best solution, but it might be better if it gets called automatically somewhere.
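One possible shape for "called automatically somewhere", purely as a sketch: translate the quality name inside the one helper that builds thumbnail URLs, so individual scrapers cannot forget the call. Both function names below are hypothetical and do not exist in ytfzf; the mapping is the one quoted earlier in the thread.

```shell
#!/bin/sh
# Hypothetical: print the Invidious name for a quality alias.
# Names that are not aliases pass through unchanged, so calling this on an
# already-translated value is harmless.
invidious_quality_name () {
    case "$1" in
        high) echo hqdefault ;;
        medium) echo mqdefault ;;
        start) echo 1 ;;
        middle) echo 2 ;;
        end) echo 3 ;;
        *) echo "$1" ;;
    esac
}

# Hypothetical: the single place where Invidious thumbnail URLs are built.
# $1 = instance base URL, $2 = video id; reads the global thumbnail_quality.
invidious_thumb_url () {
    printf '%s/vi/%s/%s.jpg\n' "$1" "$2" "$(invidious_quality_name "$thumbnail_quality")"
}

thumbnail_quality=high
invidious_thumb_url https://invidious.baczek.me AXdFQga0i88
```

Unlike the quoted function, this variant does not modify thumbnail_quality, which keeps the user's setting intact for scrapers that are not Invidious-based. Whether that fits the existing code is for the maintainer to judge.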
This should now be fixed in the development branch.
Thanks! Just tested and it's working great now.