[Bug] Captions API does not error out properly
Describe the bug
The captions API does not return a proper error when it fails. This only happens when Google blocks access.
There are two problems to address:
- Why Google is blocking access to captions
- Why Invidious is returning the wrong content-type and the wrong HTTP status code
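For the second point, here is a rough sketch of the behavior I would expect, written in Python purely for illustration (Invidious itself is not Python). The names `looks_like_webvtt` and `relay_captions`, the 502 status, and the JSON error shape are all my assumptions, not existing Invidious code. The only hard fact used is that a WebVTT file starts with the string `WEBVTT`, optionally preceded by a BOM.

```python
import json

WEBVTT_SIGNATURE = "WEBVTT"


def looks_like_webvtt(body: str) -> bool:
    """Return True if the body starts with the WebVTT signature line."""
    # A single leading byte order mark is allowed before the signature.
    if body.startswith("\ufeff"):
        body = body[1:]
    if not body.startswith(WEBVTT_SIGNATURE):
        return False
    rest = body[len(WEBVTT_SIGNATURE):]
    # The signature must be followed by whitespace, a newline, or end of file.
    return rest == "" or rest[0] in " \t\r\n"


def relay_captions(upstream_status: int, upstream_body: str):
    """Hypothetical handler: only relay the body as text/vtt if it is WebVTT.

    Returns a (status, content_type, body) tuple.
    """
    if upstream_status == 200 and looks_like_webvtt(upstream_body):
        return 200, "text/vtt; charset=UTF-8", upstream_body
    # Assumption: a 502 with a JSON error body; the exact status is debatable.
    error = json.dumps({"error": "Upstream did not return valid captions"})
    return 502, "application/json", error
```

With something like this, the Google "Sorry..." page would never be forwarded with a 200 and a `text/vtt` header, and clients could tell a block apart from real captions.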
Steps to Reproduce
This is only reproducible while captions are being blocked.
- Open https://vid.puffyan.us/api/v1/captions/k2RKtUh6m3Q?label=English in browser
- Observe returned data, content, and headers
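The steps above can also be checked from a script instead of the browser. This is a minimal sketch using only the Python standard library; `diagnose` and `fetch_and_diagnose` are names I made up, and the HTML detection is a simple prefix check, not a real parser.

```python
import urllib.error
import urllib.request


def diagnose(status: int, content_type: str, body: str) -> list[str]:
    """List mismatches between what the response claims and what it contains."""
    problems = []
    start = body.lstrip("\ufeff \t\r\n")[:64].lower()
    is_html = start.startswith("<html") or start.startswith("<!doctype html")
    is_vtt = body.lstrip("\ufeff").startswith("WEBVTT")
    advertised_vtt = content_type.lower().startswith("text/vtt")
    if advertised_vtt and not is_vtt:
        problems.append("content-type is text/vtt but body is not WebVTT")
    if is_html and 200 <= status < 300:
        problems.append("HTML error page served with a success status")
    return problems


def fetch_and_diagnose(url: str) -> list[str]:
    """Fetch a captions URL and run diagnose() on the response."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            status = resp.status
            content_type = resp.headers.get("Content-Type", "")
            body = resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # Non-2xx responses raise; inspect them the same way.
        status = err.code
        content_type = err.headers.get("Content-Type", "")
        body = err.read().decode("utf-8", errors="replace")
    return diagnose(status, content_type, body)
```

Calling `fetch_and_diagnose("https://vid.puffyan.us/api/v1/captions/k2RKtUh6m3Q?label=English")` should only report problems while the instance is being blocked; otherwise the list should be empty.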
Logs
- Content:
<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"/><title>Sorry...</title><style> body { font-family: verdana, arial, sans-serif; background-color: #fff; color: #000; }</style></head><body><div><table><tr><td><b><font face=sans-serif size=10><font color=#4285f4>G</font><font color=#ea4335>o</font><font color=#fbbc05>o</font><font color=#4285f4>g</font><font color=#34a853>l</font><font color=#ea4335>e</font></font></b></td><td style="text-align: left; vertical-align: bottom; padding-bottom: 15px; width: 50%"><div style="border-bottom: 1px solid #dfdfdf;">Sorry...</div></td></tr></table></div><div style="margin-left: 4em;"><h1>We're sorry...</h1><p>... but your computer or network may be sending automated queries. To protect our users, we can't process your request right now.</p></div><div style="margin-left: 4em;">See <a href="https://support.google.com/websearch/answer/86640">Google Help</a> for more information.<br/><br/></div><div style="text-align: center; border-top: 1px solid #dfdfdf;"><a href="https://www.google.com">Google Home</a></div></body></html>
Content text:
We're sorry...
... but your computer or network may be sending automated queries.
To protect our users, we can't process your request right now.
- Status Code: 200 OK
- Headers:
access-control-allow-origin: *
content-type: text/vtt; charset=UTF-8 <========= notice how it advertises text/vtt content, but the body is HTML
content-encoding: gzip
content-security-policy: default-src 'none'; script-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self' data:; font-src 'self' data:; connect-src 'self'; manifest-src 'self'; media-src 'self' blob: https://*.googlevideo.com:443 https://*.youtube.com:443; child-src 'self' blob:; frame-src 'self'; frame-ancestors 'none'
date: Sun, 06 Nov 2022 00:35:34 GMT
onion-location: <URL>
permissions-policy: interest-cohort=()
referrer-policy: same-origin
server: nginx/1.18.0
strict-transport-security: max-age=31536000; includeSubDomains; preload
strict-transport-security: max-age=63072000; includeSubDomains; preload
x-content-type-options: nosniff
x-frame-options: sameorigin
x-xss-protection: 1; mode=block
Using [Version 1.48.158 Chromium: 110.0.5481.77 (Official Build) (64-bit)]
Brave shields off.
Extensions: Dark Reader, SimpleLogin (Email relay), LibRedirect, Google Analytics Opt-out Add-on, ClearURLs.
I turned off all of these and the problem still persists.
I tried to turn on captions on this video: https://inv.vern.cc/watch?v=zuAUjOcby0g
Result:

Request URL: https://inv.vern.cc/api/v1/annotations/zuAUjOcby0g
Request Method: GET
Status Code: 500
Remote Address: 167.114.67.70:443 (It's not my ip)
Referrer Policy: same-origin
access-control-allow-origin: *
content-encoding: gzip
content-security-policy: default-src 'none'; script-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self' data:; font-src 'self' data:; connect-src 'self'; manifest-src 'self'; media-src 'self' blob:; child-src 'self' blob:; frame-src 'self'; frame-ancestors 'none'
content-type: text/xml
date: Wed, 15 Feb 2023 23:44:27 GMT
permissions-policy: interest-cohort=()
referrer-policy: same-origin
server: nginx/1.18.0
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-content-type-options: nosniff
x-frame-options: sameorigin
x-xss-protection: 1; mode=block
:authority: inv.vern.cc
:method: GET
:path: /api/v1/annotations/zuAUjOcby0g
:scheme: https
accept: /
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
cookie: SID=kbX4-hNRJyihfyLzzfDdjjCwPlq0-yxDtAeevUTzARY=; PREFS=%7B%22volume%22%3A40%2C%22speed%22%3A1%7D
dnt: 1
referer: https://inv.vern.cc/watch?v=zuAUjOcby0g
sec-ch-ua: "Chromium";v="110", "Not A(Brand";v="24", "Brave";v="110"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-origin
sec-gpc: 1
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36
This issue has been automatically marked as stale and will be closed in 30 days because it has not had recent activity and is much likely outdated. If you think this issue is still relevant and applicable, you just have to post a comment and it will be unmarked.
Still returns a 200 and a text/vtt content type even when the body is not valid WebVTT.