no element found

To Reproduce
Steps to reproduce the behavior:
youtube_transcript_api CHvUp1rynek

Which Python version are you using?
Python 3.12.10

Which version of youtube-transcript-api are you using?
youtube-transcript-api 1.0.3

Expected behavior
I expected to receive the English transcript.

Actual behaviour
Instead I received the following error message:
no element found: line 1, column 0
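For context on where this message comes from: "no element found: line 1, column 0" is the ParseError that Python's xml.etree.ElementTree raises when asked to parse an empty document, which is consistent with later reports in this thread of YouTube returning an empty 200 response. A minimal reproduction:

```python
import xml.etree.ElementTree as ET

# Parsing an empty body -- what the library ends up doing when YouTube
# returns an empty 200 response -- reproduces the exact error message.
try:
    ET.fromstring("")
except ET.ParseError as exc:
    print(exc)  # no element found: line 1, column 0
```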
https://github.com/Kakulukian/youtube-transcript/issues/45#issuecomment-2953657921
this happened to me also but I avoided it entirely by passing the cookies file
> this happened to me also but I avoided it entirely by passing the cookies file

For me, with or without the cookie file it makes no difference.
I'm running into this as well.
@muflone Could you share more about your workaround?
This is currently working for me for version 1.0.3:
youtube_transcript_api foA4Sl_xlMc --language it --cookies ./cookies.txt --format text
Running into the same issue! I am using the SDK, btw. I made sure EN manual transcripts were available, then fetched them, but stumbled into this error. My code hadn't changed for months, but this has been happening for days.
Also having this issue. I need around 5 retries to get a transcript now. It was working much better two weeks ago for the same videos.
It seems that YouTube raised its guards; it is also discussed here: #414
> Also having this issue. I need around 5 retries to get a transcript now. It was working much better two weeks ago for the same videos.
> It seems that YouTube raised its guards; it is also discussed here: #414
@danrosenberg ah interesting, I think I stop at 5 retries using a residential proxy. Seems like it's a matter of retrying, and of sticking with this library versus a nodejs one.
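Since several people report that a handful of retries eventually succeeds, a small backoff wrapper is an easy stopgap. This is a generic sketch, not part of the library: `fetch` stands for whatever call you use (for example a function wrapping the SDK's fetch), and the retry count and delay are arbitrary assumptions.

```python
import time

def fetch_with_retries(fetch, video_id, max_retries=5, base_delay=1.0):
    """Call fetch(video_id), retrying with exponential backoff on failure."""
    last_exc = None
    for attempt in range(max_retries):
        try:
            return fetch(video_id)
        except Exception as exc:  # e.g. a parse error or HTTP 429
            last_exc = exc
            if attempt < max_retries - 1:
                time.sleep(base_delay * 2 ** attempt)
    raise last_exc
```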
> Running into the same issue! I am using the SDK, btw. I made sure EN manual transcripts were available, then fetched them, but stumbled into this error. My code hadn't changed for months, but this has been happening for days.
@maherbel I first encountered the error using the SDK, but I hit the same issue for the same video via the CLI.
Is it possible that the format returned has changed? That's what I gathered from reading the comments on a nodeJS library encountering similar issues. The error reads like misformatted JSON, but I can't figure out what the change needs to be in the SDK itself without further telemetry, like seeing the payload being parsed.
IMHO it is the PO token stuff mentioned in #414, particularly via Enrique's thread ending here: https://github.com/jdepoix/youtube-transcript-api/issues/414#issuecomment-2949257318
I have CI tests, and these went from failing once every couple of weeks to once a day starting last week-ish, and then to all the time over the last week.
Unfortunately the PO token stuff isn't simple or easy -- until now, this approach worked very well for users being able to get transcripts on device.
There's a bunch of prior art for using PO tokens in various YouTube clients that I've found on GitHub. But I burned a lot of time on it and eventually produced something that looked like a PO token, yet it still didn't work. And it's not much fun trying to figure out why; it's not like you're getting helpful error messages :|
If anyone happens upon a known-good implementation for getting transcripts with a PO token, other than yt-dlp, I'd be grateful for a ping -- there's too much indirection in the yt-dlp code for me to follow it fully yet, because technically the PO token generator feeds a plugin that feeds yt-dlp, if I understand correctly.
> Is it possible that the format returned has changed? That's what I gathered from reading the comments on a nodeJS library encountering similar issues. The error reads like misformatted JSON, but I can't figure out what the change needs to be in the SDK itself without further telemetry, like seeing the payload being parsed.
Right, it seems like they fixed it without any PO token stuff?
Small update from me.
After 8 hours since the last run, still the same error. I bought a proxy server and set up the proxy API, and now I get a 429 Client Error.
I'm reading the source code now. (I hope I'll find something interesting.)
I wrote a Python script to get transcripts. For the first few videos there was no error, but after a few different videos it started failing, so I retried in a while loop with a sleep. Now, after debugging the script, I can't get any response at all.
If the issue still persists with the cookie file, then the format of the webpage has possibly changed.
fetching timedtext universally requires POT now: https://github.com/yt-dlp/yt-dlp/wiki/PO-Token-Guide#introduction
related: https://github.com/yt-dlp/yt-dlp/issues/13075
Hi all!
Just letting you know that I am aware of the issue and investigating possible solutions!
As others have noted, this is definitely caused by YouTube increasingly enforcing the use of PO tokens. I am still investigating whether there are ways to fetch timedtext URLs that won't require a PO token. I am also investigating how we could generate PO tokens, but this is a non-trivial process that seems to be subject to frequent change, therefore I would prefer to find a way to avoid having to maintain a PO token builder (or relying on a dependency for it)!
If you have any ideas on possible solutions or have looked into reverse-engineering the PO token generation, feel free to jump into the discussion and share your knowledge! 🙂
But please, for the time being, refrain from adding "Same problem here" comments, as this clutters the discussion without adding anything towards finding a solution 😉 (I will delete such comments to allow for a more focused discussion.)
As someone who just had this today:
YouTube is returning an empty 200 for the API call. I confirmed this by running curl 'https://www.youtube.com/api/timedtext?v=t_LvB6...DA&key=yt8&kind=asr&lang=en' from multiple IPs.
As similar things happen when your IP is blocked, I would suggest some error checking as a part of the package:
- Empty reply
- Anything that starts with <!DOCTYPE html>
- A <2KB reply that cannot be automatically parsed (this is likely in the case of a text/HTML warning from YT)
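The checks suggested above could look roughly like this. This is a sketch, not the library's actual code; the 2 KB threshold is simply the heuristic from the comment, and the category names are made up for illustration.

```python
def classify_timedtext_response(body: bytes) -> str:
    """Heuristically classify a /api/timedtext response body."""
    text = body.decode("utf-8", errors="replace").strip()
    if not text:
        return "empty"        # empty 200 -- likely blocked / PO token required
    if text.lower().startswith("<!doctype html"):
        return "html"         # an HTML warning page instead of captions
    if len(body) < 2048 and not text.startswith(("<", "{")):
        return "unparseable"  # short reply that is neither XML nor JSON
    return "ok"
```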
@Joshfindit that's only because you are missing a lot of vital keys in the timedText json response, most notably POT.
@lucyknada good point. I was covering the request that youtube_transcript_api currently uses. My main goal was to say "this sort of thing will keep happening as time goes on. It would be helpful to wrap the code in checks and error messages so that users can clearly understand that these types of issues normally mean YouTube is cracking down again."
According to the PO Token guide from yt-dlp, which seems to have a few things figured out already, the PO token is NOT required for some clients, like TVs or Android VR. Maybe that's a good starting point. It's definitely a PO token problem, and the solution is not easy.
Weirdly, SearchAPI, this project's sponsor, still has its service working like a charm. Is it just a question of proxy quality?
I am running into the same issue. Has anyone figured out a solution or a workaround?
Going via the InnerTube API works fine, but it is also strict when it comes to proxies.
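For reference, an InnerTube player call is just a JSON POST in which the caller claims to be a particular client. A sketch of building such a request follows; the client name and version below are illustrative placeholders, not known-working values (the real ones change over time -- see yt-dlp's extractor for current ones).

```python
import json

INNERTUBE_PLAYER_URL = "https://www.youtube.com/youtubei/v1/player"

def build_player_request(video_id, client_name="ANDROID_VR",
                         client_version="1.60.19"):
    """Build the (url, JSON body) for an InnerTube /player call.

    client_name / client_version are hypothetical placeholder values.
    """
    payload = {
        "videoId": video_id,
        "context": {
            "client": {
                "clientName": client_name,
                "clientVersion": client_version,
            }
        },
    }
    return INNERTUBE_PLAYER_URL, json.dumps(payload).encode("utf-8")
```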
As a follow-up to my comment in #414, yt-dlp is currently able to retrieve captions without a PO token. It does so using a series of HTTP calls that involve mimicking different clients (tv and ios in the example below). I incorporated yt-dlp into my app using their support for embedding it in a Python app. When using the "verbose": True and "debug_printtraffic": True params to YoutubeDL, you can see how it retrieves the captions through that series of HTTP calls. Below is an example:
[youtube] 79jdKfRUqw0: Downloading webpage (GET /watch?v=)
[youtube] 79jdKfRUqw0: Downloading tv client config (GET /tv)
[debug] Loading youtube-sts.612f74a3-main from cache
[youtube] 79jdKfRUqw0: Downloading tv player API JSON (POST /youtubei/v1/player)
[youtube] 79jdKfRUqw0: Downloading ios player API JSON (POST /youtubei/v1/player)
[debug] Loading youtube-nsig.612f74a3-main from cache
It is able to do this because Google has not yet rolled out PO tokens to these clients, so this may break in the future too, but for now it is working using their latest nightly build.
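To reproduce that embedded setup, the options dict might look like the sketch below. The option keys are real yt-dlp option names; actually passing the dict to YoutubeDL and downloading (shown in the usage note) needs network access, so that part is untested here.

```python
def caption_opts(langs):
    """Options for YoutubeDL(opts) to fetch captions only, with debug output."""
    return {
        "skip_download": True,        # no video/audio download
        "writesubtitles": True,       # manually created subtitles, if any
        "writeautomaticsub": True,    # auto-generated (ASR) subtitles
        "subtitleslangs": list(langs),
        "verbose": True,              # log client selection (tv, ios, ...)
        "debug_printtraffic": True,   # dump each HTTP request/response
    }
```

Then, roughly: `from yt_dlp import YoutubeDL; YoutubeDL(caption_opts(["en"])).download(["https://www.youtube.com/watch?v=79jdKfRUqw0"])`.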
Temporary Workaround for youtube-transcript-api Using yt-dlp
I faced the same issue and can confirm that using yt-dlp is currently a reliable workaround to fetch captions, even auto-generated ones.
To keep things simple, I used yt-dlp from the command line to download the .vtt subtitle files and then parsed them using a custom Python script.
Here’s what worked for me:
yt-dlp --write-auto-sub --sub-lang en --skip-download --convert-subs vtt "https://www.youtube.com/watch?v=VIDEO_ID"
This downloads the subtitles in .vtt format. I then used Python to strip the tags and merge the lines into a clean transcript string or JSON.
While this isn't a drop-in replacement for youtube-transcript-api, it can be a practical temporary fix until the PO token issue is resolved or a more robust patch is integrated into the library.
Happy to share my script if it helps anyone.
EDIT: Thanks to @grigio for pointing out '--write-auto-sub'
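A minimal sketch of the kind of .vtt cleanup described above. It assumes the common layout of yt-dlp's output (header lines, cue timings, inline <c> and timestamp tags); auto-generated subtitle files often repeat lines across cues, hence the dedup step.

```python
import re

def vtt_to_text(vtt: str) -> str:
    """Strip WEBVTT headers, cue timings, and inline tags; merge into one string."""
    out = []
    for line in vtt.splitlines():
        line = line.strip()
        if (not line
                or line.startswith(("WEBVTT", "Kind:", "Language:", "NOTE"))
                or "-->" in line          # cue timing lines
                or line.isdigit()):       # optional cue numbers
            continue
        line = re.sub(r"<[^>]+>", "", line).strip()  # drop <c>, <00:00:01.000> tags
        if line and (not out or out[-1] != line):    # auto-subs repeat lines
            out.append(line)
    return " ".join(out)
```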
When I didn't have time to run scripts, I simply copied subtitles to the clipboard directly from the browser via a bookmarklet, which also stripped the timestamps from the subtitles:
javascript: (async function() { try { let getSubs = async (langCode = 'en') => { let response = await fetch(window.location.href); let text = await response.text(); let ytData = text.split('ytInitialPlayerResponse = ')[1]?.split(';var')[0]; if (!ytData) throw new Error('Subtitles not found!'); let ct = JSON.parse(ytData).captions?.playerCaptionsTracklistRenderer?.captionTracks; if (!ct) throw new Error('No subtitles available for this video!'); let findCaptionUrl = x => ct.find(y => y.vssId.indexOf(x) === 0)?.baseUrl; let firstChoice = findCaptionUrl("." + langCode); let url = firstChoice ? firstChoice + "&fmt=json3" : (findCaptionUrl(".") || findCaptionUrl("a." + langCode) || ct[0].baseUrl) + "&fmt=json3&tlang=" + langCode; let subsResponse = await fetch(url); let subsData = await subsResponse.json(); return subsData.events.map(x => ({ ...x, text: x.segs?.map(x => x.utf8)?.join(" ")?.replace(/\n/g, ' ')?.replace(/♪|'|"|\.{2,}|\<[\s\S]*?\>|\{[\s\S]*?\}|\[[\s\S]*?\]/g, '')?.trim() || '' })) }; let copyToClipboard = async langCode => { const subs = await getSubs(langCode); const text = subs.map(x => x.text).join('\n').replace(/\n{2,}/g, '\n'); await navigator.clipboard.writeText(text) }; await copyToClipboard('en') } catch (error) { alert(`Error: ${error.message}`) } })();
Yesterday this also stopped working and now gives an empty JSON response. Does anyone have any ideas on how to make this work in the browser again, without yt-dlp?
@alt-claymore what's the advantage of using youtube-transcript-api over yt-dlp?
> @alt-claymore what's the advantage of using youtube-transcript-api over yt-dlp?
yt-dlp takes more time. Sometimes, when watching a video, you just need to quickly get the subtitles to the clipboard with one click, and that worked very conveniently through the browser JS.
@alt-claymore https://github.com/Kakulukian/youtube-transcript/pull/46 This hasn't been committed to the main branch yet, but it worked well for me.
@naganandana-n your command doesn't work here:
yt-dlp --write-sub --sub-lang en --skip-download --convert-subs vtt https://www.youtube.com/watch\?v\=REbTO_HhdLg
[youtube] Extracting URL: https://www.youtube.com/watch?v=REbTO_HhdLg
[youtube] REbTO_HhdLg: Downloading webpage
[youtube] REbTO_HhdLg: Downloading tv client config
[youtube] REbTO_HhdLg: Downloading tv player API JSON
[youtube] REbTO_HhdLg: Downloading ios player API JSON
[youtube] REbTO_HhdLg: Downloading m3u8 information
[info] REbTO_HhdLg: Downloading 1 format(s): 401+251
[info] There are no subtitles for the requested languages
[SubtitlesConvertor] There aren't any subtitles to convert
But this works:
yt-dlp --write-auto-subs --sub-lang en --skip-download --convert-subs vtt https://www.youtube.com/watch\?v\=REbTO_HhdLg
[youtube] Extracting URL: https://www.youtube.com/watch?v=REbTO_HhdLg
[youtube] REbTO_HhdLg: Downloading webpage
[youtube] REbTO_HhdLg: Downloading tv client config
[youtube] REbTO_HhdLg: Downloading tv player API JSON
[youtube] REbTO_HhdLg: Downloading ios player API JSON
[youtube] REbTO_HhdLg: Downloading m3u8 information
[info] REbTO_HhdLg: Downloading subtitles: en
[info] REbTO_HhdLg: Downloading 1 format(s): 401+251
[info] Writing video subtitles to: ⚠️SMISHING e VISHING: cosa sono e come proteggersi [REbTO_HhdLg].en.vtt
[download] Destination: ⚠️SMISHING e VISHING: cosa sono e come proteggersi [REbTO_HhdLg].en.vtt
[download] 100% of 210.25KiB in 00:00:00 at 532.07KiB/s
[SubtitlesConvertor] Converting subtitles
[SubtitlesConvertor] Subtitle file for vtt is already in the requested format
> This is currently working for me for version 1.0.3:
> youtube_transcript_api foA4Sl_xlMc --language it --cookies ./cookies.txt --format text
Just to confirm: as of today, this stopped working for me too, even with cookies. Sorry for the noise, but the previous workaround is not working anymore.