crunchy-xml-decoder
Some videos can't be downloaded because they are using HLS (m3u8 file)
It seems that some videos are using HLS (m3u8 file) instead of RTMP, which results in an error when trying to download them.
I had this problem with these two videos: http://www.crunchyroll.com/jojos-bizarre-adventure/episode-13-wheel-of-fortune-652601 http://www.crunchyroll.com/jojos-bizarre-adventure/episode-1-part-1-phantom-blood-653409
I think this can only be solved with something like FFmpeg. (It gives the best quality when converting from m3u8 to mp4, and mkvmerge should support mp4 too.)
Do you know how to get the m3u8 link from the media_id? I can't find the right function or API link. (I've searched the swfPlayer code, but can't find where the video_id is retrieved.)
Sorry for my bad English.
Strange. I wanted to look at the code for the HLS streams again, but today the player did not load the HLS plugin http://static.ak.crunchyroll.com/versioned_assets/OSMFHLSPlugin.89701c06.swf and used the default RTMP stream instead. (I found 26 HLS streams yesterday, but none today.) Perhaps Crunchyroll is testing a new HLS streaming server. Maybe a new player without Flash? (HLS server URL: http://serve.cxcdn.net)
A proof of concept patch is below. I have a lot of local changes in my version, so it might not apply cleanly. It will also need m3u8 from pypi and py-Crypto.
I'm in the process of rewriting from the ground up, but it seems like the m3u8 urls can be found using the mobile api. Also had luck with bruteforcing video ids.
The patch above works. I'm still fine-tuning the exception handling, since the HLS server likes to kill connections whenever it believes the client is not fast enough.
Hi einstein95, What do you mean by bruteforcing video ids?
I checked the code of the HLS Flash plugin, and the player sends a POST request to
http://www.crunchyroll.com/xml/?req=RpcApiVideoPlayer_GetStandardConfig&media_id=652601&video_format=103&video_quality=80&auto_play=1&aff=crunchyroll-website&show_pop_out_controls=1&pop_out_disable_message=&click_through=0
Body: current%5Fpage=http%3A%2F%2Fwww%2Ecrunchyroll%2Ecom%2Fjojos%2Dbizarre%2Dadventure%2Fepisode%2D13%2Dwheel%2Dof%2Dfortune%2D652601
Headers:
Host: www.crunchyroll.com
Connection: keep-alive
Content-Length: 128
Origin: http://static.ak.crunchyroll.com
X-Requested-With: ShockwaveFlash/22.0.0.209
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36
Content-Type: application/x-www-form-urlencoded
Accept: */*
Referer: http://static.ak.crunchyroll.com/versioned_assets/StandardVideoPlayer.f3770232.swf
Accept-Encoding: deflate
Accept-Language: de-DE,de;q=0.8,en-US;q=0.6,en;q=0.4
Cookie: __cfduid=??; __qca=??; _ig=??; _ig_nump=??; _ig_sess=??; _igur=-1; c_locale=deDE; c_visitor=??; position=1; userid=??; c_userid=??; c_userkey=??; c_d=p%3D1; sess_id=??; _ga=GA1.2.1831795990.1459009773; _gat=1; ki_t=??; ki_r=; __ar_v4=??
which returns an XML document with the m3u8 link in it (jaJP.m3u8). Inside that .m3u8 is the link to the computer stream (stream.m3u8).
Sorry for my bad english
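To make the jaJP.m3u8 to stream.m3u8 step concrete, here is a minimal sketch of pulling the variant stream URLs out of an HLS master playlist. It assumes jaJP.m3u8 is a standard master playlist with `#EXT-X-STREAM-INF` entries; the function name `variant_urls`, the sample playlist and the example.invalid URLs are made up for illustration and are not taken from Crunchyroll or from the patch.

```python
from urllib.parse import urljoin


def variant_urls(master_text, master_url):
    """Return (bandwidth, absolute_url) pairs from an HLS master playlist.

    Only the BANDWIDTH attribute is read. Quoted attributes that contain
    commas (e.g. CODECS) are split apart by the naive parser and ignored.
    """
    variants = []
    bandwidth = None
    expect_uri = False
    for raw in master_text.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-STREAM-INF:"):
            bandwidth = None
            for attr in line.split(":", 1)[1].split(","):
                key, _, value = attr.partition("=")
                if key.strip() == "BANDWIDTH" and value.strip().isdigit():
                    bandwidth = int(value)
            expect_uri = True
        elif expect_uri and line and not line.startswith("#"):
            # The first non-comment line after the tag is the variant URI;
            # relative URIs are resolved against the master playlist URL.
            variants.append((bandwidth, urljoin(master_url, line)))
            expect_uri = False
    return variants


# Hypothetical sample data, not a real Crunchyroll playlist.
SAMPLE_MASTER = """#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1200000,RESOLUTION=848x480
480/stream.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=2500000,RESOLUTION=1280x720
720/stream.m3u8
"""
SAMPLE_URL = "http://example.invalid/m/abc/jaJP.m3u8?v=1"

variants = variant_urls(SAMPLE_MASTER, SAMPLE_URL)
best = max(variants, key=lambda v: v[0] or 0)  # highest advertised bandwidth
print(best[1])
```

The chosen URL could then be handed to FFmpeg or to a segment downloader; that part is left out here.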
This request also works for both HLS and RTMPE. In the response, the RTMPE stream link is in the <host> element of <stream_info>, and the HLS stream link is in its <file> element:
<stream_info> <host>rtmpe://cp150757.edgefcs.net/ondemand/?auth=daEaacsaGaobgaIbWdQdWc5bQd6dMcxdNbZ-bxUgqx-dHa-lCLwnqLBDuy&aifp=0009&slist=c20/s/ve2307557/video.mp4</host> </stream_info>
<stream_info> <host></host> <file>http://serve.cxcdn.net/s/v/9vuxan8yjjaks8f/m/8d969e75710f3008f6d529581a5e9dc0/jaJP.m3u8?v=4a6e59e8fe831dcbf91a170acd0092e9&k=aGtnK2VrUnV0K21sZXVIaURjdmNWTzZWRzdzPV97ImEiOiI5MSw2LGphSlAsIiwiYyI6MTQ0MDI2ODkwNiwiZCI6ImNyYW5pbWUiLCJnIjoiWloiLCJoIjoiOXZ1eGFuOHlqamFrczhmIiwibCI6NzIwMCwicCI6IjEiLCJyIjoiYzMwZDgyIiwicyI6MTgwMjA5LCJ0IjoxNDcxMjc4NTc0LCJ2IjozfQ</file>
Sorry I forgot "Insert code"
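A rough sketch of how a script could tell the two responses apart, based only on the two fragments above: a non-empty `<host>` means RTMP(E), an empty `<host>` with a `<file>` entry means HLS. The function name `pick_stream` and the sample XML are hypothetical, the URLs are shortened placeholders, and a real response contains more elements than shown here.

```python
import xml.etree.ElementTree as ET


def pick_stream(config_xml):
    """Return ("rtmp", url) or ("hls", url) from a GetStandardConfig-style reply."""
    root = ET.fromstring(config_xml)
    info = root if root.tag == "stream_info" else root.find(".//stream_info")
    if info is None:
        raise ValueError("no <stream_info> element found")
    # findtext returns None for a missing element and "" for an empty one.
    host = (info.findtext("host") or "").strip()
    file_url = (info.findtext("file") or "").strip()
    if host:
        return ("rtmp", host)
    if file_url:
        return ("hls", file_url)
    raise ValueError("<stream_info> has neither <host> nor <file> content")


# Hypothetical, shortened samples. Note that "&" must be escaped as "&amp;"
# for the XML parser to accept it.
RTMP_SAMPLE = "<stream_info><host>rtmpe://example.invalid/ondemand/</host></stream_info>"
HLS_SAMPLE = (
    "<stream_info><host></host>"
    "<file>http://example.invalid/jaJP.m3u8?v=1&amp;k=2</file></stream_info>"
)

print(pick_stream(RTMP_SAMPLE))
print(pick_stream(HLS_SAMPLE))
```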
What do you mean by bruteforcing video ids?
for i in {710000..715151}; do curl -s 'http://www.crunchyroll.com/xml/?req=RpcApiVideoPlayer_GetStandardConfig&media_id='$i --data 'current%5Fpage=http%3A%2F%2Fwww%2Ecrunchyroll%2Ecom' -H 'Cookie: sess_id=; c_userid=; c_userkey=' | grep -P '<media_type>(\d)</media_type>' && echo $i; done
The current meta data parsing seems to work perfectly fine? You don't get a host key, but a file entry. If you follow the file link, you arrive at the m3u8 list.
Or do you not start from the video URL?
What does the <media_type> information mean? For me it is 1 on both HLS and RTMPE.
Can you give me a link where media_type is something other than 1?
@iCertys http://www.crunchyroll.com/xml/?req=RpcApiVideoPlayer_GetStandardConfig&media_id=710001
Yes, but when I check id=710001 there is no stream behind it, and if I check it in the browser it redirects me to 680655. In which country did you test this ID?
What do you mean by "there is no stream behind this"?
If I send a POST request for this ID, I get a 500 Internal Server Error. If I open the ID in the browser, I get redirected to 680655. And if I open the API link without a POST request, I get the link http://www.crunchyroll.com/media-710001/-unbekannt
-unbekannt is German and means "unknown", but I use a US proxy from California. (I don't know why it is written in German.)
That is why I asked in which country you tested this, because not all streams are accessible in every country.
I tested California and Germany.
If you get redirected to 680655 and that video loads, then it should work.
Yes, but I'm searching for a video where the media_type is not 1, and on the redirected video it is 1. That's my problem: I can't find a video where it is not 1 ;(
If you use the media ID 710001, the resulting GetStandardConfig gives media_type 4.
OK, interesting, but I still get error 500. Anyway, I do not want to waste all your time. Thank you.
Work
add
I have no idea why it works.
i don't know why but it works
To begin with, the script stops running because the host variable is set to None: the host tag is actually empty in the XML file when the stream is HLS. By calling host.string you force the script to raise an AttributeError, because you can't read an attribute of None, which is logical. Since the script already handles this kind of exception, it then executes the code that comes right after except AttributeError:.
You don't actually have to print it; you just have to access it. Anything else that touches an attribute of host would work too.
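The behaviour described above can be reproduced in a few lines. This is only an illustration, not the project's code: `resolve_stream` and `resolve_stream_explicit` are hypothetical names, and `host.strip()` stands in for any attribute access on `host` (the thread uses `host.string`).

```python
def resolve_stream(host, file_url):
    """host is a string for RTMP, or None when the <host> tag is empty (HLS)."""
    try:
        # Any attribute access on None raises AttributeError, which sends
        # execution into the except branch below.
        host.strip()
        return ("rtmp", host)
    except AttributeError:
        return ("hls", file_url)


def resolve_stream_explicit(host, file_url):
    """Same decision with an explicit None check instead of a forced exception."""
    if host is None:
        return ("hls", file_url)
    return ("rtmp", host)


print(resolve_stream(None, "http://example.invalid/jaJP.m3u8"))
print(resolve_stream("rtmpe://example.invalid/ondemand/", None))
```

The explicit check says the same thing without relying on an exception as a side effect, which is probably easier to follow for the next reader.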
Just a suggestion but would you be able to base your new code off of the one in youtube-dl? here This could help with videos but not the manga. While at restructuring the code it would be nice to integrate the pip installer as well.
@insanedude63 The code should be pretty much self-contained. It only cares about the file property from the video meta data.
Please test #78.
I added the line print host.string and got this error:
@rs3mk Thank you, thank you, that fixed it. I'm actually using an old version of the toolkit that I modified for my own purposes to get the episode title and description, auto login, a URL download queue, etc.,
so I did not want to have to start over with the new version. Your fix fixed it, thank you.
@adriannx
Are the tabs lined up?
try:\n\thost = xmlconfig.find('host').string\n\tprint host.string
Are they spaces instead of tabs?
They must all be one or the other.