crunchy-xml-decoder

Some videos can't be downloaded because they are using HLS (m3u8 file)

Open Berni11 opened this issue 8 years ago • 30 comments

It seems that some videos are using HLS (m3u8 file) instead of rtmp which results in an error when trying to download them.

I had this problem with these two videos: http://www.crunchyroll.com/jojos-bizarre-adventure/episode-13-wheel-of-fortune-652601 http://www.crunchyroll.com/jojos-bizarre-adventure/episode-1-part-1-phantom-blood-653409

Berni11 avatar Aug 06 '16 23:08 Berni11

I think this can only be solved with something like FFmpeg. (Has the best quality when converting from m3u8 to mp4 and mkvmerge should support mp4 too.)

Do you know how to get the m3u8 link from the media_id? I can't find the right function or API link. (I've searched the swfPlayer code, but can't find where the video_id is retrieved.)

Sorry for my bad English.

iCertys avatar Aug 07 '16 02:08 iCertys

Strange: I wanted to look at the code for the HLS streams again, but today the player did not load the HLS plugin (http://static.ak.crunchyroll.com/versioned_assets/OSMFHLSPlugin.89701c06.swf). Instead it used the default rtmp stream. (I found 26 HLS streams yesterday, but none today.) Perhaps Crunchyroll is testing a new HLS streaming server. Maybe a new player without Flash? (HLS server URL: http://serve.cxcdn.net)

iCertys avatar Aug 07 '16 22:08 iCertys

A proof of concept patch is below. I have a lot of local changes in my version, so it might not apply cleanly. It will also need m3u8 from pypi and py-Crypto.

crunchy-xml-decoder-xml.diff.txt

jsonn avatar Aug 19 '16 22:08 jsonn

I'm in the process of rewriting from the ground up, but it seems like the m3u8 urls can be found using the mobile api. Also had luck with bruteforcing video ids.

einstein95 avatar Aug 20 '16 13:08 einstein95

The patch above works. I'm still fine-tuning the exception handling, since the HLS server likes to kill connections whenever it believes the client is not fast enough.
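One common way to cope with dropped connections is to retry with a growing delay. The following is only a rough sketch of that idea, not the code from the patch: `fetch_with_retry`, its parameters, and the choice of exceptions to catch are all assumptions made for illustration.

```python
import time


def fetch_with_retry(fetch, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, waiting longer after each failure.

    fetch is any zero-argument callable that downloads one segment
    (hypothetical; the real patch may structure this differently).
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except (ConnectionError, TimeoutError) as exc:
            last_error = exc
            # No point in sleeping after the final failed attempt.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

Passing `sleep` as a parameter keeps the helper easy to try out without real waiting.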

jsonn avatar Aug 20 '16 13:08 jsonn

Hi einstein95, What do you mean by bruteforcing video ids?

I checked the code of the HLS Flash plugin, and the player sends a POST request to

http://www.crunchyroll.com/xml/?req=RpcApiVideoPlayer_GetStandardConfig&media_id=652601&video_format=103&video_quality=80&auto_play=1&aff=crunchyroll-website&show_pop_out_controls=1&pop_out_disable_message=&click_through=0

Body: current%5Fpage=http%3A%2F%2Fwww%2Ecrunchyroll%2Ecom%2Fjojos%2Dbizarre%2Dadventure%2Fepisode%2D13%2Dwheel%2Dof%2Dfortune%2D652601

Cookie: Host:www.crunchyroll.com', 'Connection: keep-alive', 'Content-Length: 128', 'Origin:http://static.ak.crunchyroll.com', 'X-Requested-With:ShockwaveFlash/22.0.0.209', 'User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML', 'like Gecko) Chrome/52.0.2743.116 Safari/537.36', 'Content-Type:application/x-www-form-urlencoded', 'Accept:/', 'Referer:http://static.ak.crunchyroll.com/versioned_assets/StandardVideoPlayer.f3770232.swf', 'deflate', 'Accept-Language:de-DE,de;q=0.8,en-US;q=0.6,en;q=0.4', 'Cookie:__cfduid=??; __qca=??; _ig=??; _ig_nump=??; _ig_sess=??; _igur=-1; c_locale=deDE; c_visitor=??; position=1; userid=??; c_userid=??; c_userkey=??; c_d=p%3D1; sess_id=??; _ga=GA1.2.1831795990.1459009773; _gat=1; ki_t=??; ki_r=; __ar_v4=??
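As a rough illustration of the request described above, here is how the URL and body could be assembled in Python. This is a sketch under assumptions: `build_config_request` is a made-up helper name, only a subset of the query parameters is included, and the session cookies that the real request needs are left out.

```python
from urllib.parse import urlencode

API_URL = "http://www.crunchyroll.com/xml/"


def build_config_request(media_id, page_url, video_format=103, video_quality=80):
    """Return (url, body) for a GetStandardConfig POST request."""
    query = urlencode({
        "req": "RpcApiVideoPlayer_GetStandardConfig",
        "media_id": media_id,
        "video_format": video_format,
        "video_quality": video_quality,
    })
    # The POST body carries the page the player was embedded in.
    body = urlencode({"current_page": page_url}).encode("ascii")
    return API_URL + "?" + query, body


url, body = build_config_request(
    652601,
    "http://www.crunchyroll.com/jojos-bizarre-adventure/"
    "episode-13-wheel-of-fortune-652601",
)
```

The resulting `url` and `body` can then be handed to whatever HTTP client is in use, together with the cookies.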

This returns an XML document containing the m3u8 link (jaJP.m3u8), and inside that .m3u8 file is the link to the computer stream (stream.m3u8).
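To make the two-level layout concrete, here is a small sketch that pulls the stream links out of a top-level playlist. It assumes the top-level file is a standard HLS master playlist with `#EXT-X-STREAM-INF` entries; `variant_streams` and the sample text are made up for illustration and are not taken from a real Crunchyroll response.

```python
from urllib.parse import urljoin

STREAM_INF = "#EXT-X-STREAM-INF:"


def variant_streams(master_text, master_url):
    """Return (attributes, absolute_uri) pairs listed in a master playlist."""
    variants = []
    pending = None
    for raw in master_text.splitlines():
        line = raw.strip()
        if line.startswith(STREAM_INF):
            # Remember the attributes; the URI follows on the next line.
            pending = line[len(STREAM_INF):]
        elif line and not line.startswith("#") and pending is not None:
            # Relative URIs are resolved against the master playlist URL.
            variants.append((pending, urljoin(master_url, line)))
            pending = None
    return variants


# Hypothetical sample, shaped like a minimal master playlist.
SAMPLE_MASTER = """#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1200000
stream.m3u8
"""
```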

Sorry for my bad English.

iCertys avatar Aug 20 '16 14:08 iCertys

This request also works with both HLS and rtmpe.

The rtmpe stream link is in: <stream_info> rtmpe://cp150757.edgefcs.net/ondemand/?auth=daEaacsaGaobgaIbWdQdWc5bQd6dMcxdNbZ-bxUgqx-dHa-lCLwnqLBDuy&aifp=0009&slist=c20/s/ve2307557/video.mp4 </stream_info>

And hls stream in: <stream_info> http://serve.cxcdn.net/s/v/9vuxan8yjjaks8f/m/8d969e75710f3008f6d529581a5e9dc0/jaJP.m3u8?v=4a6e59e8fe831dcbf91a170acd0092e9&k=aGtnK2VrUnV0K21sZXVIaURjdmNWTzZWRzdzPV97ImEiOiI5MSw2LGphSlAsIiwiYyI6MTQ0MDI2ODkwNiwiZCI6ImNyYW5pbWUiLCJnIjoiWloiLCJoIjoiOXZ1eGFuOHlqamFrczhmIiwibCI6NzIwMCwicCI6IjEiLCJyIjoiYzMwZDgyIiwicyI6MTgwMjA5LCJ0IjoxNDcxMjc4NTc0LCJ2IjozfQ </stream_info>

iCertys avatar Aug 20 '16 14:08 iCertys

<stream_info> <host>rtmpe://cp150757.edgefcs.net/ondemand/?auth=daEaacsaGaobgaIbWdQdWc5bQd6dMcxdNbZ-bxUgqx-dHa-lCLwnqLBDuy&amp;aifp=0009&amp;slist=c20/s/ve2307557/video.mp4</host> </stream_info>

iCertys avatar Aug 20 '16 14:08 iCertys

<stream_info> <host></host> <file>http://serve.cxcdn.net/s/v/9vuxan8yjjaks8f/m/8d969e75710f3008f6d529581a5e9dc0/jaJP.m3u8?v=4a6e59e8fe831dcbf91a170acd0092e9&amp;k=aGtnK2VrUnV0K21sZXVIaURjdmNWTzZWRzdzPV97ImEiOiI5MSw2LGphSlAsIiwiYyI6MTQ0MDI2ODkwNiwiZCI6ImNyYW5pbWUiLCJnIjoiWloiLCJoIjoiOXZ1eGFuOHlqamFrczhmIiwibCI6NzIwMCwicCI6IjEiLCJyIjoiYzMwZDgyIiwicyI6MTgwMjA5LCJ0IjoxNDcxMjc4NTc0LCJ2IjozfQ</file>

iCertys avatar Aug 20 '16 14:08 iCertys

Sorry, I forgot to use "Insert code".

iCertys avatar Aug 20 '16 14:08 iCertys

What do you mean by bruteforcing video ids?

for i in {710000..715151}; do curl -s 'http://www.crunchyroll.com/xml/?req=RpcApiVideoPlayer_GetStandardConfig&media_id='$i --data 'current%5Fpage=http%3A%2F%2Fwww%2Ecrunchyroll%2Ecom' -H 'Cookie: sess_id=; c_userid=; c_userkey=' | grep -P '<media_type>(\d)</media_type>' && echo $i; done
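For anyone who prefers Python over the shell loop, the filtering step might look roughly like this. It is only a sketch: `media_type_of` and `scan_ids` are made-up names, and `fetch_config` is a placeholder for whatever function actually performs the POST request and returns the XML text.

```python
import re

# Same pattern as the grep above: a single digit inside <media_type>.
MEDIA_TYPE_RE = re.compile(r"<media_type>(\d)</media_type>")


def media_type_of(xml_text):
    """Return the media_type digit as an int, or None if it is absent."""
    match = MEDIA_TYPE_RE.search(xml_text)
    return int(match.group(1)) if match else None


def scan_ids(ids, fetch_config):
    """Yield (media_id, media_type) for every id whose config has a media_type.

    fetch_config is a hypothetical callable: media_id -> XML text.
    """
    for media_id in ids:
        media_type = media_type_of(fetch_config(media_id))
        if media_type is not None:
            yield media_id, media_type
```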

einstein95 avatar Aug 20 '16 14:08 einstein95

The current meta data parsing seems to work perfectly fine? You don't get a host key, but a file entry. If you follow the file link, you arrive at the m3u8 list.
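A minimal sketch of that distinction, using only the standard library: if `host` is filled in, treat it as an rtmp stream, otherwise fall back to `file`. The function name `stream_location` and the sample XML strings are assumptions for illustration; the real config contains many more fields, and the actual script may parse it differently.

```python
import xml.etree.ElementTree as ET


def stream_location(config_xml):
    """Return ("rtmp", host) or ("hls", file_url) from a config snippet."""
    root = ET.fromstring(config_xml)
    info = root if root.tag == "stream_info" else root.find(".//stream_info")
    if info is None:
        raise ValueError("no <stream_info> element found")
    host = (info.findtext("host") or "").strip()
    file_url = (info.findtext("file") or "").strip()
    if host:
        return "rtmp", host
    if file_url:
        # HLS case: host is empty and file points at the m3u8 list.
        return "hls", file_url
    raise ValueError("neither <host> nor <file> is set")


# Hypothetical samples shaped like the snippets quoted in this thread.
HLS_SAMPLE = ("<stream_info><host></host>"
              "<file>http://example.invalid/jaJP.m3u8?v=1&amp;k=2</file></stream_info>")
RTMP_SAMPLE = "<stream_info><host>rtmpe://example.invalid/ondemand/</host></stream_info>"
```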

jsonn avatar Aug 20 '16 14:08 jsonn

Or do you not start from the video URL?

jsonn avatar Aug 20 '16 14:08 jsonn

What does the <media_type> information mean? For me it is 1 for both HLS and rtmpe.

Can you give me a link where media_type is something other than 1?

iCertys avatar Aug 20 '16 14:08 iCertys

@iCertys http://www.crunchyroll.com/xml/?req=RpcApiVideoPlayer_GetStandardConfig&media_id=710001

einstein95 avatar Aug 20 '16 14:08 einstein95

Yes, but when I check id=710001 there is no stream behind it, and if I open it in the browser it redirects me to 680655. In which country did you test this id?

iCertys avatar Aug 20 '16 15:08 iCertys

What do you mean by "there is no stream behind this"?

einstein95 avatar Aug 20 '16 15:08 einstein95

If I send a POST request for this ID, I get a 500 Internal Server Error. If I open the ID in the browser, I get redirected to 680655. And if I open the API link without a POST request, I get the link http://www.crunchyroll.com/media-710001/-unbekannt

"-unbekannt" is German and means "unknown", but I am using a US proxy from California. (I don't know why it is written in German.)

That is why I asked in which country you tested this, because not all streams are accessible in every country.

I tested from California and Germany.

iCertys avatar Aug 20 '16 16:08 iCertys

If you get redirected to 680655 and that video loads, then it should work.

einstein95 avatar Aug 20 '16 16:08 einstein95

Yes, but I'm searching for a video where media_type is not 1, and on the redirected video it is 1. That's my problem: I can't find a video where it is not 1 ;(

iCertys avatar Aug 20 '16 16:08 iCertys

If you use the media ID 710001, the resulting GetStandardConfig gives media_type 4.

einstein95 avatar Aug 20 '16 16:08 einstein95

OK, interesting, but I still get error 500. Anyway, I do not want to waste all your time. Thank you.

iCertys avatar Aug 20 '16 16:08 iCertys

It works:

sin titulo 8

Add:

sin titulo 9

I have no idea why it works.

rs3mk avatar Aug 20 '16 19:08 rs3mk

I don't know why, but it works.

rcyclope avatar Aug 20 '16 22:08 rcyclope

To begin with, the script stops because the host variable is set to None: for HLS, the host tag in the XML file is empty. Calling host.string on it then raises an AttributeError, since None has no attributes. Because the script already handles this kind of exception, execution continues with the code right after except AttributeError:. You don't actually have to print host.string; merely evaluating it is enough, and any other attribute access would work too.
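The mechanism can be shown in a few lines. This is a simplified stand-in rather than the script's actual code: `host` and `file_url` play the role of the values read from the `host` and `file` tags, and both function names are invented. The second function shows what an explicit check could look like instead of relying on the exception.

```python
def pick_stream_accidental(host, file_url):
    """Mimic the workaround: touching an attribute of None raises."""
    try:
        host.strip()  # AttributeError when host is None
    except AttributeError:
        # Control lands here for HLS, where the host tag is empty.
        return ("hls", file_url)
    return ("rtmp", host)


def pick_stream_explicit(host, file_url):
    """Same outcome, but the None case is tested directly."""
    if host is None:
        return ("hls", file_url)
    return ("rtmp", host)
```

Both versions return the same result; the explicit check simply makes the intent visible to the next reader.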

ObiWanTwo avatar Aug 21 '16 14:08 ObiWanTwo

Just a suggestion, but would you be able to base your new code on the one in youtube-dl? This could help with videos, but not the manga. While restructuring the code, it would be nice to integrate the pip installer as well.

insanedude63 avatar Sep 02 '16 20:09 insanedude63

@insanedude63 The code should be pretty much self-contained. It only cares about the file property from the video meta data.

jsonn avatar Sep 02 '16 20:09 jsonn

Please test #78.

jsonn avatar Sep 15 '16 16:09 jsonn

I added the line print host.string and got this error: (image)

VladimiPutin avatar Sep 18 '16 02:09 VladimiPutin

@rs3mk Thank you, thank you, that fixed it. I'm actually using an old version of the toolkit that I modified for my own purposes to get the episode title and description, auto login, a URL download queue, etc.

So I did not want to have to start over with the new version. Your fix fixed it, thank you.

@adriannx Are the tabs lined up? try:\n\thost = xmlconfig.find('host').string\n\tprint host.string Are they spaces instead of tabs? They must all be one or the other.

Pikanet128 avatar Sep 23 '16 06:09 Pikanet128