
HTTP 403 forbidden

Open munipr opened this issue 4 years ago • 32 comments

I have been getting the following error for the past 3 days. I have the latest edx-dl and youtube-dl installed in an environment with Python 3.7.

edx_dl version 0.1.13
Password:
Building initial headers for future requests.
Getting initial CSRF token.
Found CSRF token.
Logging into Open edX site: https://courses.edx.org/login_ajax
Extracting course information from dashboard.
Traceback (most recent call last):
  File "c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\Scripts\edx-dl.exe\__main__.py", line 9, in <module>
  File "c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\site-packages\edx_dl\edx_dl.py", line 1023, in main
    for selected_course in selected_courses}
  File "c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\site-packages\edx_dl\edx_dl.py", line 1023, in <dictcomp>
    for selected_course in selected_courses}
  File "c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\site-packages\edx_dl\edx_dl.py", line 184, in get_available_sections
    page = get_page_contents(url, headers)
  File "c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\site-packages\edx_dl\utils.py", line 58, in get_page_contents
    result = urlopen(Request(url, None, headers))
  File "c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\urllib\request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\urllib\request.py", line 569, in error
    return self._call_chain(*args)
  File "c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\urllib\request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
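
For context, the last line of the traceback is a `urllib.error.HTTPError` carrying status code 403. A minimal sketch of how such an error can be inspected, assuming nothing about edx-dl's internals: `explain_http_error` is a made-up helper, and the error object is built by hand so the snippet runs without touching the network.

```python
import io
from email.message import Message
from urllib.error import HTTPError


def explain_http_error(err):
    """Return a short hint for an HTTPError (hypothetical helper, not part of edx-dl)."""
    if err.code == 403:
        return ("403 Forbidden: the server understood the request but refused it; "
                "request headers such as User-Agent are one thing worth checking.")
    return "HTTP error {}: {}".format(err.code, err.reason)


# Construct the error manually instead of calling urlopen().
err = HTTPError("https://courses.edx.org/login_ajax", 403, "Forbidden",
                Message(), io.BytesIO(b""))
print(explain_http_error(err))
```

In real code the same function would be called from an `except HTTPError as err:` block wrapped around `urlopen(...)`.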

munipr avatar Jul 31 '20 11:07 munipr

+1

laurion avatar Jul 31 '20 17:07 laurion

Same issue here

dorianherle avatar Aug 02 '20 09:08 dorianherle

I'm not an author of the tool, but you can fix it by changing line 425 of edx_dl.py, which specifies the User-Agent attribute of the HTTP request header. Change 'User-Agent': 'edX-downloader/0.01', to 'User-Agent': 'Mozilla/5.0', and it will work.
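
A minimal sketch of what that change amounts to at the urllib level. The helper name `build_request` is made up for illustration, and edx-dl's real header-building code may look different; only the `Request(url, None, headers)` call shape is taken from the traceback above.

```python
from urllib.request import Request


def build_request(url, user_agent="Mozilla/5.0"):
    """Build a Request whose User-Agent looks like a browser (hypothetical helper)."""
    headers = {"User-Agent": user_agent}
    return Request(url, None, headers)


req = build_request("https://courses.edx.org/login_ajax")
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

No request is sent here; the snippet only shows that the header ends up on the Request object that `urlopen` would later use.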

THolding avatar Aug 03 '20 07:08 THolding

Cool! It works @THolding

constantiux avatar Aug 03 '20 13:08 constantiux

Hi,

It is the same issue as #628.

Should we do a PR with that fix?

Kind regards

floviolleau avatar Aug 10 '20 20:08 floviolleau

Hi, the following worked for me:

  1. In line 425 of edx_dl.py, change 'User-Agent': 'edX-downloader/0.01', to 'User-Agent': 'Chrome/51.0.2704.103'
  2. Then follow the steps at: #595 (link)

Also, I had opened the course page in the web browser (Chrome). I don't know if these steps have any influence on the process.

Zibetti avatar Aug 12 '20 19:08 Zibetti

I have tried all the changes recommended for line 425 in edx_dl.py and in Parser.Py. Still no luck.

chss avatar Aug 13 '20 09:08 chss

Thank you @THolding, I tried your solution and it partially worked for me as well. But in the end, after downloading two modules, it broke with the message 'returned non-zero exit status 1.' Any helpful hints or fixes? Not that I'm an expert myself, but @munipr, @laurion, @chss, @floviolleau and @totyped, try putting in the name and version of the browser you have your courses open in (in my case 'User-Agent': 'Chrome/84.0.4147.105') and it should be fixed?

DGEs2018 avatar Aug 13 '20 12:08 DGEs2018

+1

  File "c:\users\user\appdata\local\programs\python\python38-32\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\user\appdata\local\programs\python\python38-32\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\user\AppData\Local\Programs\Python\Python38-32\Scripts\edx-dl.exe\__main__.py", line 7, in <module>
  File "c:\users\user\appdata\local\programs\python\python38-32\lib\site-packages\edx_dl\edx_dl.py", line 1020, in main
    all_selections = {selected_course:
  File "c:\users\user\appdata\local\programs\python\python38-32\lib\site-packages\edx_dl\edx_dl.py", line 1021, in <dictcomp>
    get_available_sections(selected_course.url.replace('info', 'course'),
  File "c:\users\user\appdata\local\programs\python\python38-32\lib\site-packages\edx_dl\edx_dl.py", line 184, in get_available_sections
    page = get_page_contents(url, headers)
  File "c:\users\user\appdata\local\programs\python\python38-32\lib\site-packages\edx_dl\utils.py", line 58, in get_page_contents
    result = urlopen(Request(url, None, headers))
  File "c:\users\user\appdata\local\programs\python\python38-32\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "c:\users\user\appdata\local\programs\python\python38-32\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "c:\users\user\appdata\local\programs\python\python38-32\lib\urllib\request.py", line 640, in http_response
    response = self.parent.error(
  File "c:\users\user\appdata\local\programs\python\python38-32\lib\urllib\request.py", line 569, in error
    return self._call_chain(*args)
  File "c:\users\user\appdata\local\programs\python\python38-32\lib\urllib\request.py", line 502, in _call_chain
    result = func(*args)
  File "c:\users\user\appdata\local\programs\python\python38-32\lib\urllib\request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

drdata2018 avatar Aug 28 '20 13:08 drdata2018

Thanks, it works.

drdata2018 avatar Aug 28 '20 20:08 drdata2018

> I'm not an author of the tool, but you can fix it by changing line 425 of edx_dl.py, which specifies the User-Agent attribute of the HTTP request header. Change 'User-Agent': 'edX-downloader/0.01', to 'User-Agent': 'Mozilla/5.0', and it will work.

Hi, I am trying to run edx_dl.py to do as you mentioned, but when I run edx_dl.py from the command prompt it says this:

Traceback (most recent call last):
  File "edx_dl.py", line 33, in <module>
    from ._version import __version__
ImportError: attempted relative import with no known parent package

prayasbat avatar Aug 30 '20 09:08 prayasbat

I just looked up the reference and came up with this. The error might have to do with the version of Python you're using. The README.md here says:

> We strongly recommend that, if you don't already have a Python interpreter installed, that you install Python >= 3.6, if possible, since it is better in general.
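
Another thing worth ruling out, independent of the Python version: this ImportError typically appears when a file that uses relative imports is run directly as a script instead of as part of its package. Below is a self-contained sketch that reproduces both cases in a throwaway package. The names `demo_pkg` and `tool.py` are made up for the demo and have nothing to do with edx-dl's actual layout.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def demo_relative_import():
    """Run the same file as a script and as a package module; return both results."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "_version.py").write_text("__version__ = '0.0.0'\n")
        (pkg / "tool.py").write_text(
            "from ._version import __version__\nprint(__version__)\n"
        )
        # Case 1: run the file directly, so it has no parent package.
        as_script = subprocess.run(
            [sys.executable, str(pkg / "tool.py")],
            capture_output=True, text=True,
        )
        # Case 2: run it as a module of the package, from the parent directory.
        as_module = subprocess.run(
            [sys.executable, "-m", "demo_pkg.tool"],
            capture_output=True, text=True, cwd=tmp,
        )
        return as_script, as_module


as_script, as_module = demo_relative_import()
print("script failed:", as_script.returncode != 0)
print("module output:", as_module.stdout.strip())
```

If that is what happened here, editing edx_dl.py in place and then running the usual edx-dl command, rather than running the file itself, should avoid this particular error.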

DGEs2018 avatar Aug 30 '20 13:08 DGEs2018

> I'm not an author of the tool, but you can fix it by changing line 425 of edx_dl.py, which specifies the User-Agent attribute of the HTTP request header. Change 'User-Agent': 'edX-downloader/0.01', to 'User-Agent': 'Mozilla/5.0', and it will work.

THANK you so much. It's working fine now😃

sasidhar22 avatar Sep 02 '20 05:09 sasidhar22

Any hope for quiz or assignments?

drdata2018 avatar Sep 02 '20 12:09 drdata2018

Same issue here! Changing 'User-Agent': 'edX-downloader/0.01' is not working. Python 3.8.5

Extracting course information from dashboard.
Traceback (most recent call last):
  File "c:\python38-32\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\python38-32\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Python38-32\Scripts\edx-dl.exe\__main__.py", line 7, in <module>
  File "c:\python38-32\lib\site-packages\edx_dl\edx_dl.py", line 1020, in main
    all_selections = {selected_course:
  File "c:\python38-32\lib\site-packages\edx_dl\edx_dl.py", line 1021, in <dictcomp>
    get_available_sections(selected_course.url.replace('info', 'course'),
  File "c:\python38-32\lib\site-packages\edx_dl\edx_dl.py", line 184, in get_available_sections
    page = get_page_contents(url, headers)
  File "c:\python38-32\lib\site-packages\edx_dl\utils.py", line 58, in get_page_contents
    result = urlopen(Request(url, None, headers))
  File "c:\python38-32\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "c:\python38-32\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "c:\python38-32\lib\urllib\request.py", line 640, in http_response
    response = self.parent.error(
  File "c:\python38-32\lib\urllib\request.py", line 569, in error
    return self._call_chain(*args)
  File "c:\python38-32\lib\urllib\request.py", line 502, in _call_chain
    result = func(*args)
  File "c:\python38-32\lib\urllib\request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

nazarialireza avatar Sep 08 '20 20:09 nazarialireza

Where can I find this edx_dl.py file in Linux?

dborwankar avatar Oct 12 '20 06:10 dborwankar

@YediPublic - I don't use Linux, so I'm not familiar with it, but it should be under /usr/lib/python(installed version)? You might want to see if this link helps. You might have to update to the latest version of Python/pip and run pip install edx-dl

DGEs2018 avatar Oct 12 '20 09:10 DGEs2018

@YediPublic - When you got the failure message, you must have seen a few lines like these in your terminal:

Traceback (most recent call last):
  File "/usr/local/bin/edx-dl", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/edx_dl/edx_dl.py", line 1023, in main

In the above case, the path to the file that you want to edit is /usr/local/lib/python3.7/site-packages/edx_dl/edx_dl.py
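
If the traceback is no longer on screen, Python itself can report where a package is installed. A minimal sketch, assuming edx-dl is installed under the package name edx_dl (the name that appears in the paths above) for the interpreter you run this with; `package_dir` is a made-up helper name.

```python
import importlib.util
import os


def package_dir(name):
    """Return the directory a top-level package is installed in, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(spec.origin)


location = package_dir("edx_dl")
if location is None:
    print("edx_dl is not installed for this interpreter")
else:
    # The file discussed in this thread sits inside the package directory.
    print(os.path.join(location, "edx_dl.py"))
```

Run it with the same Python that edx-dl was installed into, otherwise it will report that the package is missing.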

prabhakar9885 avatar Oct 14 '20 12:10 prabhakar9885

Is there any solution to this? I tried to download https://courses.edx.org/courses/course-v1:WellesleyX+Italian1x+1T2019/course/ but always get empty folders

Luciano-Delaude avatar Oct 17 '20 20:10 Luciano-Delaude

@Luciano-Delaude I just came across this link, where @bi1yeu's solution seems to have fixed the same issue for a couple of others. I had this issue myself but will have to give this a shot later.

DGEs2018 avatar Oct 17 '20 22:10 DGEs2018

> @Luciano-Delaude I just came across this link, where @bi1yeu's solution seems to have fixed the same issue for a couple of others. I had this issue myself but will have to give this a shot later.

I tried to use that solution but it didn't work either; I just get an empty folder with that too. If you can fix it, please let me know.

Luciano-Delaude avatar Oct 19 '20 11:10 Luciano-Delaude

> > @Luciano-Delaude I just came across this link, where @bi1yeu's solution seems to have fixed the same issue for a couple of others. I had this issue myself but will have to give this a shot later.

> I tried to use that solution but it didn't work either; I just get an empty folder with that too. If you can fix it, please let me know.

Same here

anantsinha avatar Nov 03 '20 06:11 anantsinha

Exactly the same here. I've tried all solutions suggested and still no dice.

jmfontana avatar Nov 05 '20 12:11 jmfontana

> Is there any solution to this? I tried to download https://courses.edx.org/courses/course-v1:WellesleyX+Italian1x+1T2019/course/ but always get empty folders

Yes, I have the same problem after using this solution.

AbyssInTheMonad avatar Nov 28 '20 15:11 AbyssInTheMonad

Same issue - tried all the solutions suggested here and no luck

marianfi avatar Dec 21 '20 22:12 marianfi

> Same issue - tried all the solutions suggested here and no luck

Same here. I tried changing to 'User-Agent': 'Chrome/51.0.2704.103' (since I am using Chrome to open edX) and return section_soup.ol #return section_soup.a['href'], but neither worked for me. I'd appreciate it if anyone could help me.

JimmyNgUNITEN avatar Dec 27 '20 06:12 JimmyNgUNITEN

Hi everyone,

Great information, and thank you to the rock stars who contributed. Quick question, if anyone knows: I'm enrolled in an edX course, but it's the free version (auditing). I've got edx-dl and Python set up, and after running it from the command line, it stated that no downloadable content was found.

Am I correct in thinking that this is set up and running correctly, and that because I'm auditing the course (for free) I won't be able to save any of the content?

skiextreme avatar Mar 03 '21 20:03 skiextreme

No, free auditable courses should be downloadable too (if you can play the videos in your browser). Most likely edX changed their layout, which broke the downloader.

berezovskyi avatar Mar 03 '21 20:03 berezovskyi

@berezovskyi

Here's the output I got:

edx_dl version 0.1.13
Password:
Building initial headers for future requests.
Getting initial CSRF token.
Found CSRF token.
Logging into Open edX site: https://courses.edx.org/login_ajax
Extracting course information from dashboard.
Downloading Penetration Testing - Exploitation [course-v1:NYUx+CYB.PEN.2+1T2021/co]
Downloading 0 section(s)
Extracting all units information in parallel.
No downloadable video found.

skiextreme avatar Mar 03 '21 20:03 skiextreme

I think I got the same error. I tried applying a few of the patches suggested in this thread to my local fork and gave up for the time being. Downloading from the website with videodownloadhelper generally works fine.

berezovskyi avatar Mar 04 '21 12:03 berezovskyi