
TypeError: 'NoneType' object is not subscriptable

Open MissGorgeousTech opened this issue 4 years ago • 16 comments

Subject of the issue

When trying to download the videos for one specific course (listed below), edx-dl fails with TypeError: 'NoneType' object is not subscriptable. Other courses I tried download fine without errors.

Traceback (most recent call last):
  File "/usr/local/bin/edx-dl", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/edx_dl/edx_dl.py", line 1023, in main
    for selected_course in selected_courses}
  File "/usr/local/lib/python3.6/dist-packages/edx_dl/edx_dl.py", line 1023, in <dictcomp>
    for selected_course in selected_courses}
  File "/usr/local/lib/python3.6/dist-packages/edx_dl/edx_dl.py", line 186, in get_available_sections
    sections = page_extractor.extract_sections_from_html(page, BASE_URL)
  File "/usr/local/lib/python3.6/dist-packages/edx_dl/parsing.py", line 403, in extract_sections_from_html
    for i, section_soup in enumerate(sections_soup, 1)]
  File "/usr/local/lib/python3.6/dist-packages/edx_dl/parsing.py", line 403, in <listcomp>
    for i, section_soup in enumerate(sections_soup, 1)]
  File "/usr/local/lib/python3.6/dist-packages/edx_dl/parsing.py", line 372, in _make_url
    return section_soup.a['href']
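
The last frame of the traceback suggests that `section_soup.a` evaluated to `None`, presumably because the section markup for this course contains no `<a>` child. The following is a minimal sketch of that failure mode, not the edx-dl code itself: `make_url` is a hypothetical copy of the failing line, and `SimpleNamespace` is a stand-in for a parsed element so that the snippet runs without third-party packages.

```python
from types import SimpleNamespace

# Stand-in for a parsed section element that has no <a> child.
# With BeautifulSoup, accessing a missing child tag by attribute yields None.
section_soup = SimpleNamespace(a=None)


def make_url(section_soup):
    # Hypothetical copy of the failing line shown in the traceback.
    return section_soup.a['href']


try:
    make_url(section_soup)
except TypeError as exc:
    # Subscripting None raises the same TypeError reported in this issue.
    error_message = str(exc)
    print(error_message)
```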

Environment

  • Operating System (name/version):
  • Python version: 3.6
  • youtube-dl version: 2020.03.08
  • edx-dl version: 0.1.13

Steps to reproduce

https://courses.edx.org/courses/course-v1:IBM+PY0101EN+1T2020/course/

MissGorgeousTech avatar Mar 23 '20 18:03 MissGorgeousTech

and with this: https://courses.edx.org/courses/course-v1:MITx+6.004.1x_3+3T2016/course/

numlockkey avatar Mar 27 '20 01:03 numlockkey

and with this: https://courses.edx.org/courses/course-v1:StanfordOnline+CSX0005+1T2020/course/

tigerjoy avatar Mar 31 '20 05:03 tigerjoy

I fixed this by changing line 372 of parsing.py from `return section_soup.a['href']` to `return section_soup.ol`.
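
For comparison, a more defensive variant of the original line would check for the missing link explicitly instead of returning a different element. This is only a sketch under assumptions: `make_url_or_none` is a hypothetical helper, not part of edx-dl, and whether edx-dl's callers tolerate a `None` result has not been verified here. The stand-in objects replace real BeautifulSoup tags so the example runs on its own.

```python
from types import SimpleNamespace


def make_url_or_none(section_soup):
    """Return the section link's href, or None when there is no <a> child."""
    link = section_soup.a
    if link is None:
        # No link in this section: signal that explicitly instead of raising.
        return None
    return link['href']


# Stand-ins for parsed elements: one with a link, one without.
with_link = SimpleNamespace(a={'href': '/courses/example/section-1'})
without_link = SimpleNamespace(a=None)

print(make_url_or_none(with_link))
print(make_url_or_none(without_link))
```

Any caller of such a helper would need to skip or otherwise handle sections for which it returns `None`.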

aprilchew avatar Apr 02 '20 03:04 aprilchew

aprilchew: Tried it, doesn't work.

numlockkey avatar Apr 02 '20 12:04 numlockkey

aprilchew: Tried it, doesn't work.

Try section_soup.ol, remove the ['href']

aprilchew avatar Apr 04 '20 01:04 aprilchew

@aprilchew Thank you very much. It does indeed fix the issue.

For those who are still having trouble, here are the steps that you can follow.

  1. Clone or download as .zip https://github.com/coursera-dl/edx-dl
  2. Extract the .zip using the "Extract Here" option.
  3. Navigate to the following folder edx-dl-master/edx_dl
  4. Open parsing.py with your favorite text editor that displays line numbers.
  5. Scroll down to line 372, and change return section_soup.a['href'] to return section_soup.ol

A before-and-after image was attached for reference: the commented-out line 372 shows the original code, and line 373 shows the change.

  6. Go up a directory, into edx-dl-master
  7. To download courses now, use the following: python edx-dl.py -u [email protected] COURSE_URL
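
The manual edit in step 5 can also be scripted. The sketch below assumes the failing statement appears in the file exactly as quoted in this thread; `apply_workaround` is a hypothetical helper, and it operates on the file's text rather than on a fixed line number, since line 372 may shift between versions.

```python
OLD_LINE = "return section_soup.a['href']"
NEW_LINE = "return section_soup.ol"


def apply_workaround(source_text):
    """Replace the failing return statement in place, keeping its indentation.

    Returns the patched text and the number of lines that were changed.
    """
    patched_lines = []
    replaced = 0
    for line in source_text.splitlines(keepends=True):
        if line.strip() == OLD_LINE:
            # Swap the statement only; leading whitespace and newline are kept.
            line = line.replace(OLD_LINE, NEW_LINE)
            replaced += 1
        patched_lines.append(line)
    return "".join(patched_lines), replaced


# Demonstration on an in-memory snippet instead of the real parsing.py.
sample = "def _make_url(section_soup):\n    return section_soup.a['href']\n"
patched, count = apply_workaround(sample)
print(count)
print(patched)
```

Because the line is replaced rather than duplicated, no extra line is introduced into the function.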

NOTE: If you installed edx-dl using pip, the steps above won't work as written. Instead, navigate to your site-packages or dist-packages folder, find the edx_dl folder, open parsing.py, and make the same change as above.
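
One way to find the installed file without browsing site-packages by hand is to ask Python where the package lives. This is a sketch, assuming the package is importable as `edx_dl` (the folder name shown in the traceback); `locate_module_file` is a hypothetical helper, and the demonstration uses the standard-library `json` package so that it runs even where edx-dl is not installed.

```python
import importlib.util
from pathlib import Path


def locate_module_file(package_name, filename):
    """Return the path to `filename` inside an installed package, or None."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        # Package is not installed, or the name is not a package.
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / filename
        if candidate.is_file():
            return candidate
    return None


# Demonstrated with a standard-library package; for the case discussed here
# the call would be locate_module_file("edx_dl", "parsing.py").
print(locate_module_file("json", "decoder.py"))
```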

EDIT: I've downloaded a few other courses as well, and this change has not yet broken any other downloads so far.

tigerjoy avatar Apr 04 '20 05:04 tigerjoy

Hi,

A PR would be appreciated :)

Kind regards

floviolleau avatar Apr 18 '20 18:04 floviolleau

tigerjoy's solution worked for me. However, be careful not to add a new line; I just replaced the original code.

dr-jeffrey avatar May 10 '20 06:05 dr-jeffrey

Hello smart guys. Is there no one on GitHub who is able to fix the problems with downloading tutorials successfully from the edX website? I have tried since 2019 to use this script to download my tutorials from edX, and it just stops after displaying my course contents. For me it's really a pain, because I have courses I desperately needed offline which have expired, and I am still learning to code and not experienced enough to help solve the downloading problems. Thanks

ichit avatar May 10 '20 09:05 ichit

I can work on it and will send a PR soon. @ichit this is not the way you should be asking people to contribute. Be kind and respectful.

Ankk98 avatar May 21 '20 22:05 Ankk98

@Ankk98, a pull request that closes this would be welcome. Again: the simpler (and cleaner) the code, the better (since it will ease maintenance in the future when things break again--and they will).

rbrito avatar May 22 '20 21:05 rbrito

@Ankk98 I did not mean to disrespect anyone or speak rudely. I understand perfectly well that no one gets paid for their work on this platform. I mistakenly showed my frustration at being unable to download a course I desperately need for my thesis. I apologize to anyone who felt offended. I thank everyone who helps make life easier for others.

ichit avatar May 29 '20 19:05 ichit

Hi,

I ran into this issue on a course today, so I decided to do a PR...

Here is the PR on the table.

Does anyone know who the owner(s) of this project are? I see lots of PRs pending merge to master.

If anyone can help here, it would be nice :) Maybe @rbrito?

Thanks

floviolleau avatar Jul 08 '20 22:07 floviolleau

@aprilchew Thank you very much. It does indeed fix the issue.

For those who are still having trouble, here are the steps that you can follow.

1. Clone or download as .zip **https://github.com/coursera-dl/edx-dl**

2. Extract the .zip using the **"Extract Here"** option.

3. Navigate to the following folder **edx-dl-master/edx_dl**

4. Open **parsing.py** with your favorite text editor that displays line numbers.

5. Scroll down to line 372, and change **return section_soup.a['href']** to **return section_soup.ol**

A before-and-after image was attached for reference: the commented-out line 372 shows the original code, and line 373 shows the change.

6. Go up a directory, into **edx-dl-master**

7. To download courses now, use the following:
   `python edx-dl.py -u [email protected] COURSE_URL`

NOTE: If you installed edx-dl using pip, the steps above won't work as written. Instead, navigate to your site-packages or dist-packages folder, find the edx_dl folder, open parsing.py, and make the same change as above.

EDIT: I've downloaded a few other courses as well, and this change has not yet broken any other downloads so far.

Hi, I followed the whole procedure as you described, but only an empty folder was created. What should I do next? Please advise.

Bucky0789 avatar Oct 22 '20 22:10 Bucky0789
