coursera-dl

New feature - PDFs of quizzes and other materials

Open Smitty010 opened this issue 13 years ago • 6 comments

This is clearly a new feature (one that would be nice, I think). Currently, to keep copies of the quizzes, I go in with Chrome and then print with "save to PDF". I do the same with the class syllabus and other materials. It would be nice if this could be automated in this program. I know someone who did a version of this in a different Python Coursera downloader, adding functionality that uses wkhtmltopdf to convert HTML to PDF. They would find the quizzes, download them as HTML files and then do the conversion. Unfortunately, I found that wkhtmltopdf blew up (threw an exception) on my Windows box. It would be nice if it also converted the syllabus, etc. to PDF. One last thing to point out (should you decide to do this): the announcements (aka "home") page typically changes at least once per week, so it might be good to recreate it every time.
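
The download-then-convert approach described above could look roughly like the following sketch. It assumes the `wkhtmltopdf` command-line tool is installed and on the PATH, and it uses only the basic `wkhtmltopdf <input> <output>` invocation. The function names are hypothetical and not part of coursera-dl. Failures are reported with a return value rather than raised, since the converter was seen to crash on Windows.

```python
import shutil
import subprocess


def build_pdf_command(html_path, pdf_path, executable="wkhtmltopdf"):
    """Return the argument list for converting one HTML file to PDF."""
    return [executable, html_path, pdf_path]


def html_to_pdf(html_path, pdf_path):
    """Convert html_path to pdf_path; return False instead of raising on failure."""
    if shutil.which("wkhtmltopdf") is None:
        return False  # tool not installed or not on PATH
    try:
        subprocess.run(build_pdf_command(html_path, pdf_path), check=True)
    except (subprocess.CalledProcessError, OSError):
        return False  # keep the HTML copy and carry on with the next page
    return True
```

Because the HTML file is saved first, a failed conversion still leaves a usable copy of the quiz on disk.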

Smitty010 avatar Mar 06 '13 23:03 Smitty010

+1

nathanleiby avatar Mar 07 '13 01:03 nathanleiby

Here are some examples of other materials that might be useful (taken from progfun-002):

  • Assignment index page (e.g.), linking to
    • assignment instructions (e.g.)(for HTML)
    • assignment project (e.g.) (zip of code framework)

I don't know how generic this scheme is.

eddsteel avatar May 17 '13 18:05 eddsteel

+1 This would be helpful.

firesofmay avatar Jun 17 '13 15:06 firesofmay

I don't think that converting the info to PDF would be a good idea; keeping the original HTML would probably be better.

The main issue of downloading the extra materials is that each course defines their own sections. Generally they do it through an internal wiki that produces links like:

https://class.coursera.org/<COURSE_NAME>/wiki/view?page=Schedule

(e.g. for the schedule, course information, FAQ, course logistics, etc.)

I think that parsing and fetching such pages should not be too difficult. The same goes for downloading the standard sections that almost all courses have, which follow an almost regular structure:

  • /quiz/index -> quizzes
  • /assignment/index -> auto graded assignments
  • /human_grading/index -> peer graded assignments
  • /quiz/index?quiz_type=homework -> homework questions (not in all courses)
  • /quiz/index?quiz_type=exam -> exam (not in all courses)
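
A minimal sketch of how these paths could be turned into a list of pages to fetch, assuming they hang off the same `https://class.coursera.org/<COURSE_NAME>` base as the wiki link above. The names `STANDARD_SECTIONS`, `section_urls` and `wiki_url` are made up for illustration and are not part of coursera-dl:

```python
from urllib.parse import urlencode

BASE_URL = "https://class.coursera.org"

# Paths from the list above; not every course exposes all of them.
STANDARD_SECTIONS = {
    "quizzes": "/quiz/index",
    "assignments": "/assignment/index",
    "peer_assignments": "/human_grading/index",
    "homework": "/quiz/index?quiz_type=homework",
    "exams": "/quiz/index?quiz_type=exam",
}


def section_urls(course_name):
    """Return {section label: absolute URL} for one course."""
    return {
        label: f"{BASE_URL}/{course_name}{path}"
        for label, path in STANDARD_SECTIONS.items()
    }


def wiki_url(course_name, page):
    """Build the URL of a course wiki page such as 'Schedule'."""
    return f"{BASE_URL}/{course_name}/wiki/view?" + urlencode({"page": page})


if __name__ == "__main__":
    for label, url in section_urls("progfun-002").items():
        print(label, url)
    print(wiki_url("progfun-002", "Schedule"))
```

Since homework and exam pages are not present in all courses, a real downloader would have to treat a missing page as normal rather than as an error.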

Another issue is that these pages often link to external resources such as .zip or .pdf files, and those are sometimes hosted elsewhere. The question is where to stop, and what the best way is to build such a crawler without reinventing the wheel or hand-coding everything, so that we end up with a nice 'browsable' site like the ones produced by wget -p or httrack.
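
One possible answer to the "where to stop" question, sketched under assumptions: follow links only inside the course's own path on `class.coursera.org`, download files with known extensions wherever they are hosted, and skip everything else. The function name, the three labels and the extension list are all hypothetical choices, not existing coursera-dl behaviour:

```python
from urllib.parse import urlparse

COURSE_HOST = "class.coursera.org"
# Assumed set of file types worth fetching even from third-party hosts.
RESOURCE_EXTENSIONS = (".zip", ".pdf")


def classify_link(url, course_name):
    """Return 'follow', 'download' or 'skip' for a link found on a course page."""
    parts = urlparse(url)
    if parts.path.lower().endswith(RESOURCE_EXTENSIONS):
        return "download"  # fetch the file itself but do not crawl further
    if parts.netloc == COURSE_HOST and parts.path.startswith(f"/{course_name}/"):
        return "follow"  # a page of this course: save it and scan its links
    return "skip"  # external page: leave the link pointing at the live site
```

With a rule like this the crawl depth is bounded by the course's own pages, and external sites are never crawled recursively.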

iemejia avatar Jun 17 '13 20:06 iemejia

I recently discovered that some courses do not (or no longer) make their materials available indefinitely after the course ends, and I lost important reference material as a result. I have turned to coursera-dl for downloading, and am manually grabbing what it does not.

Just the ability to grab the surrounding Coursera web (wiki?) pages (syllabus, course materials, etc.) would be a great next step. I agree that HTML is better than PDF.

Following all the links on the Coursera pages to grab the linked PDFs, Word docs, YouTube videos, and everything else would be a really cool future goal, but probably not as important. In many cases a user may not want to download all of those external resources and may prefer to follow the links from the downloaded Coursera page.

adamvoss avatar Feb 15 '15 23:02 adamvoss

Chrome: https://chrome.google.com/webstore/detail/coursera-quiz-printer/pkgbcmdpjlnmngdfjicnkppkkmnaejnm

Firefox: https://addons.mozilla.org/addon/coursera-quiz-printer/

Blogpost: https://churchofthought.org/blog/2020/10/17/coursera-quiz-printer-a-cross-browser-webextension/

churchofthought avatar Oct 26 '20 10:10 churchofthought