edx-downloader

Suggested organization

Open bilel opened this issue 4 years ago • 4 comments

Hello, this script is video-oriented, and it does what is expected, especially for video streams!

I tried making small changes so that it organizes the downloaded data. I want to use the Open edX JSON responses alongside this script so we can fetch page metadata properly. Example: https://courses.edx.org/api/course_home/v1/outline/course-v1:IMTx+NET04x+3T2018

So here is what I suggest:

  • Based on the chapter names ("display_name"), we create the appropriate folders
  • Then we use the chapter URL ("lms_web_url") to fetch the chapter's HTML page (because the really tedious task is saving HTML pages)
  • Then we download it and save it in the chapter's folder
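The three steps above can be sketched roughly as follows. This is only an illustration, not the script's actual code: it assumes the outline response has already been reduced to a dict of blocks, each carrying "display_name" and "lms_web_url" (named above) plus a "type" field (an assumption). The helper names `safe_folder_name` and `save_chapters`, the `index.html` file name, and the `fetch_html` callable (for example, a wrapper around an authenticated session) are all hypothetical.

```python
import re
from pathlib import Path
from typing import Callable, Dict, List


def safe_folder_name(display_name: str) -> str:
    """Turn a chapter title into a name that is safe on common filesystems."""
    # Replace characters that Windows, macOS or Linux reject in file names.
    cleaned = re.sub(r'[\\/:*?"<>|]+', " ", display_name)
    # Collapse runs of whitespace and trim trailing dots/spaces.
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    return cleaned or "untitled"


def save_chapters(blocks: Dict[str, dict], root: Path,
                  fetch_html: Callable[[str], str]) -> List[Path]:
    """Create one folder per chapter and save the chapter page inside it.

    Assumption: `blocks` maps block ids to dicts with "type",
    "display_name" and "lms_web_url". `fetch_html` is a hypothetical
    callable that returns the page source for a URL.
    """
    saved = []
    chapters = [b for b in blocks.values() if b.get("type") == "chapter"]
    for index, chapter in enumerate(chapters, start=1):
        # A numeric prefix keeps the folders in course order.
        name = f"{index:02d} - {safe_folder_name(chapter['display_name'])}"
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        page = folder / "index.html"
        page.write_text(fetch_html(chapter["lms_web_url"]), encoding="utf-8")
        saved.append(page)
    return saved
```

Passing the fetcher in as a parameter keeps the folder logic independent of how the login session is handled, so it can be tried out with a dummy function before touching the network.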

P.S.: Another thing :) I see that you don't use youtube-dl. Some edX courses mix YouTube, streams, and downloadable video formats.

It would be great if you could add that in the next update. Unfortunately, edx-dl is no longer maintained, and it has ended up with many orphaned forks.

Thank you

bilel avatar Mar 06 '21 10:03 bilel

Hello @bilel, thank you for your contribution towards making this software better. I initially wrote this code for my own use, when I just needed to download videos, and I spent barely a couple of hours on it. Now I see that the community needs this. I'll spend time on it in the upcoming days, and I'll consider your suggestions in order to make the next version better and more complete. Thank you again.

rehmatworks avatar Mar 07 '21 09:03 rehmatworks

Hi, I just spent some time working on your script, and it helped me speed up the work. But now I'm not sure whether my draft fits as a contribution (pull request) or as a total update of the project :)

Let me explain what I have already done:

  • From the course URL, we check the course_meta_url (e.g. https://courses.edx.org/api/course_home/v1/course_metadata/course-v1:IMTx+NET04x+3T2018) so we can retrieve the is_enrolled key and proceed based on its value. I think it's faster than counting blocks...
  • Then I look for chapters; whenever I find one, I prepare a sub-object of its sequentials (lessons/pages)
  • Then I loop through the final dataset to make a folder for each chapter and save its lessons as HTML pages inside it
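As a rough sketch of the first two steps, assuming both JSON responses have already been fetched and decoded: the "is_enrolled" key is the one mentioned above, while the "type" and "children" fields and the function names are assumptions made for illustration, not the draft's actual code.

```python
from typing import Dict, List


def is_enrolled(course_meta: dict) -> bool:
    """Read the enrollment flag from the decoded course metadata response."""
    return bool(course_meta.get("is_enrolled", False))


def build_dataset(blocks: Dict[str, dict]) -> List[dict]:
    """Group sequentials (lessons/pages) under their parent chapter.

    Assumption: each block has a "type", and chapters list the ids of
    their sequentials under "children".
    """
    dataset = []
    for block in blocks.values():
        if block.get("type") != "chapter":
            continue
        lessons = []
        for child_id in block.get("children", []):
            child = blocks.get(child_id)
            if child is None:
                # Skip ids that are referenced but missing from the response.
                continue
            lessons.append({
                "title": child.get("display_name", child_id),
                "url": child.get("lms_web_url"),
            })
        dataset.append({"title": block.get("display_name", ""),
                        "lessons": lessons})
    return dataset
```

The resulting list can then be walked once to create the chapter folders and save each lesson page, and the download can be skipped early when `is_enrolled` returns False.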

What I'm planning to do next is:

  • Parsing the local copy to extract video metadata (URLs in the case of YouTube videos). Even if the HTML source is encoded, I still think it's going to be faster.
  • Replacing the relative paths of assets (like images...) in the local copy with edX absolute paths, or saving the assets locally?
  • Optional: changing the main input menu so the user can choose between modes (full copy, videos only, HTML only...)
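The two parsing steps planned above (extracting YouTube URLs from the saved copy and rewriting relative asset paths) could look roughly like this. It is a minimal sketch using only the standard library. The function names and regular expressions are assumptions, "encoded" is taken here to mean HTML-entity-escaped markup, and a real HTML parser would be more robust than regexes.

```python
import html
import re
from typing import List
from urllib.parse import urljoin

# Matches common YouTube link shapes and captures the 11-character video id.
YOUTUBE_RE = re.compile(
    r"(?:youtube\.com/(?:embed/|watch\?v=)|youtu\.be/)([A-Za-z0-9_-]{11})"
)
# Matches src="..." and href="..." attributes with either quote style.
ATTR_RE = re.compile(r'\b(?P<attr>src|href)=(?P<q>["\'])(?P<url>.*?)(?P=q)')


def find_youtube_urls(page_source: str) -> List[str]:
    """Return unique YouTube watch URLs found in a saved page."""
    # Decode entities first, since embedded markup may be stored escaped.
    decoded = html.unescape(page_source)
    urls = []
    for video_id in YOUTUBE_RE.findall(decoded):
        url = f"https://www.youtube.com/watch?v={video_id}"
        if url not in urls:
            urls.append(url)
    return urls


def absolutize_assets(page_source: str, base_url: str) -> str:
    """Rewrite relative src/href values against the page's original URL."""
    def replace(match):
        absolute = urljoin(base_url, match.group("url"))
        quote = match.group("q")
        return f"{match.group('attr')}={quote}{absolute}{quote}"
    return ATTR_RE.sub(replace, page_source)
```

Rewriting to absolute URLs is the cheaper of the two options, since nothing extra has to be downloaded, but the local copy then still depends on the edX servers for its images.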

I'll send you a clean draft soon. Wish you a nice Sunday afternoon.

bilel avatar Mar 07 '21 16:03 bilel

@bilel Thank you so much for your contribution. This makes a lot of sense. As this piece of software was developed very quickly, it lacks several features. Based on your suggestions, I will improve the software. If you can send in a pull request, that'd be great; otherwise, I'll spend time in the upcoming days to release a major upgrade. I'll also consider using youtube-dl to simplify the download process.

rehmatworks avatar Mar 09 '21 13:03 rehmatworks

Hi,

@bilel If you can do the PR, it would be helpful!

Thanks in advance

floviolleau avatar Jun 25 '21 23:06 floviolleau