ilc-scraper
A scraper for Impartus Lecture Capture videos for BITS Hyderabad
Impartus Scraper
This is a scraper for the Impartus Lecture Capture system deployed in BITS Pilani Hyderabad Campus. No more crying when Impartus servers crash during exam time because of overload. You can even access lectures from previous years in case your current prof sucks :P
Features
- Awesome GUI!

- Download lectures without subscribing to them.
- Download BOTH the camera views. (Saved as video tracks in output file)
- Specify ranges to limit the lectures to be downloaded.
- Fuzzy searching of course names.
- Skips already downloaded lectures.
- Multiprocessing support.
Installation
Recommended:
Go to the Releases page and download the latest version according to your OS. Just double click and run!
Manual method:
- Ensure you have Python 3.7 or higher installed and available in your system PATH. To check, run `python --version` from the command line/terminal.
- Install ffmpeg from here. Ensure it is in your PATH variable. To check, run `ffmpeg -version` from the command line/terminal.
- Install poetry using `pip install --user poetry`. Restart your PC.
- Clone this repo if you know git, or click Download as ZIP and extract it to some location.
- Open a terminal and cd to the download/clone directory.
- Run `poetry install --no-dev -E gui` for the default installation. (If you don't want the GUI, which takes up a lot of extra space, omit the `-E gui` from the command.)
Configuration (Optional)
If you plan to use the script multiple times, it is useful not to have to type the Impartus creds and download location every time. You can store these for future use.
- Duplicate the `sample_config.json` file and rename it to `imp_config.json`.
- Enter your Impartus credentials (Username, Password) in the fields. Required to be able to access Impartus.
- Specify the Download Location for the lectures in the `save_fold` field. If you leave this blank, it will default to the `Impartus Lectures` folder inside the download location of this script.
IMP: For Windows, you need to use double \\ in paths. Eg: C:\\Users\\Student\\Downloads\\Lectures or D:\\Impartus.
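The doubled backslashes come from JSON string escaping. A minimal sketch of that, assuming the config is ordinary JSON: only `save_fold` is named in this README, so the credential key names below are placeholders and the real ones should be copied from `sample_config.json`.

```python
import json

# Hypothetical config contents. Only `save_fold` is confirmed by this README;
# the "username"/"password" key names are placeholders for illustration.
config = {
    "username": "your-username",
    "password": "your-password",
    "save_fold": r"C:\Users\Student\Downloads\Lectures",
}

# json.dumps escapes every backslash, so the path is written with \\ in the file.
text = json.dumps(config, indent=4)
print(text)

# Parsing the JSON text gives back the normal single-backslash path.
loaded = json.loads(text)
print(loaded["save_fold"])
```

Writing the file this way avoids escaping the path by hand.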
Running (manual method)
- GUI: Run `poetry run python ilc_scrape.py` and enter the details as asked (see the Help section). The first time, you will need to specify the full URL of the course. For future uses, you can select the course name from the dropdown.
- Command line: Pass argument flags to the script. Eg: `poetry run python ilc_scrape.py -n "operating systems" -r 14:16,25 -f` will force download lecture numbers 14, 15 and 25 of OS lectures. Run `python ilc_scrape.py -h` to see all available options. If you have the GUI installed, but still wanna use the command line, add `--ignore-gooey` at the end of the command. This will disable the GUI.
How it works
- The script simply exploits the fact that the Impartus API for retrieving the lecture info for a course does not check if the user is subscribed to it. Then it's a simple task to loop over the lectures and download their video streams.
- All the videos are available as HLS Streams, which are downloaded using ffmpeg.
- The fuzzy search for matching the course names works using the `difflib` module. The first time a course URL is specified, it is added to a local database, along with its full name. These full names are then compared against the given query and the most similar full name is chosen.
- The GUI is made using Gooey, which makes it dead simple to convert CLI Python apps to full featured GUIs.
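The fuzzy matching step can be sketched roughly as follows. This is not the script's actual code: the `find_course` helper and the course names are made up for illustration, and `difflib.get_close_matches` is just one way to pick the most similar full name.

```python
import difflib

# Hypothetical full course names, standing in for the local database.
known_courses = [
    "Operating Systems",
    "Computer Networks",
    "Database Systems",
]

def find_course(query, names):
    """Return the stored name most similar to the query, or None if there are no names."""
    # Compare case-insensitively, but return the name as it was stored.
    lowered = {name.lower(): name for name in names}
    # cutoff=0.0 always returns the best candidate, however weak the match.
    matches = difflib.get_close_matches(query.lower(), list(lowered), n=1, cutoff=0.0)
    return lowered[matches[0]] if matches else None

print(find_course("operating sys", known_courses))
```

A stricter `cutoff` (the `difflib` default is 0.6) would reject queries that resemble none of the stored names instead of silently picking one.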
Help
Fixing URL unrecognized errors
- For the course URL, you should specify a url of the course: `http://172.16.3.20/ilc/#/course/12345/678`. It will give an error if you write the url of a specific lecture.
- If you are outside campus premises, you should replace `172.16.3.20` with either `a.impartus.com` or `bitshyd.impartus.com`, with everything else being the same.
- The url will be saved, and you can use the dropdown in the future.
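To check a URL before pasting it in, something like the sketch below may help. It assumes the course URL always has the shape of the example above, with two numeric parts after `/course/`. The README does not say what the two numbers mean, so the group names `first` and `second` are placeholders, and `parse_course_url` and `with_host` are hypothetical helpers, not part of the script.

```python
import re

# Assumed shape: <scheme>://<host>/ilc/#/course/<number>/<number>
COURSE_URL = re.compile(
    r"^(?P<scheme>https?)://(?P<host>[^/]+)/ilc/#/course/(?P<first>\d+)/(?P<second>\d+)/?$"
)

def parse_course_url(url):
    """Split a course URL into its parts, or raise ValueError if it does not match."""
    match = COURSE_URL.match(url.strip())
    if match is None:
        raise ValueError(f"Not a course URL: {url!r}")
    return match.groupdict()

def with_host(url, new_host):
    """Swap only the host, keeping everything else the same."""
    parts = parse_course_url(url)
    return f"{parts['scheme']}://{new_host}/ilc/#/course/{parts['first']}/{parts['second']}"

campus_url = "http://172.16.3.20/ilc/#/course/12345/678"
print(with_host(campus_url, "a.impartus.com"))
```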
Specifying Ranges
The range pattern follows the format of Python Slices.
12 (Only 12 will be downloaded),
1:4 (1 included, 4 excluded),
:10 (Download lecture numbers 1 to 9),
3: (Download all lectures from number 3 onwards).
You can also specify multiple ranges using commas.
12, 4:6, 15:, :2 will download lectures 1, 4, 5, 12, 15, 16, 17, ... until the latest one.
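The rules above can be sketched as a small parser. This illustrates the format only and is not the script's own implementation. `parse_ranges` is a hypothetical helper, and `latest` (the number of the most recent lecture) is assumed to be known.

```python
def parse_ranges(spec, latest):
    """Expand a spec such as "12, 4:6, 15:, :2" into a sorted list of lecture numbers."""
    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if ":" in part:
            start_text, end_text = part.split(":", 1)
            # A missing start means "from lecture 1"; a missing end means "up to the latest".
            start = int(start_text) if start_text.strip() else 1
            end = int(end_text) if end_text.strip() else latest + 1
        else:
            # A bare number selects just that lecture.
            start = int(part)
            end = start + 1
        # Like Python slices, the end is excluded. Numbers outside 1..latest are dropped.
        selected.update(n for n in range(start, end) if 1 <= n <= latest)
    return sorted(selected)

print(parse_ranges("12, 4:6, 15:, :2", latest=17))
```

With 17 lectures available, the example spec expands to 1, 4, 5, 12, 15, 16, 17, matching the list above.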
Contributing
See CONTRIBUTING.md.