scrape-up
Feat: Scrape Google Scholar and automatically download relevant research papers.
Describe the feature
Feat: Scrape Google Scholar and automatically download relevant research papers.
- Saves researchers time by eliminating manual searches and downloads.
- Increases efficiency by allowing users to target specific research areas with defined keywords.
- Creates a centralized collection of research papers for easy access and reference.
How it Works:
- User Input: Users enter keywords or phrases related to their research topic.
- Search Automation: The scraper automatically queries Google Scholar using the provided keywords.
- Intelligent Filtering (Optional): The system can be configured to filter results based on additional criteria (e.g., publication date, author, citation count).
- Download Management: The scraper retrieves and downloads the full text of relevant research papers (when available, whether the source is legal or not). Downloaded papers are organized in a user-defined location.
- File Management (Optional): The system can be configured to automatically rename and categorize downloaded papers for better organization.
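As a rough illustration of the User Input and Search Automation steps, the keywords and an optional publication-year filter could be turned into a search URL as sketched below. The function name `build_search_url` is hypothetical, and the query parameters (`q`, `hl`, `as_ylo`, `as_yhi`, `start`) are assumptions based on how Google Scholar search URLs commonly look; they are not a documented API and may change.

```python
from urllib.parse import urlencode

# Assumed search endpoint; not an official, documented API.
SCHOLAR_URL = "https://scholar.google.com/scholar"


def build_search_url(keywords, year_from=None, year_to=None, start=0):
    """Build a search URL from keywords and optional year bounds (hypothetical helper)."""
    params = {"q": " ".join(keywords), "hl": "en"}
    if year_from is not None:
        params["as_ylo"] = year_from  # assumed: earliest publication year
    if year_to is not None:
        params["as_yhi"] = year_to    # assumed: latest publication year
    if start:
        params["start"] = start       # assumed: result offset for paging
    return f"{SCHOLAR_URL}?{urlencode(params)}"
```

Filters such as author or citation count are left out here; they would most likely need to be applied to the parsed results rather than to the URL.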
Add ScreenShots
Record
- [X] I agree to follow this project's Code of Conduct
- [X] I'm a GSSoC'24 contributor
- [X] I want to work on this issue
Hi there! Thanks for opening this issue. We appreciate your contribution to this open-source project. We aim to respond or assign your issue as soon as possible.
Hello, I would like to work on this issue. Please assign it to me.
Go ahead @deepashri30
Note
- Please create a separate module for this, following the existing folder and project structure (if the module already exists, just add your features as functions in it).
- Do not use the `selenium` web driver, as it is not compatible with all devices and cloud platforms.
- Before making any changes, please check whether the module you want to add already exists. If it does, add your functionality as a method; otherwise, make a separate module and class for it.
All the best 👨💻
Hey @deepashri30, do not add the functionality to download the papers. Just scrape the information and serve it as a JSON response.
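A minimal sketch of the parse-and-serialize step without Selenium, using only the standard library. The class and function names are hypothetical, and the CSS class names (`gs_ri`, `gs_rt`, `gs_a`, `gs_rs`) are assumptions about the result-page markup that may not hold or may change over time. Fetching the page is deliberately left out.

```python
import json
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "wbr", "hr", "input"}  # tags that have no closing tag


class ScholarResultParser(HTMLParser):
    """Collect title, link, authors line and snippet from each result block."""

    # Assumed CSS class name -> output field
    FIELDS = {"gs_rt": "title", "gs_a": "authors", "gs_rs": "snippet"}

    def __init__(self):
        super().__init__()
        self.results = []
        self._current = None  # result dict currently being filled
        self._field = None    # field whose text is being captured
        self._depth = 0       # tag nesting depth inside the captured element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "gs_ri" in classes:  # assumed wrapper of one search result
            self._current = {"title": "", "link": None, "authors": "", "snippet": ""}
            self.results.append(self._current)
            return
        if self._current is None:
            return
        if self._field is None:
            for cls in classes:
                if cls in self.FIELDS:
                    self._field, self._depth = self.FIELDS[cls], 1
                    return
        else:
            self._depth += 1
            # The first link inside the title element is taken as the paper link.
            if tag == "a" and self._field == "title" and self._current["link"] is None:
                self._current["link"] = attrs.get("href")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._field is None:
            return
        self._depth -= 1
        if self._depth == 0:
            # Captured element closed: normalize whitespace and stop capturing.
            text = self._current[self._field]
            self._current[self._field] = " ".join(text.split())
            self._field = None

    def handle_data(self, data):
        if self._field is not None:
            self._current[self._field] += data


def parse_results(html):
    """Return a list of result dicts parsed from a results-page HTML string."""
    parser = ScholarResultParser()
    parser.feed(html)
    parser.close()
    return parser.results


def to_json(html):
    """Serialize the parsed results as a JSON string."""
    return json.dumps(parse_results(html), ensure_ascii=False, indent=2)
```

Calling `to_json(page_html)` on the fetched page source would give a JSON array with one object per result, each holding `title`, `link`, `authors` and `snippet` keys.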