colrev
feat: GitHub SearchSource
Description
Integrate GitHub as a SearchSource within the CoLRev environment, enabling the search and prep operations to utilize GitHub's vast repository of code and documentation. This feature will allow users to search GitHub repositories by title and README content, enhancing the scope of CoLRev's research and analysis capabilities.
Preferred Solution
- Develop a New SearchSource for GitHub: This SearchSource should leverage the GitHub REST API to perform searches based on repository titles and README contents.
- Implement a Prep Package for GitHub: Similar to the approach taken with the crossref-prep-metadata package, this functionality should retrieve relevant metadata from GitHub repositories, such as the repository's description, topics, license, and possibly citation information, when a `github.com` URL is provided.
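A minimal sketch of how the search side could build its request, assuming the SearchSource calls the GitHub REST API repository search endpoint (`/search/repositories`) with the `in:name` and `in:readme` qualifiers. The function name `build_repo_search_url` and its parameters are hypothetical and not part of CoLRev; the sketch only constructs the URL and sends no request.

```python
from urllib.parse import urlencode

# Repository search endpoint of the GitHub REST API.
GITHUB_SEARCH_URL = "https://api.github.com/search/repositories"


def build_repo_search_url(keywords, fields=("name", "readme"), per_page=30, page=1):
    """Build a repository search URL restricted to the given fields.

    Hypothetical helper (not part of CoLRev): `fields` is turned into the
    `in:` search qualifier, e.g. `in:name,readme`.
    """
    if not keywords.strip():
        raise ValueError("keywords must not be empty")
    query = f"{keywords.strip()} in:{','.join(fields)}"
    params = {"q": query, "per_page": per_page, "page": page}
    return f"{GITHUB_SEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_repo_search_url("literature review"))
```

The resulting URL could then be fetched with any HTTP client or replaced by the equivalent PyGithub call; authentication and rate-limit handling are left out of this sketch.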
Links for Reference and Development
- GitHub REST API Documentation
- Tutorial on Using GitHub API with Python
- PyGithub: A Python library to access the GitHub REST API
- Citation File Format on GitHub for understanding how repositories might provide citation metadata.
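As a rough illustration of how citation metadata might be retrieved, the sketch below assumes the repository keeps a `CITATION.cff` file at its root and that it is read through the REST API contents endpoint, which returns file content base64-encoded. The helper names are hypothetical; parsing the decoded YAML is left out, since it would need a YAML library.

```python
import base64


def citation_file_api_url(owner, repo, path="CITATION.cff"):
    """Contents-endpoint URL for a file in a repository (hypothetical helper)."""
    return f"https://api.github.com/repos/{owner}/{repo}/contents/{path}"


def decode_contents_payload(payload):
    """Decode the file text from a contents-endpoint JSON response.

    Assumes the response carries `encoding: "base64"` and the encoded
    file body in `content`.
    """
    if payload.get("encoding") != "base64":
        raise ValueError("unexpected encoding: %r" % payload.get("encoding"))
    return base64.b64decode(payload["content"]).decode("utf-8")
```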
User Story
- A user initializes CoLRev with `colrev init`.
- To perform a search, the user runs `colrev search -a colrev.github`, specifying search parameters that include repository titles and keywords found in README files.
- CoLRev utilizes the GitHub SearchSource to query the GitHub API, retrieving a list of repositories that match the search criteria.
- Search results, including repository metadata and links to README files, are then stored in a designated file or database.
- When a user has a list of GitHub repository URLs, they can use the prep operation to enrich the collected data with additional metadata from each repository, enhancing the quality and usefulness of the search results.
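The prep step in the last item could look roughly like the sketch below. It assumes the repository metadata comes from the REST API endpoint `/repos/{owner}/{repo}`, whose JSON response includes `name`, `description`, `topics`, `license`, and `html_url`. The function names and the target field names (`title`, `abstract`, `keywords`, `license`, `url`) are illustrative assumptions, not CoLRev's actual record schema.

```python
from urllib.parse import urlparse


def parse_github_url(url):
    """Split a github.com repository URL into (owner, repo)."""
    parsed = urlparse(url)
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a github.com URL: {url}")
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"URL does not point to a repository: {url}")
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return owner, repo


def repo_api_url(owner, repo):
    """REST API URL that returns the repository's metadata as JSON."""
    return f"https://api.github.com/repos/{owner}/{repo}"


def to_record_fields(repo_json):
    """Map a repository JSON response to illustrative record fields."""
    license_info = repo_json.get("license") or {}  # license may be null
    return {
        "title": repo_json.get("name", ""),
        "abstract": repo_json.get("description") or "",
        "keywords": ", ".join(repo_json.get("topics", [])),
        "license": license_info.get("spdx_id", ""),
        "url": repo_json.get("html_url", ""),
    }
```

Keeping URL parsing and field mapping separate from the HTTP call would let both be tested without network access.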
Expected Effort
- Duration: 2 months
- Team Requirement: 3-4 individuals
I'd like to contribute to this issue.
I would like to contribute to this issue
I would like to contribute to this issue.
I would like to contribute to this issue.
I want to contribute to this issue.
We currently have 5 people interested in a topic for 3-4. @JohannesDiel: can I ask you to join #360 with @pmao0907 and @MingxinJiang?
Yes, no problem. Should I just comment in #360 and delete my comment here?
Thank you @JohannesDiel for joining the other group. No need to delete comments.
This means we have a group of 4: @edensarrival, @koljarinne, @U1TIM4T3, and @k-schnickmann :+1: Please go ahead, select a group lead, fork the repository, and link your repository in this feed.
https://github.com/edensarrival/colrev_SS24