anything-llm
Added an option to fetch issues from GitLab. Made the file fetching asynchronous to improve performance. #2334
Pull Request Type
- [x] ✨ feat
- [ ] 🐛 fix
- [x] ♻️ refactor
- [ ] 💄 style
- [ ] 🔨 chore
- [ ] 📝 docs
Relevant Issues
connect #812 resolves #2334
What is in this change?
- New "Fetch Issues" Checkbox: Adds an option on the GitLab connector page to fetch all project issues, including associated discussion items (such as comments, assignee changes, etc.).
- New `fetchNextPage` Method: Implements a `fetchNextPage` method in `GitlabRepoLoader` to streamline the process of fetching all pages from an API endpoint in a more generic and reusable way.
- Refactoring: Refactors the `getRepoBranches` and `fetchFilesRecursive` methods to utilize the new `fetchNextPage` logic.
- Speed Improvements: File fetching is now performed in parallel, resulting in a substantial performance boost, improving speed by an order of magnitude.
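For reviewers, the pagination approach described above can be sketched roughly as follows. This is not the code from this PR: `fetchAllPages` and the injectable `fetchFn` parameter are hypothetical names, and the sketch assumes GitLab's offset pagination, where `page` and `per_page` are query parameters and the `x-next-page` response header is empty on the last page.

```javascript
// Hypothetical helper, not the PR's actual `fetchNextPage` implementation.
// Collects every page of a GitLab-style paginated endpoint into one array.
// `fetchFn` is injectable so the sketch can be exercised without a network.
async function fetchAllPages(
  baseUrl,
  { perPage = 100, headers = {}, fetchFn = fetch } = {}
) {
  const items = [];
  let page = 1;

  while (page) {
    const url = new URL(baseUrl);
    url.searchParams.set("page", String(page));
    url.searchParams.set("per_page", String(perPage));

    const res = await fetchFn(url.toString(), { headers });
    if (!res.ok) {
      throw new Error(`Request failed with status ${res.status}`);
    }
    items.push(...(await res.json()));

    // Assumption: the next page number arrives in `x-next-page`
    // and is empty (or missing) once the last page has been read.
    const next = res.headers.get("x-next-page");
    page = next ? Number(next) : null;
  }

  return items;
}
```

Because the loop only depends on the URL and the response header, the same helper could in principle serve branches, repository trees, issues, and discussions alike.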
Additional Information
There are a few areas that would benefit from further discussion:
- EDIT: no longer current. The issues are converted to markdown now.
- Page Size Configuration
  - Concurrent fetching significantly boosts performance, but it can strain system resources, particularly if GitLab is hosted on a less powerful server.
  - During testing on a GitLab instance with 8 cores and 16 GB of RAM, I fetched a repository containing 6.5k files and 1.7k issues (with up to 150 discussion items each) using 100 items per page. While this worked well, the server's average 5-minute load reached 10.
  - It might be worth considering making the `pageSize` parameter configurable to allow for smaller page sizes (e.g., 25 items per page) on less capable servers.
- Chunk Source & Repository URL
  - Currently, the `generateChunkSource` function does not include the repository URL in its payload. This might be necessary for the "Automatic Document Content Sync" feature, particularly for self-hosted GitLab instances.
  - If the repo URL is indeed required for this feature to work, I am happy to open an issue and submit a separate PR to address this.
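On the resource concern raised under Page Size Configuration, one possible complement to a smaller page size is capping how many requests are in flight at once. The following is a minimal sketch under that assumption, not something this PR implements; `mapWithConcurrency` and its parameters are hypothetical names.

```javascript
// Hypothetical helper, not part of this PR.
// Runs `worker` over `items` with at most `limit` calls in flight,
// returning results in the same order as the input.
async function mapWithConcurrency(items, limit, worker) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }

  const results = new Array(items.length);
  let nextIndex = 0;

  // Each runner repeatedly claims the next unprocessed index until none remain.
  async function runner() {
    while (nextIndex < items.length) {
      const i = nextIndex++;
      results[i] = await worker(items[i], i);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

A call such as `mapWithConcurrency(filePaths, 10, fetchFile)` (with `filePaths` and `fetchFile` standing in for whatever the loader uses) would keep the speedup from parallel fetching while bounding the load placed on a smaller GitLab server.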
Developer Validations
- [x] I ran `yarn lint` from the root of the repo & committed changes
- [x] Relevant documentation has been updated
- [x] I have tested my code functionality
- [x] Docker build succeeds locally