anything-llm
[FEAT] Website depth scraping data connector
Pull Request Type
- [x] ✨ feat
- [ ] 🐛 fix
- [ ] ♻️ refactor
- [ ] 💄 style
- [ ] 🔨 chore
- [ ] 📝 docs
Relevant Issues
resolves #1190
What is in this change?
- Creates a data connector that scrapes a website by following its links up to a depth of X
- Only follows links whose domain name matches the site's, so only pages on the same website are scraped
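For readers who want a picture of the behaviour described above, here is a minimal sketch of depth-limited, same-domain link discovery. It is not the code from this PR: `isSameHost`, `discoverLinks`, and the injected `getLinks` callback are hypothetical names, and `getLinks` stands in for whatever fetches a page and extracts its `href` values (in a real connector that step would most likely be async).

```typescript
// Hypothetical callback: returns the raw href values found on a page.
// A real connector would fetch and parse HTML here (and likely be async).
type GetLinks = (pageUrl: string) => string[];

// True when `link`, resolved against `base`, has the same hostname as `base`.
function isSameHost(link: string, base: string): boolean {
  try {
    return new URL(link, base).hostname === new URL(base).hostname;
  } catch {
    return false; // unparsable href
  }
}

// Breadth-first discovery of same-host pages, at most `maxDepth` link hops
// away from `startUrl`. Returns the start page plus every page found.
function discoverLinks(
  startUrl: string,
  maxDepth: number,
  getLinks: GetLinks
): string[] {
  const start = new URL(startUrl).href;
  const seen = new Set<string>([start]);
  let frontier: string[] = [start];

  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const page of frontier) {
      for (const href of getLinks(page)) {
        if (!isSameHost(href, page)) continue; // skip other websites
        const resolved = new URL(href, page);
        resolved.hash = ""; // treat #fragments as the same page
        if (!seen.has(resolved.href)) {
          seen.add(resolved.href);
          next.push(resolved.href);
        }
      }
    }
    frontier = next; // pages one hop deeper
  }
  return Array.from(seen);
}
```

The `seen` set keeps pages that link back to each other from being visited twice, and comparing hostnames after resolving each `href` against the current page lets relative links such as `/about` pass the same-site check.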
Additional Information
Developer Validations
- [x] I ran `yarn lint` from the root of the repo & committed changes
- [x] Relevant documentation has been updated
- [x] I have tested my code functionality
- [x] Docker build succeeds locally
@timothycarambat, refactored based on what we discussed.
- Creates an array of all links up front, so we know how many links there are before the main scraping starts
- Passes the array to the bulk scraping function
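The two-phase flow in this comment (collect every link first, then scrape in bulk) could look roughly like the sketch below. It is illustrative only and assumes nothing about the actual implementation: `bulkScrape`, `ScrapePage`, `ScrapeResult`, and `onProgress` are hypothetical names, and the per-page scraper is kept synchronous for brevity.

```typescript
// Hypothetical result record for one scraped page.
interface ScrapeResult {
  url: string;
  ok: boolean;
  content?: string;
}

// Hypothetical per-page scraper; a real one would load the page and
// extract its text, most likely asynchronously.
type ScrapePage = (url: string) => string;

// Scrapes an already-collected list of links. Because the full list is
// passed in, the total is known before the first page is scraped.
function bulkScrape(
  links: string[],
  scrapePage: ScrapePage,
  onProgress: (done: number, total: number) => void = () => {}
): ScrapeResult[] {
  const total = links.length;
  const results: ScrapeResult[] = [];
  links.forEach((url, i) => {
    try {
      results.push({ url, ok: true, content: scrapePage(url) });
    } catch {
      results.push({ url, ok: false }); // one failed page does not abort the batch
    }
    onProgress(i + 1, total);
  });
  return results;
}
```

Knowing `total` up front is what makes progress reporting of the form "page 3 of 40" possible, which would not work if links were discovered and scraped in the same pass.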