[FEAT] Website depth scraping data connector

Open shatfield4 opened this issue 1 year ago • 1 comment

Pull Request Type

  • [x] ✨ feat
  • [ ] 🐛 fix
  • [ ] ♻️ refactor
  • [ ] 💄 style
  • [ ] 🔨 chore
  • [ ] 📝 docs

Relevant Issues

resolves #1190

What is in this change?

  • Adds a data connector that scrapes a website by following links up to a configurable depth X
  • Follows only links whose domain matches the starting site, so the scrape stays on the same website
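A minimal sketch of the idea, not the code in this PR: a breadth-first walk that stops expanding pages at the depth limit and skips links on other hosts. The names `discover_links`, `get_links`, and `SITE` are hypothetical, and comparing the URL host (`netloc`) is an assumption about what "matching domain name" means. Link extraction is passed in as a callable so the example runs without network access.

```python
from collections import deque
from typing import Callable, Iterable
from urllib.parse import urljoin, urlparse


def discover_links(
    start_url: str,
    max_depth: int,
    get_links: Callable[[str], Iterable[str]],  # hypothetical: returns hrefs found on a page
) -> list[str]:
    """Collect same-host pages reachable within max_depth hops of start_url."""
    host = urlparse(start_url).netloc  # assumption: "same website" means same host
    seen = {start_url}
    ordered = [start_url]
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # pages at the depth limit are kept but not expanded
        for href in get_links(url):
            # Resolve relative links and drop any #fragment
            link = urljoin(url, href).split("#", 1)[0]
            if urlparse(link).netloc != host or link in seen:
                continue  # skip off-site links and pages already queued
            seen.add(link)
            ordered.append(link)
            queue.append((link, depth + 1))
    return ordered


# Toy in-memory "site" standing in for real HTTP fetches (hypothetical data)
SITE = {
    "https://example.com/": ["/a", "/b", "https://other.org/x"],
    "https://example.com/a": ["/c", "/"],
    "https://example.com/b": [],
    "https://example.com/c": ["/d"],
}

found = discover_links("https://example.com/", 1, lambda u: SITE.get(u, []))
print(found)
```

With a depth of 1 only the start page and the pages it links to directly are returned; raising the depth to 2 would also pick up `/c`.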

Additional Information

Developer Validations

  • [x] I ran yarn lint from the root of the repo & committed changes
  • [x] Relevant documentation has been updated
  • [x] I have tested my code functionality
  • [x] Docker build succeeds locally

shatfield4, Apr 26 '24 00:04

@timothycarambat, refactored based on what we discussed.

  • Builds an array of all links up front, so the total link count is known before the main scrape starts
  • Passes that array to the bulk scraping function
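For illustration only, the second phase could look roughly like the sketch below: because the full link list exists before scraping begins, progress can be reported as "i of total". `bulk_scrape`, `scrape_page`, and `PAGES` are hypothetical names, not the functions in this PR.

```python
from typing import Callable


def bulk_scrape(
    links: list[str],
    scrape_page: Callable[[str], str],  # hypothetical: returns the text of one page
) -> dict[str, str]:
    """Scrape every link in a precomputed list, reporting progress against the total."""
    total = len(links)  # known up front because links were collected first
    results: dict[str, str] = {}
    for i, url in enumerate(links, start=1):
        try:
            results[url] = scrape_page(url)
        except Exception as err:
            # One failed page should not abort the whole batch
            print(f"[{i}/{total}] failed {url}: {err}")
            continue
        print(f"[{i}/{total}] scraped {url}")
    return results


# Toy page store standing in for real HTTP fetches (hypothetical data)
PAGES = {"https://example.com/": "home", "https://example.com/a": "about"}
links = list(PAGES) + ["https://example.com/missing"]
results = bulk_scrape(links, lambda u: PAGES[u])
```

Separating discovery from scraping also means the list can be capped or shown to the user before any page content is fetched.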

shatfield4, May 01 '24 01:05