scrapedin-linkedin-crawler
We should avoid crawling related profiles already crawled
Hello,
I observed that the crawler does not remember which related profiles it has already crawled, and sometimes this produces an infinite crawling loop.
Same as #13.
I'm going to create an example of how this could be done and update the project.
To fix this, it is fine to keep the crawled profiles in memory. Persisting the crawler state to disk is, in my opinion, a separate feature.
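For illustration, here is a minimal sketch of the in-memory approach, not the project's actual code. The names `normalizeProfileUrl`, `crawl`, `fetchRelated`, and `maxProfiles` are all hypothetical; `fetchRelated` stands in for whatever the crawler uses to obtain the related profile URLs of a profile.

```javascript
// Hypothetical sketch: none of these names come from the crawler's code base.

// Drop the query string, fragment, and trailing slashes so that
// ".../in/a", ".../in/a/" and ".../in/a/?x=1" count as the same profile.
function normalizeProfileUrl(url) {
  const u = new URL(url);
  return (u.origin + u.pathname).replace(/\/+$/, "");
}

// Breadth-first crawl that remembers every profile it has queued or crawled.
// `fetchRelated` is an assumed async callback: (profileUrl) => string[].
// `maxProfiles` is an assumed safety cap on the number of crawled profiles.
async function crawl(rootProfiles, fetchRelated, maxProfiles = 1000) {
  const seen = new Set(); // in-memory only, lost when the process exits
  const queue = [];

  const enqueue = (url) => {
    const key = normalizeProfileUrl(url);
    if (!seen.has(key)) {
      seen.add(key);
      queue.push(key);
    }
  };

  rootProfiles.forEach((url) => enqueue(url));

  const crawled = [];
  while (queue.length > 0 && crawled.length < maxProfiles) {
    const current = queue.shift();
    crawled.push(current);
    const related = await fetchRelated(current);
    related.forEach((url) => enqueue(url));
  }
  return crawled;
}
```

Because a profile is added to `seen` when it is queued, two profiles that list each other as related cannot keep re-adding one another, so the loop ends once the queue is empty or the cap is reached.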
What is the purpose of "rootProfiles": [ "https://www.linkedin.com/in/place/", "https://www.linkedin.com/in/here/", "https://www.linkedin.com/in/profiles/", "https://www.linkedin.com/in/to-start-the-crawler/" ]
and
"relatedProfilesKeywords": ["react"]?
I tried using "relatedProfilesKeywords": ["react"] to crawl only the profiles that have 'react' as a keyword in their skill set, but the crawler doesn't seem to fetch data accordingly.
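To make the expectation concrete: the placeholder values in "rootProfiles" read as "place here profiles to start the crawler", which suggests they are the starting profiles, and the filter I expected from "relatedProfilesKeywords" would look roughly like the sketch below. This is only an assumption about the intended behaviour, not how the crawler actually works; `matchesKeywords` and the profile shape `{ skills: string[] }` are hypothetical.

```javascript
// Hypothetical sketch of the expected filter, not the crawler's real logic.
// Assumed profile shape: { skills: string[] }.
function matchesKeywords(profile, keywords) {
  // Assumption: an empty keyword list means "do not filter".
  if (keywords.length === 0) return true;

  const skills = (profile.skills || []).map((skill) => skill.toLowerCase());

  // Keep the profile if any keyword appears inside any skill name.
  return keywords.some((keyword) =>
    skills.some((skill) => skill.includes(keyword.toLowerCase()))
  );
}
```

With this check, a profile whose skills include "React.js" would be kept for the keyword list ["react"], while a profile listing only "Java" would be skipped.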