We should avoid crawling related profiles already crawled

Open anasdox opened this issue 4 years ago • 4 comments

Hello,

I observed that the crawler does not remember which related profiles it has already crawled, and this sometimes produces an infinite crawling loop.

anasdox avatar Oct 18 '19 10:10 anasdox

same as #13

leonardiwagner avatar Oct 18 '19 14:10 leonardiwagner

I'm going to create an example of how this could be done and update the project

leonardiwagner avatar Oct 18 '19 14:10 leonardiwagner

To fix this, it is fine to keep the crawled profiles in memory. Persisting the crawler state to disk is, in my opinion, a separate feature.
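
A minimal sketch of the in-memory idea, assuming the crawler can be expressed as a queue of profile URLs plus a callback that returns related profiles. The names `normalizeProfileUrl`, `crawlOnce`, and `getRelatedProfiles` are hypothetical and not part of the scrapedin-linkedin-crawler API. The real crawler is asynchronous, so this synchronous version only illustrates the bookkeeping:

```javascript
// Hypothetical helper (not part of the project's API): normalizes a profile
// URL so that trivially different forms of the same profile compare equal.
function normalizeProfileUrl(url) {
  const parsed = new URL(url);
  // Drop query string and fragment, lowercase the path, strip trailing slashes.
  const path = parsed.pathname.toLowerCase().replace(/\/+$/, '');
  return parsed.origin + path;
}

// Breadth-first crawl that visits every profile at most once.
// getRelatedProfiles is an assumed callback returning an array of profile URLs.
function crawlOnce(rootProfiles, getRelatedProfiles) {
  const visited = new Set(); // in-memory record of crawled profiles
  const queue = [...rootProfiles];
  const order = [];

  while (queue.length > 0) {
    const url = queue.shift();
    const key = normalizeProfileUrl(url);
    if (visited.has(key)) continue; // already crawled: skip to avoid loops
    visited.add(key);
    order.push(key);
    queue.push(...getRelatedProfiles(key));
  }
  return order;
}

// Usage with a fake graph in which profiles "a" and "b" reference each other.
const fakeGraph = {
  'https://www.linkedin.com/in/a': ['https://www.linkedin.com/in/b/'],
  'https://www.linkedin.com/in/b': [
    'https://www.linkedin.com/in/a/',
    'https://www.linkedin.com/in/c',
  ],
  'https://www.linkedin.com/in/c': [],
};
const crawled = crawlOnce(
  ['https://www.linkedin.com/in/a/'],
  (url) => fakeGraph[url] || []
);
console.log(crawled);
```

Because the `Set` lives only in memory, it is lost when the process exits, which matches the point above that saving the crawler state would be a separate feature.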

anasdox avatar Oct 21 '19 10:10 anasdox

What is the use of "rootProfiles": [ "https://www.linkedin.com/in/place/", "https://www.linkedin.com/in/here/", "https://www.linkedin.com/in/profiles/", "https://www.linkedin.com/in/to-start-the-crawler/" ]

and

"relatedProfilesKeywords": ["react"] ?

I tried using "relatedProfilesKeywords": ["react"] to search only the profiles that have 'react' as a keyword in their skill set, but the crawler doesn't seem to fetch data accordingly.

PriyaJainDev avatar May 15 '20 11:05 PriyaJainDev