abotx
Cross-platform C# web crawler framework, headless browser, parallel crawler. Please star this project! +1.
What would be your recommended way of dealing with window.location changes on the page? I'm crawling sites that have a method that looks something like the following probably to break...
Recent changes to nuget have made some target framework installations ignore the install.ps1 of the PhantomJS 2.1.1 nuget package that AbotX relies on. The side effect is that the phantomjs.exe...
Hello, I got an error when trying to stop. System.OperationCanceledException: 'The operation was canceled.' The error occurs at [DoesNotReturn] private void ThrowOperationCanceledException() => throw new OperationCanceledException(SR.OperationCanceled, this); Can you show me how to...
We are using AbotX in an application running on containerized Ubuntu. On almost every page crawl, a warning is logged that reads `Cpu sampling implementation is not supported...
Hello! I downloaded the .lic file from the repository and saved it in the root of the project. When I try to reproduce the example from the README, I get an exception:...
The ParallelCrawlerEngine is getting the wrong URLs to crawl. Upon checking the page in the Parent URI, I could not find where it gets the wrong URL. It's probably the...
We would like to be able to retrieve the TimeSpan Elapsed property in the AllCrawlsCompleted event for the ParallelCrawlerEngine, just as in other events such as SiteCrawlCompleted. For now, we're using...