web-crawling topic

List web-crawling repositories

robots.txt

83 stars, 37 forks

Simple robots.txt template. Keeps unwanted robots out (disallow) and whitelists (allow) legitimate user-agents. Useful for all websites.
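As a rough illustration of the allow/disallow idea, the sketch below parses a small robots.txt with Python's standard `urllib.robotparser` and checks what two crawlers may fetch. The rules, the `BadBot` user-agent, and the `example.com` URL are assumptions made up for this example; they are not taken from the listed repository.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only: allow one named crawler,
# disallow everything for all other user-agents.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content held in a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The named crawler matches its own group and is allowed.
allowed_good = parser.can_fetch("Googlebot", "https://example.com/page")
# Any other crawler falls through to the "*" group and is refused.
allowed_other = parser.can_fetch("BadBot", "https://example.com/page")

print(allowed_good, allowed_other)
```

Note that robots.txt is advisory: it only keeps out crawlers that choose to honor it.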

Parser and database to index the terpene profile of different strains of Cannabis from online databases

crawler

300 stars, 11 forks

Library for Rapid (Web) Crawler and Scraper Development

Katastrophe

86 stars, 15 forks

Command Line Tool to download torrents

Scrapy-Craigslist

64 stars, 37 forks

Web Scraping Craigslist's Engineering Jobs in NY with Scrapy

ioweb

31 stars, 11 forks

Web Scraping Framework

amazon_scraper

76 stars, 20 forks

Amazon product scraper using rotating proxies and headless Chrome from ScrapingAnt

Compares the price of a product entered by the user across the e-commerce sites Amazon and Flipkart 💰 📊

CrawlerX

21 stars, 15 forks

CrawlerX: an extensible, distributed, scalable crawler system, built as a web platform for crawling URLs over different kinds of protocols in a distributed way.

JAW

80 stars, 10 forks

JAW: A Graph-based Security Analysis Framework for Client-side JavaScript