crawling topic

List of crawling repositories

crawler

300 Stars, 82 Forks

🕷️ An easy-to-use spider written in Golang. (Previously named GOPA.)

double-agent

134 Stars, 10 Forks

A test suite of common scraper detection techniques. See how detectable your scraper stack is.

proxifier

106 Stars, 17 Forks

A fast, modern and intelligent proxy rotator perfect for crawling and scraping public data.

Harvester

71 Stars, 14 Forks

Web crawling and document processing through a usable interface.

pomp

60 Stars, 10 Forks

Screen scraping and web crawling framework

talospider

54 Stars, 4 Forks

talospider - A simple, lightweight scraping micro-framework

wget-lua

83 Stars, 14 Forks

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

robots.txt

83 Stars, 37 Forks

Simple robots.txt template. Keeps unwanted robots out (disallow). Whitelists (allow) legitimate user-agents. Useful for all websites.

telegram-crawler

237 Stars, 26 Forks

🕷 Automatically detect changes made to the official Telegram sites, clients and servers.