crawling topic

List crawling repositories
ferret

5.6k stars, 299 forks

Declarative web scraping

crawlee

14.3k stars, 597 forks

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and o...

rod

4.9k stars, 320 forks

A DevTools driver for web automation and scraping

N2H4

211 stars, 74 forks

A tool for collecting Naver News articles

isp-data-pollution

583 stars, 53 forks

ISP Data Pollution to Protect Private Browsing History with Obfuscation

newspaper

13.8k stars, 2.1k forks

newspaper3k is a news, full-text, and article metadata extraction library for Python 3. Advanced docs:

scrapyrt

817 stars, 160 forks

HTTP API for Scrapy spiders

scrapy-selenium

898 stars, 330 forks

Scrapy middleware to handle JavaScript pages using Selenium

second-order

360 stars, 64 forks

Second-order subdomain takeover scanner

easy-scraping-tutorial

766 stars, 551 forks

Simple but useful Python web scraping tutorial code.