How-to guides on web crawling or scraping

Spider Web Crawling and Scraping Guides

This repo contains a collection of guides on how to use the Spider service effectively to crawl or scrape websites. Contributors are welcome! 😁

Collection

  • Using the Spider API
  • How to Use Proxy Mode
  • LangChain + Groq + Spider = 🚀 (Integration Guide)
  • CrewAI Spider Stock Research
  • Extracting Contacts
  • Automated Cold Email Outreach Using Spider
  • How to Archive Full Website
  • Building A Speedy Resilient Web Scraper for RAG AI (Part 1, Part 2)
  • Agents from Scratch

Anti-Bot Detection

Spider, combined with the headless-browser repo, achieves full stealth against leading bot detection services — even when running fully headless.

Our techniques make Spider the most powerful crawling stack available today, providing an invisible footprint while scraping at scale.

The screenshots below show Spider's results against major bot detectors:

| Detector | Screenshot |
|---|---|
| BrowserScan.net Bot Detection | View Screenshot |
| Bot Detector Rebrowser | View Screenshot |
| SammySoft Bot Ecom | View Screenshot |
| Device and Browser Info (Are You a Bot?) | View Screenshot |
| Fingerprint Ecom Playground | View Screenshot |
| Device and Browser Info - Device Test | View Screenshot |
| Creepjs - Device Test | View Screenshot |

Spider is designed for extreme evasion, high concurrency, and human-like behavior, allowing you to dominate even the most protected websites.

Contribute

We're happy to accept requests in the issue tracker, improvements to the content, and additional guides.