
This program provides efficient web scraping services for Tor and non-Tor sites. The program has both a CLI and REST API.

gotor


This is an HTTP REST API and command-line program for web crawling Tor (and non-Tor) sites.

Flags

Configuration of Tor client

  • -h SOCKS5 proxy host, defaults to localhost
  • -p SOCKS5 proxy port, defaults to 9050

REST

  • -server Starts an HTTP server that provides a REST API to the crawling mechanisms
  • Current crawling mechanisms include building a relationship tree of links and getting the IP address of the current Tor client

CLI

  • -d Depth to search for child nodes of links, defaults to 1
  • -o Output destination, defaults to 'terminal' (support for Excel files was recently added)
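
The depth setting can be pictured as a depth-limited walk over links. The sketch below is only an illustration under stated assumptions: Node, linkFetcher, buildTree and countNodes are hypothetical names, and the fetcher is an in-memory map standing in for real page requests. It is not gotor's actual crawler.

```go
package main

import "fmt"

// Node is one page in the link tree (hypothetical type for this sketch).
type Node struct {
	URL      string
	Children []*Node
}

// linkFetcher returns the links found on a page. A real crawler would
// download and parse the page; here it is injected so the sketch runs offline.
type linkFetcher func(url string) []string

// buildTree follows links from url down to the given depth.
// seen records visited URLs so that link cycles do not recurse forever.
func buildTree(url string, depth int, fetch linkFetcher, seen map[string]bool) *Node {
	node := &Node{URL: url}
	seen[url] = true
	if depth <= 0 {
		return node
	}
	for _, link := range fetch(url) {
		if seen[link] {
			continue
		}
		node.Children = append(node.Children, buildTree(link, depth-1, fetch, seen))
	}
	return node
}

// countNodes returns the number of nodes in the tree rooted at n.
func countNodes(n *Node) int {
	total := 1
	for _, child := range n.Children {
		total += countNodes(child)
	}
	return total
}

func main() {
	// A made-up site: a links to b and c, b links to d, c links back to a.
	site := map[string][]string{"a": {"b", "c"}, "b": {"d"}, "c": {"a"}}
	fetch := func(u string) []string { return site[u] }

	tree := buildTree("a", 1, fetch, map[string]bool{})
	fmt.Println(countNodes(tree)) // 3: the root plus its two direct children
}
```

In this sketch a depth of 1 stops at the direct children of the starting link, and a depth of 2 would also pick up d.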

How it works

[Figure: crawling diagram (draw.io)]