robots-txt topic

List robots-txt repositories

useful-links

47 stars, 2 forks

List of useful links, tools and resources

nuxt-humans-txt

29 stars, 1 fork

🧑🏻👩🏻 "We are people, not machines": an initiative to know the creators of a website, with information about the humans behind the site. A Nuxt module to statically integrate and generate...

WebScraper

61 stars, 14 forks

Python-based web crawling script with randomized intervals, user-agent rotation, and proxy server IP rotation to outsmart website bots and prevent blocking.

waybackrobots

34 stars, 3 forks

Enumerate paths from old versions of robots.txt using the Wayback Machine, for content discovery.
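As a rough illustration of the idea, the sketch below merges the paths named in several archived robots.txt snapshots. It is not the waybackrobots implementation: fetching snapshots from the Wayback Machine is left out, and the helper names `extract_paths` and `merge_snapshots` are hypothetical.

```python
from __future__ import annotations


def extract_paths(robots_txt: str) -> set[str]:
    """Return every non-empty Allow/Disallow value found in one robots.txt body."""
    paths: set[str] = set()
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        if field.strip().lower() in ("allow", "disallow"):
            value = value.strip()
            if value:  # an empty Disallow names no path
                paths.add(value)
    return paths


def merge_snapshots(snapshots: list[str]) -> list[str]:
    """Union the paths from several snapshots (assumed already downloaded)."""
    merged: set[str] = set()
    for body in snapshots:
        merged |= extract_paths(body)
    return sorted(merged)
```

Passing the text of an older and a newer snapshot to `merge_snapshots` returns one sorted list containing paths that appear in either version, including ones later removed from the live file.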

astro-launchpad

44 stars, 12 forks

An Astro project template for decent projects: auth, i18next, Bootstrap, sitemap, webworker, robots.txt, preact, react, endpoints, endpoint clients, OAuth, various Astro features and data loading prec...

ai-training-opt-out

59 stars, 2 forks

Known tags and settings suggested to opt out of having your content used for AI training.

weboptout

192 stars, 1 fork

Opt-out tool to check copyright reservations in a way that even machines can understand.

jsitemapgenerator

40 stars, 12 forks

Java sitemap generator. This library generates a web sitemap and can ping Google, generate an RSS feed, robots.txt and more, using a friendly, easy-to-use Java 8 functional style of programming.

robots.txt

15 stars, 1 fork

🤖 robots.txt as a service. Crawls robots.txt files, downloads and parses them to check rules through an API.
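The parse-and-check step described above can be sketched with Python's standard `urllib.robotparser`. This is a minimal local illustration, not the service's API: the sample file, the `ExampleBot` agent, and the `is_allowed` helper are all assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content used only for this example.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /

User-agent: ExampleBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse a robots.txt body and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A service like the one described would presumably download the file first and expose a check such as `is_allowed(body, "ExampleBot", "https://example.com/page")` behind an HTTP endpoint.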

robotify-netcore

16 stars, 5 forks

Provides robots.txt middleware for .NET Core.