nrobots doesn't handle entries like /?/ properly

Open sirjimjones opened this issue 9 years ago • 2 comments

nrobots doesn't handle entries like "Disallow: /?/" properly.

sirjimjones, Mar 23 '15 10:03

It looks like NRobots converts "Disallow: /?/" into "Disallow: /", which disallows the entire site. The last check-in to that lib fixed a similar issue, but this entry clearly still causes problems.

sjdirect, Mar 24 '15 18:03
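
To illustrate why turning "Disallow: /?/" into "Disallow: /" matters, here is a minimal Python sketch of literal prefix matching for Disallow rules. It is not NRobots code, and the thread does not say why the rule gets shortened. The `truncate_at_query` helper is a hypothetical normalization step, assumed here only to reproduce the symptom described above; `is_disallowed` is likewise an illustrative name.

```python
def is_disallowed(path_and_query: str, disallow_rules: list[str]) -> bool:
    """Return True if any non-empty Disallow rule is a literal prefix of the path."""
    return any(bool(rule) and path_and_query.startswith(rule) for rule in disallow_rules)


def truncate_at_query(rule: str) -> str:
    """Hypothetical faulty normalization (an assumption, not NRobots' actual logic):
    drop everything from the first '?' onward."""
    return rule.split("?", 1)[0]


rules = ["/?/"]

# Treating '?' as a literal character: only paths starting with "/?/" are blocked.
root_blocked_literal = is_disallowed("/", rules)
special_blocked_literal = is_disallowed("/?/page", rules)

# After the hypothetical truncation the rule becomes "/", which matches every path.
truncated_rules = [truncate_at_query(r) for r in rules]
root_blocked_truncated = is_disallowed("/", truncated_rules)
```

With literal matching the root stays crawlable and only "/?/..." paths are blocked; once the rule collapses to "/", the root and everything under it are blocked.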

Even though this bug still exists in the nrobots lib, abot now provides a workaround for this and similar cases where robots.txt prevents the crawl. See https://github.com/sjdirect/abot/commit/9bd3d7d91ebefb6e03ee2c2a1b5140cc4020073c for details. There is now an isIgnoreRobotsDotTextIfRootDisallowedEnabled config value that, when set to true, makes the crawler ignore the robots.txt file if the root URI of the crawl is disallowed.

sjdirect, Mar 31 '15 07:03
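
The decision the workaround describes can be sketched roughly as follows. This is a Python illustration of the behaviour stated in the comment above, not abot's actual implementation (see the linked commit for that). `CrawlSettings`, its field names, and `should_apply_robots` are hypothetical stand-ins; only the idea behind isIgnoreRobotsDotTextIfRootDisallowedEnabled comes from the thread.

```python
from dataclasses import dataclass


@dataclass
class CrawlSettings:
    # Hypothetical stand-ins for crawler config values; names are illustrative.
    respect_robots: bool = True
    ignore_robots_if_root_disallowed: bool = False


def should_apply_robots(settings: CrawlSettings, root_is_disallowed: bool) -> bool:
    """Decide whether robots.txt rules should be enforced for this crawl."""
    if not settings.respect_robots:
        return False
    # Workaround: if robots.txt blocks the crawl's root URI and the flag is on,
    # skip robots.txt entirely so the crawl can proceed.
    if root_is_disallowed and settings.ignore_robots_if_root_disallowed:
        return False
    return True


default_settings = CrawlSettings()
workaround_settings = CrawlSettings(ignore_robots_if_root_disallowed=True)

blocked_by_default = should_apply_robots(default_settings, root_is_disallowed=True)
blocked_with_workaround = should_apply_robots(workaround_settings, root_is_disallowed=True)
```

Note that in this sketch the flag only has an effect when the root itself is disallowed; a robots.txt that blocks other paths is still honoured.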