Reid Hewitt
added new relic link crawler [here](https://one.newrelic.com/nr1-core/synthetics/monitor-overview/MTYwMTM2N3xTWU5USHxNT05JVE9SfDQ3MzQ4MDhlLWUyOTItNDExMi1iMzRmLTEzMzU2ODM5MjZjMQ?account=1601367&begin=1701185019790&end=1701186819790&state=17cfe8c8-7fc3-30d6-0316-87293a9325fc)
clicking a point in a location graph navigates to the list of links tested. the set of links tested differs between htmlproofer and new relic. htmlproofer may be traversing more...
notes on new relic link checker:
- so far it seems like there's no way to filter/ignore status codes
- apparently a variety of status codes are monitored and should...
- I assume it would be able to alert us to 404s, but surprisingly none have occurred for resource yet in any of the 6 locations in the monitor.
- ...
after upgrading `htmlproofer` from 3.x to 5.x to potentially address some issues, the resources site produces 284 failures. this includes checks on links, images, scripts, and html validation. this is...
summary of failures using htmlproofer with the following flags: `--ignore-status-codes "301,302,401,403,429" --checks='Links,Images,Scripts,Html' --no-check-external-hash --no-check-internal-hash --no-enforce-https`
- datagov-11ty - 216 failures
- resources.data.gov - 284 failures
- data-strategy - 550 failures...
pausing work on this until group discussion on how we want to proceed.
htmlproofer offers a `--only-4xx` flag
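
a rough sketch of how the status-code filtering could be done on our side if a checker can't do it for us: take a list of (url, status) results, drop the ignored codes used in the htmlproofer runs, and optionally keep only 4xx responses. everything here (the `LinkResult` type, `filter_failures`, and the sample URLs) is hypothetical and for illustration only; it is not new relic's or htmlproofer's API or output format.

```python
from typing import Iterable, List, NamedTuple


class LinkResult(NamedTuple):
    """Hypothetical record for one checked link (not a real tool's format)."""
    url: str
    status: int


# Same codes passed to --ignore-status-codes in the htmlproofer runs.
IGNORED_CODES = frozenset({301, 302, 401, 403, 429})


def filter_failures(
    results: Iterable[LinkResult],
    ignored: frozenset = IGNORED_CODES,
    only_4xx: bool = False,
) -> List[LinkResult]:
    """Return results that count as failures after applying the filters."""
    failures = []
    for result in results:
        if result.status in ignored:
            continue  # explicitly ignored status code
        if only_4xx:
            is_failure = 400 <= result.status < 500
        else:
            is_failure = result.status >= 400
        if is_failure:
            failures.append(result)
    return failures


# Made-up sample data, only to show the behavior.
sample = [
    LinkResult("https://example.com/a", 200),
    LinkResult("https://example.com/b", 301),
    LinkResult("https://example.com/c", 404),
    LinkResult("https://example.com/d", 500),
    LinkResult("https://example.com/e", 403),
]

all_failures = filter_failures(sample)
client_errors = filter_failures(sample, only_4xx=True)
print([r.url for r in all_failures])
print([r.url for r in client_errors])
```

with the sample above, the default call keeps the 404 and the 500, and `only_4xx=True` keeps just the 404.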
here are the errors for the 4 static sites. this is the raw data from the terminal, so if it would be better for me to format them, let me know. I used these flags...
[railway branch](https://github.com/GSA/datagov-harvesting-logic/tree/feature/railway). notes are in `harvester/utils/railway.py` and `harvester/extract/__init__.py`