go-neb
Some valid atom and rss feeds are not recognized
This atom feed is valid:
https://validator.w3.org/feed/check.cgi?url=http%3A%2F%2Fwww.qdep.org%2Ffeed%2Fatom%2F
but returns this error:
HTTP 500: Failed to register service: Failed to read URL http://www.qdep.org/feed/atom/: Failed to detect feed type
This rss feed seems to also be valid:
https://validator.w3.org/feed/check.cgi?url=http%3A%2F%2Fwww.qdep.org%2Ffeed%2F
but returns the same error.
We use https://github.com/mmcdole/gofeed to parse Atom/RSS feeds. I'll bring it up with them.
I discussed this in more detail in https://github.com/mmcdole/gofeed/issues/75, but I wanted to circle back here.
It appears that the server in question is behind an Incapsula WAF, and its security settings block both `curl` requests and requests made by Go's `http.Client` (with its default settings and User-Agent).
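For anyone who wants to see the mechanism in isolation, here is a rough sketch using only the standard library. It assumes the simplest possible form of User-Agent sniffing: `newSniffingServer` is a hypothetical local stand-in that rejects Go's default `Go-http-client/...` agent, and `fetchStatus` and the `example-feed-reader/1.0` agent string are made up for illustration. A real WAF such as Incapsula very likely looks at more signals than this, so a custom User-Agent alone may not be enough.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// newSniffingServer is a hypothetical stand-in for a site that sniffs the
// User-Agent: it rejects Go's default agent with 403 and otherwise serves a
// tiny Atom document.
func newSniffingServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if strings.HasPrefix(r.UserAgent(), "Go-http-client") {
			http.Error(w, "blocked", http.StatusForbidden)
			return
		}
		w.Header().Set("Content-Type", "application/atom+xml")
		fmt.Fprint(w, `<feed xmlns="http://www.w3.org/2005/Atom"><title>demo</title></feed>`)
	}))
}

// fetchStatus issues a GET and returns the HTTP status code. An empty
// userAgent leaves the header unset, so net/http sends its default value.
func fetchStatus(url, userAgent string) (int, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return 0, err
	}
	if userAgent != "" {
		req.Header.Set("User-Agent", userAgent)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	srv := newSniffingServer()
	defer srv.Close()

	// First request uses the default agent, second uses a custom one.
	for _, ua := range []string{"", "example-feed-reader/1.0"} {
		code, err := fetchStatus(srv.URL, ua)
		if err != nil {
			fmt.Println("request failed:", err)
			continue
		}
		fmt.Printf("User-Agent %q -> HTTP %d\n", ua, code)
	}
}
```

Against this toy server the default agent is refused and the custom one is accepted. When the fetch is refused, the feed parser only ever sees the block page rather than XML, which would be consistent with a "Failed to detect feed type" error.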
As @mmcdole states, it looks like this site is doing User-Agent sniffing and rejecting bots. We already set our own User-Agent, so I don't think there's much we can do about this.
Thanks for the update. Would this also be the reason that a Feedburner version of the feed gets rejected? At first the feed is accepted as valid, but when I check back on it later, it has a red error icon next to it: https://feeds.feedburner.com/QueerDetaineeEmpowermentProject
This also happens if you specify the feed type explicitly, such as https://feeds.feedburner.com/QueerDetaineeEmpowermentProject?format=atom
Also doesn't work for:
https://github.com/matrix-org/go-neb/commits/master.atom