goatcounter
Handling of 404 and other status codes
Currently, a 404 on any URL shows up as a hit to /404.html. Is this because I have <link rel="canonical" href="https://rigtorp.se/404.html"/> in the HTML? It might be worthwhile to update the documentation to make clear that rel=canonical should be avoided on 404 pages.
Ideally it should be possible to view requests that returned an error status code, so that they can be monitored and properly redirected. It would be great to receive an email summary about 404s etc.
I'm really liking GoatCounter so far, but my primary use case is to monitor 404s so I can add redirects or have the referrer site update the link. I would be happy to help with the implementation work for handling HTTP status codes.
Yeah, it always uses the canonical link if it exists and is on the same domain. Setting a canonical URL on a 404 is arguably not really a good idea, as the resource you're at (e.g. /page.html) doesn't really "canonicalize" to /404.html.
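The rule described above can be sketched roughly like this. It is not the actual count.js code: the function name `resolvePath` and its two string arguments are made up for illustration, and it only shows the idea of "use the canonical link when it is on the same domain".

```javascript
// Hypothetical sketch of the "canonical link wins if same domain" rule.
// canonicalHref: href of <link rel="canonical">, or null if the page has none.
// pageUrl: full URL of the page being viewed.
function resolvePath(canonicalHref, pageUrl) {
  var page = new URL(pageUrl)
  if (!canonicalHref)
    return page.pathname + page.search

  // Passing pageUrl as base also resolves relative canonical hrefs.
  var canon = new URL(canonicalHref, pageUrl)
  if (canon.hostname !== page.hostname)
    return page.pathname + page.search

  return canon.pathname + canon.search
}
```

With a canonical link of https://rigtorp.se/404.html, any missing URL on rigtorp.se resolves to /404.html, which matches what shows up on the dashboard.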
Getting the status code isn't possible from JavaScript as far as I know. I looked at this last year, and as far as I can find the only way is to make a new XHR request and check the status code on that; I feel this is too hacky and unreliable to use. If you know a better way to do this then I'll be happy to add that.
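For reference, the "make a second request" workaround would look something like the sketch below. It uses `fetch` rather than XMLHttpRequest for brevity, the helper name `statusOf` is hypothetical, and the `fetchImpl` parameter exists only so the function can be tried outside a browser.

```javascript
// Hypothetical helper: re-request a URL and report the status code of that
// second request. This is NOT the status of the original page load.
async function statusOf(url, fetchImpl) {
  var doFetch = fetchImpl || fetch
  var res = await doFetch(url, {method: 'HEAD', cache: 'no-store'})
  return res.status
}
```

This costs an extra request per pageview, and the server may answer the second request differently from the original navigation, which is why it feels unreliable.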
My own solution for this at the moment is to leave the path alone and set the page title to 404; the "search" in the backend also looks at the title, so searching for 404 gives you what you want. You're already using 404 in your page title, and you should be able to skip using the canonical URL for /404.html with something like this (untested, but I think it should work):
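<script>
    window.goatcounter = {
        // Called with the path count.js would send by default.
        path: function(p) {
            // The canonical link turns every 404 into /404.html; send the real URL instead.
            if (p === '/404.html')
                return window.location.pathname + window.location.search
            return p
        }
    }
</script>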
<script data-goatcounter="https://rigtorp.goatcounter.com/count" async src="//gc.zgo.at/count.js"></script>
But like I said, I'm not sure if adding the canonical here is a good idea in the first place.
Once custom fields are added (#191) you should be able to add a status_code field, although that would still require some special code in the form of "if path == 404 then status_code = 404", as this isn't something you can get from JS AFAIK.
Yeah, I realized when looking into this that the rel=canonical on a 404 doesn't make sense; the template added that by default.
I think it makes sense to mention what you just said in the documentation. IMHO a key use case of GoatCounter is when you are hosting on say GitHub Pages / Netlify / etc and want to monitor 404s. It would be great to have some automatic handling of non 2xx, but if there is no API to get the status code, I guess 404 pages need to be manually tagged as "error pages".
I'm sure that custom fields will be useful to many people, but I think my use case of "monitor 404s for static hosted pages" is very common. It might be worth including how to handle 404s in the short setup guide.
What do you think about being able to get reports about 404s over email?
a key use case of GoatCounter is when you are hosting on say GitHub Pages / Netlify / etc and want to monitor 404s. It would be great to have some automatic handling of non 2xx [..] What do you think about being able to get reports about 404s over email?
Yeah, I fully agree, and I'd love to add a new chart for this and/or add them to email reports, the problem is just that there is no reliable way to get this information as far as I can tell. So it'll work for some but not for others, which is probably even more confusing. Something like Google Analytics also requires manual searching by page title to get this information, so I'll assume that if GA didn't find a solution for it, it unfortunately doesn't exist :-(
The plan to support this kind of reporting better is creating some kind of reporting interface where you can just say "display all pages where title like '404'", or "display all pages where field.status = 404", but that's kind of a long-term plan. I think for the time being, the above solution is "good enough" (or rather, "not bad enough to warrant writing code which will be removed in a few months' time to make it better in the intermediate period" 😅)
It might be worthy of including in the short setup guide how to handle 404s.
Yeah, I'll write a section on that :+1:
Right, so basically a 404 page needs a special <script> tag to indicate that it's a 404 or error page. The docs / FAQ could say: with GitHub Pages, add XXX to 404.html to monitor 404s; for Netlify, add this to that file; etc. That would be easy enough.
Right, so basically a 404 page needs a special <script> tag
Not necessarily, on my own page the 404 error page has <title>404</title> and the path is left intact, which allows me to filter by it quite nicely. This would work well in your case too if it didn't have the canonical link, which is what makes things a bit more complicated for your site.
But yeah, it should be documented a bit more explicitly.
Not necessarily, on my own page the 404 error page has <title>404</title> and the path is left intact, which allows me to filter by it quite nicely. This would work well in your case too if it didn't have the canonical link, which is what makes things a bit more complicated for your site.
Yeah, I'm fixing that right now. Why not add a link to that search somewhere on the dashboard, in addition to documenting this?
But what do you link to? Some people might have 404, but others might have Page not found, Four-oh-four, Pagina niet gevonden (Dutch), etc. This is what I meant before with "it'll work for some but not for others, which is probably even more confusing".
I could add a setting for this I guess, but you need to be careful with accidental matches and the like as well. I'd rather wait 1 or 2 months until there's a more generic solution.
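To make the concern about accidental matches concrete, here is a small sketch of what a configurable title match might look like. The list `errorTitles` and the function `looksLike404` are hypothetical; nothing like this exists in GoatCounter as far as this thread goes.

```javascript
// Hypothetical per-site list of error page titles.
var errorTitles = ['404', 'Page not found', 'Four-oh-four', 'Pagina niet gevonden']

// Case-insensitive substring match of a page title against the list.
function looksLike404(title, patterns) {
  var t = title.toLowerCase()
  return patterns.some(function (p) {
    return t.indexOf(p.toLowerCase()) !== -1
  })
}
```

A plain substring match like this catches "Pagina niet gevonden", but it also flags a regular blog post titled "Designing a good 404 page", which is the kind of false positive a setting would have to deal with.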
You could add to the docs that a 404 page needs to have 404 in the title. But yeah, that's why I was saying that you need a specific <script> tag for the 404 page that tags the page as an error.
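A 404-page-specific script along those lines might look like the sketch below. It assumes the `title` setting of count.js is available in the version in use, and the helper name `errorPageSettings` is made up for this example; treat it as untested.

```javascript
// Hypothetical helper for a 404 page: keep the path the visitor actually
// requested, and make sure the reported title contains "404" so it can be
// found with the dashboard search.
function errorPageSettings(loc, title) {
  return {
    path: function (p) {
      return loc.pathname + loc.search
    },
    title: /404/.test(title) ? title : '404 ' + title,
  }
}

// In a browser, this would go before the count.js <script> tag.
if (typeof window !== 'undefined')
  window.goatcounter = errorPageSettings(window.location, document.title)
```

This way the title check lives in one place on the error page, regardless of what the template puts in <title>.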