Panel for Missed - Top 50 Urls

Open mandys opened this issue 2 years ago • 7 comments

Hi,

In version 1.5.3, you enhanced goaccess to report hits/misses when parsing CloudFront logs. This is very helpful.

But our custom scripts (which we used before we started using goaccess for almost everything :-) ) used to present the top 50 missed URLs.

They were very helpful, since we could then figure out whether caching on some object was not set up properly on our S3 bucket/website, which would explain why we were getting so many missed requests.

Is it possible to create another panel for missed URLs?

Please let me know.

Thanks.

mandys avatar Dec 09 '21 11:12 mandys

Hello, are you referring to 404s?

allinurl avatar Dec 10 '21 01:12 allinurl

No, I am not talking about 404s.

So, CloudFront is a CDN that helps serve your files quickly from caches around the world.

But it uses the headers of the files from the source.

E.g., we upload files to an S3 bucket. CloudFront sources files from the S3 bucket. But if the files were not uploaded with correct caching headers (like Expires), then CloudFront will serve the files without those headers.

That means every time users request those files, the request will be a MISS from CloudFront and not a HIT.

Seeing the missed URLs helps us identify whether we set the headers incorrectly for some files on S3 or any other source (could be our web servers).

Record for Miss

2021-12-08 11:31:19 FRA60-P1 34768 88.198.215.158 GET dummy.cloudfront.net /img/dummy/emails/datessec.jpg 200 - Mozilla/5.0%20(Windows%20NT;%20Trident/7.0;%20rv:11.0)%20like%20Gecko - - Miss KNOPAlj89CNZ4pwyNP52Vs6ISFE_5qimTgkJZozEOdRMDB_173_GaQ== dummy.cloudfront.net https 278 0.798 - TLSv1.3 TLS_AES_128_GCM_SHA256 Miss HTTP/1.1 - - 44227 0.798 Miss image/jpeg 34137 - -

Record for Hit

2021-12-08 11:31:23 FRA60-P1 34625 88.198.215.158 GET dummy.cloudfront.net /img/dummy/emails/datessec.jpg 200 https://outlook.live.com/ Mozilla/5.0%20(Windows%20NT%206.3;%20Win64;%20x64;%20rv:91.0)%20Gecko/20100101%20Firefox/91.0 - - Hit iSRLb84Ai1VGd1mxqiRKcMTvGYUBLJKgxNyJ_GzlejhOmWXGEDzOrA== dummy.cloudfront.net https 255 0.003 - TLSv1.3 TLS_AES_128_GCM_SHA256 Hit HTTP/2.0 - - 49915 0.003 Hit image/jpeg 34137 - -
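
For reference, here is a minimal sketch of the kind of "top 50 missed URLs" report described above. It is not our original script and not anything goaccess provides. It assumes the CloudFront standard log layout seen in the two records above, where the 8th whitespace-separated field is the URL path and the 14th is the cache result (Hit/Miss). The function name `top_missed_urls` and the two index constants are made up for illustration.

```python
from collections import Counter

# Zero-based field positions, assumed from the sample records above
# (CloudFront standard log layout).
URI_STEM = 7      # URL path, e.g. /img/dummy/emails/datessec.jpg
RESULT_TYPE = 13  # cache result, e.g. Hit or Miss


def top_missed_urls(lines, limit=50):
    """Return up to `limit` (url, count) pairs for requests logged as Miss."""
    misses = Counter()
    for line in lines:
        # Skip blank lines and the '#Version' / '#Fields' header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        # Ignore truncated records that do not reach the result field.
        if len(fields) <= RESULT_TYPE:
            continue
        # Compare case-insensitively so "Miss" and "MISS" are both counted.
        if fields[RESULT_TYPE].lower() == "miss":
            misses[fields[URI_STEM]] += 1
    return misses.most_common(limit)
```

Calling `top_missed_urls(open("cloudfront.log"))` (the file name is a placeholder) would return the most frequently missed paths first. Only the exact status "Miss" is counted here; other result types are ignored.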

mandys avatar Dec 10 '21 03:12 mandys

Hey @allinurl - any thoughts on this one ?

mandys avatar Dec 14 '21 08:12 mandys

I think this is going to be addressed easily by the #117 request. Since we already have a Cache Status panel, you should be able to filter by misses/hits/etc. and it will show those URLs. I'm working on that as we speak.

By the way, have you looked at the current Cache Status panel? Please take a look at the man page for more info.

allinurl avatar Dec 15 '21 00:12 allinurl

Are you referring to the attached image ? If yes, then I have been using it.

I think I pointed this out to you in a previous bug https://github.com/allinurl/goaccess/issues/1830

The option was there, but it was case-sensitive and hence not working; that was later fixed in a new release.

[Attached image: Screen Shot 2021-12-15 at 6 26 10 PM]

mandys avatar Dec 15 '21 12:12 mandys

Got it. So just to sum up, the Cache Status panel is working fine, but you are now looking to filter by cache status, e.g., hit, miss, etc.?

allinurl avatar Dec 21 '21 04:12 allinurl

That is correct. But, more than hits, we want to filter by "miss". That tells us which URLs don't have proper caching headers.

Hits are hits :-)

mandys avatar Dec 21 '21 06:12 mandys