incubator-pagespeed-ngx icon indicating copy to clipboard operation
incubator-pagespeed-ngx copied to clipboard

ModifyCachingHeaders changes headers for all resources

Open injust opened this issue 8 years ago • 9 comments

After reading https://developers.google.com/speed/pagespeed/module/configuration#ModifyCachingHeaders, I felt that it was worth retaining the ETag and Last-Modified headers that PageSpeed strips by default.

pagespeed ModifyCachingHeaders off; would leave the original caching headers served by nginx for HTML resources, but still rewrite the caching headers for other PageSpeed optimized resources, e.g. CSS and JS.

When used, for HTML resources, the Cache-Control: max-age=0, no-cache header was removed and the ETag was added back, which is as described. However, it also seems to affect other resources, so the caching headers that PageSpeed applies for those resources are removed as well.

Is this unintended behaviour, or am I interpreting the documentation incorrectly?

injust avatar Nov 14 '16 03:11 injust

The scope of 'ModifyCachingHeaders' is HTML; it is not intended to affect resources. Are you saying it does affect resources? That is certainly not the intent.

You don't need the original etag on .pagespeed. URLs because the content-hash is in the URL, and the cache liftetime is 1 year. A new etag is placed on .pagespeed. URLs because old browsers (IE8 I think) wouldn't cache image requests without it.

Does this clarify things? If you've found a resource for which this is not the case, please give a more detailed example & re-open. Thanks!

jmarantz avatar Nov 14 '16 12:11 jmarantz

nginx configuration

error_log /var/log/nginx/error.log;
pid /var/run/nginx.pid;
user www-data;

events {}

http {
    access_log /var/log/nginx/access.log;
    default_type application/octet-stream;
    include /etc/nginx/mime.types;
    index index.html index.php;
    ssl_certificate /etc/nginx/ecdsa/fullchain.pem;
    ssl_certificate_key /etc/nginx/ecdsa/privkey.pem;

    pagespeed FileCachePath /var/cache/pagespeed;
    #pagespeed ModifyCachingHeaders off;
    #pagespeed ShardDomain https://{domain}.ml https://{domain}.ga;
    pagespeed on;

    server {
        listen 443 ssl spdy http2;
        listen [::]:443 ssl spdy http2;
        root /var/www/asm;
        server_name {domain}.ga {domain}.ml;

        location ~ "\.pagespeed\.([a-z]\.)?[a-z]{2}\.[^.]{10}\.[^.]+" {
            add_header "" "";
        }
        location ~ ^/pagespeed_static/ {}
        location ~ ^/ngx_pagespeed_beacon$ {}

        location ~ \.php$ {
            include /etc/nginx/fastcgi.conf;
        }
    }
}

Without ModifyCachingHeaders

mod_pagespeed on
Filters:
ah  Add Head
cc  Combine Css
jc  Combine Javascript
gp  Convert Gif to Png
jp  Convert Jpeg to Progressive
jw  Convert Jpeg To Webp
mc  Convert Meta Tags
pj  Convert Png to Jpeg
ws  When converting images to WebP, prefer lossless conversions
db  Debug
ec  Cache Extend Css
ei  Cache Extend Images
es  Cache Extend Scripts
fc  Fallback Rewrite Css 
if  Flatten CSS Imports
hw  Flushes html
ci  Inline Css
ii  Inline Images
il  Inline @import to Link
ji  Inline Javascript
js  Jpeg Subsampling
rj  Recompress Jpeg
rp  Recompress Png
rw  Recompress Webp
ri  Resize Images
cf  Rewrite Css
jm  Rewrite External Javascript
jj  Rewrite Inline Javascript
cu  Rewrite Style Attributes With Url
cp  Strip Image Color Profiles
md  Strip Image Meta Data

Options:
AvoidRenamingIntrospectiveJavascript (aris) True
EnableRewriting (e) 1
FileCacheInodeLimit (afcl) 500000
RewriteLevel (l) Core Filters

#NumFlushes            0
#EndDocument after     888us
#Total Parse duration  791us
#Total Render duration 1032us
#Total Idle duration   97us
No critical images detected.
The following filters were disabled for this request:
    SupportNoscript

Headers for CSS resource:

cache-control:max-age=31536000
content-encoding:gzip
content-type:text/css
date:Mon, 14 Nov 2016 15:40:14 GMT
etag:W/"0"
expires:Tue, 14 Nov 2017 15:40:14 GMT
last-modified:Mon, 14 Nov 2016 15:40:14 GMT
server:nginx/1.11.5
status:200
vary:Accept-Encoding
vary:Accept-Encoding
x-original-content-length:93603
x-page-speed:1.11.33.4-0

With ModifyCachingHeaders

mod_pagespeed on
Filters:
ah  Add Head
cc  Combine Css
jc  Combine Javascript
gp  Convert Gif to Png
jp  Convert Jpeg to Progressive
jw  Convert Jpeg To Webp
mc  Convert Meta Tags
pj  Convert Png to Jpeg
ws  When converting images to WebP, prefer lossless conversions
db  Debug
ec  Cache Extend Css
ei  Cache Extend Images
es  Cache Extend Scripts
fc  Fallback Rewrite Css 
if  Flatten CSS Imports
hw  Flushes html
ci  Inline Css
ii  Inline Images
il  Inline @import to Link
ji  Inline Javascript
js  Jpeg Subsampling
rj  Recompress Jpeg
rp  Recompress Png
rw  Recompress Webp
ri  Resize Images
cf  Rewrite Css
jm  Rewrite External Javascript
jj  Rewrite Inline Javascript
cu  Rewrite Style Attributes With Url
cp  Strip Image Color Profiles
md  Strip Image Meta Data

Options:
AvoidRenamingIntrospectiveJavascript (aris) True
EnableRewriting (e) 1
FileCacheInodeLimit (afcl) 500000
ModifyCachingHeaders (mch) False
RewriteLevel (l) Core Filters

#NumFlushes            0
#EndDocument after     3026us
#Total Parse duration  2833us
#Total Render duration 2087us
#Total Idle duration   193us
No critical images detected.
The following filters were disabled for this request:
    SupportNoscript

Headers for CSS resource:

content-encoding:gzip
content-type:text/css
date:Mon, 14 Nov 2016 15:41:57 GMT
server:nginx/1.11.5
status:200
vary:Accept-Encoding
vary:Accept-Encoding
x-original-content-length:93603
x-page-speed:1.11.33.4-0

HTML source does not change, both give:

<link rel="stylesheet" href="/res/css/A.fontello.css+paper.css+scrollbar.css+style.css,Mcc.pyRZM-96yH.css.pagespeed.cf.yyQUpjEnFC.css"><!--CSS not inlined since it&#39;s bigger than 2048 bytes--><!--Uncacheable content, preventing rewriting of https://{domain}.ml/res/img/logo.svg--><!--Uncacheable content, preventing rewriting of https://{domain}.ml/res/img/logo.svg-->

injust avatar Nov 14 '16 15:11 injust

Oh that should not happen ... assigning.

I think this must be nginx-specific.

jmarantz avatar Nov 14 '16 18:11 jmarantz

To confirm, when fetching /res/css/A.fontello.css+paper.css+scrollbar.css+style.css,Mcc.pyRZM-96yH.css.pagespeed.cf.yyQUpjEnFC.css, turning off ModifyCachingHeaders removes Cache-Control, ETag, Expires, and Last-Modified?

jeffkaufman avatar Nov 14 '16 18:11 jeffkaufman

Reproduced. I added a stanza to my test config:

    server {
        server_name test.example.com;
        root   /var/www-test;

        listen 80;
        pagespeed on;
        #pagespeed ModifyCachingHeaders off;
    }

Ran:

http_proxy=www.jefftk.com:80 headers test.example.com/A.test.css,qa=b.pagespeed.cf.X13vJaX7nm.css

Got the headers:

HTTP/1.1 200 OK
Content-Type: text/css
Connection: keep-alive
Server: nginx/1.11.4
Accept-Ranges: bytes
Date: Mon, 14 Nov 2016 18:33:43 GMT
Expires: Tue, 14 Nov 2017 18:33:43 GMT
Cache-Control: max-age=31536000
ETag: W/"0"
Last-Modified: Mon, 14 Nov 2016 18:33:43 GMT
X-Original-Content-Length: 317324
Vary: Accept-Encoding
Vary: Accept-Encoding
Content-Length: 317324
X-Page-Speed: 1.11.33.4-0

Then uncommented the pagespeed ModifyCachingHeaders off and got:

HTTP/1.1 200 OK
Date: Mon, 14 Nov 2016 18:34:14 GMT
Content-Type: text/css
Connection: keep-alive
Server: nginx/1.11.4
Accept-Ranges: bytes
X-Original-Content-Length: 317324
Vary: Accept-Encoding
Vary: Accept-Encoding
Content-Length: 317324
X-Page-Speed: 1.11.33.4-0

jeffkaufman avatar Nov 14 '16 18:11 jeffkaufman

Is there any update on that issue? We are still experiencing it.

stufan avatar Jun 17 '18 20:06 stufan

@stufan No updates, this issue is tracking status.

If someone would like to fix this one, taking a look at what happens when ctx->preserve_caching_headers == kDontPreserveHeaders in https://github.com/apache/incubator-pagespeed-ngx/blob/master/src/ngx_pagespeed.cc is probably key to solving this.

oschaaf avatar Jun 17 '18 20:06 oschaaf

I dig a bit into the code and I came to some findings but I am not quite sure how good are they to be applied. Basically if I comment

    if (preserve_caching_headers == kPreserveAllCachingHeaders) {
      if (StringCaseEqual(name_gs, "ETag") ||
          StringCaseEqual(name_gs, "Expires") ||
          StringCaseEqual(name_gs, "Date") ||
          StringCaseEqual(name_gs, "Last-Modified") ||
          StringCaseEqual(name_gs, "Cache-Control")) {
        continue;
      }
    } else if (preserve_caching_headers == kPreserveOnlyCacheControl) {
      // Retain the original Cache-Control header, but send the recomputed
      // values for all other cache-related headers.
      if (StringCaseEqual(name_gs, "Cache-Control")) {
        continue;
      }
    }  // else we don't preserve any headers.

in copy_response_headers_to_ngx , then caching headers appear for pagespeed resources appear together with the HTML original cache control headers. It looks like when preserve_caching_headers == kPreserveAllCachingHeaders or preserve_caching_headers == kPreserveOnlyCacheControl the caching headers generated by Pagespeed (for Pagespeed resources) are skipped / not added to the response.

Also it seems when ModifyCachingHeaders = off it is set for all resources and not HTML only according to this code in ps_resource_handler:

    if (!options->modify_caching_headers()) {
      ctx->preserve_caching_headers = kPreserveAllCachingHeaders;
    } else if (!options->IsDownstreamCacheIntegrationEnabled()) {
      // Downstream cache integration is not enabled. Disable original
      // Cache-Control headers.
      ctx->preserve_caching_headers = kDontPreserveHeaders;
    } else if (!pagespeed_resource && !is_an_admin_handler) {
      ctx->preserve_caching_headers = kPreserveOnlyCacheControl;
      // Downstream cache integration is enabled. If a rebeaconing key has been
      // configured and there is a ShouldBeacon header with the correct key,
      // disable original Cache-Control headers so that the instrumented page is
      // served out with no-cache.
      StringPiece should_beacon(request_headers->Lookup1(kPsaShouldBeacon));
      if (options->MatchesDownstreamCacheRebeaconingKey(should_beacon)) {
        ctx->preserve_caching_headers = kDontPreserveHeaders;
      }
    }

which doesn't look quite right. Is there anyway to understand that the request is only for an HTML resource and not something else?

stufan avatar Jun 18 '18 14:06 stufan

Is there any recent option that helps resolve this?

XVII avatar Jun 01 '20 14:06 XVII