incubator-pagespeed-mod

URL sent as a parameter to the CriticalImages.Run seems to be incorrect when Pagespeed is running behind proxy

Open tinodj opened this issue 5 years ago • 6 comments

Hi,

We have set up PageSpeed to run behind an Apache proxy using ProxyPass and ProxyPassReverse.

Everything works fine, but we noticed that when ModPagespeedCriticalImagesBeaconEnabled is enabled, the critical images beacon JavaScript (CriticalImages.Run) is served with a URL pointing to the hidden, proxied server.

For example, example.com is our main domain and internal.example.com is the server masked behind ProxyPass/ProxyPassReverse.

PageSpeed, running on internal.example.com, then emits a URL (the html_url parameter in the code linked below) that points to internal.example.com.

I am not sure whether or how this can be resolved, whether it needs a new config option or an existing one already covers it (I didn't find one, and from the code itself I can't see that one exists). Any help would be appreciated, including confirming this as a bug.

https://github.com/apache/incubator-pagespeed-mod/blob/b0edead4c1e7c834f68ceb66dbfb478533a5c8af/net/instaweb/rewriter/critical_images_beacon_filter.cc#L119

tinodj avatar Oct 29 '20 00:10 tinodj

Hi. Have you tried MapRewriteDomain? I'm not sure it will work, because that directive is for changing the domain in the HTML code. You can also try this: ModPagespeedBeaconUrl "http://my.other.server/my_beacon" https://www.modpagespeed.com/doc/filter-instrumentation-add#beacon_url

Lofesa avatar Oct 29 '20 09:10 Lofesa

Hi @Lofesa, thank you for your reply.

Yes, I've tried MapRewriteDomain, but as you said, it only rewrites URLs in the HTML, so it didn't help.

Regarding ModPagespeedBeaconUrl, I haven't tried it, but from the code I can see it is used at line 115 but not at line 119. So I don't believe it would help, but I will give it a try.

tinodj avatar Oct 29 '20 14:10 tinodj

Ummm, it seems it is only for the add_instrumentation filter... What server name are you using in the vhost? Why can't you use example.com in the Apache vhost?

Lofesa avatar Oct 29 '20 16:10 Lofesa

I am using internal.example.com in the vhost. It is hosted on another server and is included under example.com/internal using the ProxyPass/ProxyPassReverse directives. I gave it some thought, and maybe you are right: I can fake the URL in the vhost, but the /internal prefix might still cause trouble. To fix that as well, I would need the origin server to serve the content from the same path, /internal. Yes, I think you have a point; this might be a workable workaround.

Still, maybe the URL shouldn't be passed as a parameter in the JS, but rather be taken from the current location of the page where the JS runs. Isn't that easier and cleaner? Or am I missing something?

Not sure what you mean by "is only for the add_instrumentation filter...", since the code here is exactly the same: https://github.com/apache/incubator-pagespeed-mod/blob/b0edead4c1e7c834f68ceb66dbfb478533a5c8af/net/instaweb/rewriter/add_instrumentation_filter.cc#L224

tinodj avatar Oct 29 '20 20:10 tinodj

> Not sure what you mean by "is only for the add_instrumentation filter...", since the code here is exactly the same:

I think that because in the docs ModPagespeedBeaconUrl appears only under the add_instrumentation filter, not under critical images or critical CSS.

https://github.com/apache/incubator-pagespeed-mod/blob/b0edead4c1e7c834f68ceb66dbfb478533a5c8af/net/instaweb/rewriter/add_instrumentation_filter.cc#L224

Now I get what the issue is... I think. It is not the URL of the beacon itself but the URL that the beacon carries in its payload, right? Something like:

https://example.com/ngx_pagespeed_beacon?url=https%3A%2F%2Finternal.example.com%2F
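To make the distinction concrete, here is a minimal Python sketch that splits the example URL above into the beacon endpoint and the page URL carried in its `url` parameter. It only decodes the illustrative URL from this comment; it makes no claim about how the real beacon request is transmitted.

```python
from urllib.parse import urlsplit, parse_qs

# Example beacon URL copied from the comment above (illustrative only).
beacon = "https://example.com/ngx_pagespeed_beacon?url=https%3A%2F%2Finternal.example.com%2F"

parts = urlsplit(beacon)

# parse_qs percent-decodes the values, so this is the page URL in the payload.
payload_url = parse_qs(parts.query)["url"][0]

beacon_host = parts.hostname                    # host that receives the beacon
payload_host = urlsplit(payload_url).hostname   # host named inside the payload

print(beacon_host, payload_host, payload_url)
```

The beacon endpoint is on the public host, while the payload names the proxied host, which is the mismatch being discussed.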

If that is the issue, I think it is right to have internal.example.com in this URL, because in the PageSpeed cache, optimized resources are stored with this URL as part of the cache key. In this flow:

user request (https://example.com/) --> proxy (https://internal.example.com) --> origin server

the beacon needs to update the PageSpeed cache info for that page (https://internal.example.com) and knows nothing about https://example.com/.

But maybe I'm wrong here; maybe @jmarantz can say a few words about this.
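A toy Python model of this argument, assuming (as suggested here, not confirmed) that the per-page data is keyed by the URL the origin server saw. The dict, the `record_beacon` helper, and the `/logo.png` path are all hypothetical and are not PageSpeed's actual cache or API.

```python
# Toy model only: a dict keyed by page URL stands in for the cache.
# The real PageSpeed cache layout is not shown in this thread.
cache = {"https://internal.example.com/": {"critical_images": []}}


def record_beacon(page_url, images):
    """Store beacon results if page_url matches an existing cache key."""
    entry = cache.get(page_url)
    if entry is None:
        return False  # no entry under this URL, so the beacon data is dropped
    entry["critical_images"] = list(images)
    return True


# A beacon reporting the internal URL finds the entry; the public URL does not.
hit = record_beacon("https://internal.example.com/", ["/logo.png"])
miss = record_beacon("https://example.com/", ["/logo.png"])
print(hit, miss)
```

Under that assumption, a beacon that reported the public URL would not match the stored entry, which is why the internal URL in the payload may be intentional.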

Lofesa avatar Oct 30 '20 11:10 Lofesa

Hi, yes, you are right. It is about the payload, the html_url parameter in the code. The beacon URL itself is fine.

tinodj avatar Oct 30 '20 11:10 tinodj