
[import] Proxy host cookie can lead to hitting the wrong host

Open kptdobe opened this issue 2 years ago • 1 comment

Let's say we import the page https://www.a.com/page.html. The page requested via the proxy is http://localhost:3001/page.html?host=https://www.a.com. This sets the hlx-proxyhost cookie value to https://www.a.com so that resources referenced on the https://www.a.com/page.html page (images, JS, CSS...) do not need the host query parameter, as long as their URLs do not contain a host name. For example, an image with src="/image.png" is served by the proxy from https://www.a.com without requiring the host query param, because the cookie contains the host.

Now suppose that during the import process we need a resource from another host, for example a JSON file at https://www.b.com/sheet.json. We can leverage the proxy and request http://localhost:3001/sheet.json?host=https://www.b.com, which avoids CORS issues. But this request resets the hlx-proxyhost cookie value to https://www.b.com and corrupts the import process: subsequent requests without the host query param are fetched from b.com instead of a.com.
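To make the failure concrete, here is a minimal sketch of the behavior described above. It is a simplified model, not the actual proxy code: `ProxyState` and `currentResolve` are hypothetical names, and the cookie is represented as a plain field.

```typescript
// Simplified model of the cookie handling described in this issue.
// NOT the real proxy implementation; all names here are hypothetical.
interface ProxyState {
  cookieHost?: string; // stands in for the hlx-proxyhost cookie
}

function currentResolve(
  path: string,
  hostParam: string | undefined,
  state: ProxyState,
): string {
  if (hostParam) {
    // Every request carrying ?host=... overwrites the cookie.
    state.cookieHost = hostParam;
  }
  const host = state.cookieHost;
  if (!host) {
    throw new Error('no host query parameter and no cookie');
  }
  return `${host}${path}`;
}

const state: ProxyState = {};
// 1. Import the page: the cookie becomes https://www.a.com
const page = currentResolve('/page.html', 'https://www.a.com', state);
// 2. Fetch a sheet from another host: the cookie becomes https://www.b.com
const sheet = currentResolve('/sheet.json', 'https://www.b.com', state);
// 3. A relative resource of the page is now resolved against b.com
const image = currentResolve('/image.png', undefined, state);
```

After step 2, `image` resolves against https://www.b.com even though it belongs to the page imported from https://www.a.com.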

To solve that problem, we need to change how the cookie is set or consumed:

  • we want the cookie to be set to a.com so that we do not have to rewrite every resource url / href / src to append the host query parameter
  • if we need to hit a different host, we can specify the host query param to bypass the cookie value, but this must NOT override the cookie value
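The two requirements above could be sketched roughly as follows. This is only an illustration under assumptions: `resolveHost`, `HostResolution` and the `isImportedPage` flag are hypothetical, and how the proxy would actually know that a request is the imported page itself is left open.

```typescript
// Hypothetical sketch of the proposed rule; not an existing proxy API.
interface HostResolution {
  upstreamHost: string; // host the proxy should fetch from
  setCookie: boolean;   // whether to (re)write the hlx-proxyhost cookie
}

function resolveHost(
  hostParam: string | undefined,
  cookieHost: string | undefined,
  isImportedPage: boolean, // assumption: true only for the page being imported
): HostResolution {
  if (hostParam) {
    // The query param always wins for this request, but it only
    // updates the cookie when the request is the imported page.
    return { upstreamHost: hostParam, setCookie: isImportedPage };
  }
  if (cookieHost) {
    // No query param: fall back to the cookie and leave it untouched.
    return { upstreamHost: cookieHost, setCookie: false };
  }
  throw new Error('no host query parameter and no hlx-proxyhost cookie');
}

// Importing the page sets the cookie to a.com.
const pageRes = resolveHost('https://www.a.com', undefined, true);
// Fetching the sheet hits b.com without touching the cookie.
const sheetRes = resolveHost('https://www.b.com', 'https://www.a.com', false);
// Relative resources keep resolving against a.com.
const imageRes = resolveHost(undefined, 'https://www.a.com', false);
```

With this rule, a side request to b.com uses the query param for that one request only, so relative resources of the imported page keep resolving against a.com.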

As soon as we start dealing with different hosts, I think we need https://github.com/adobe/helix-cli/issues/2072.

cc @mhaack

kptdobe, May 03 '23 12:05

Can we set the cookie based on the URL of the imported page itself, at the time you click import? That way we should not run into any race conditions.

mhaack, May 04 '23 07:05