press icon indicating copy to clipboard operation
press copied to clipboard

Multiple servers and CDN support

Open rore opened this issue 13 years ago • 23 comments

I'm working on adjusting press to a server farm scenario (multiple servers - currently press doesn't really support that well). While working on it I encountered 2 small bugs:

  1. A potential null pointer exception in hasPressHeader if the file is empty
  2. A bug that prevented renaming the tmp file on windows (you need to close the writer before renaming).

rore avatar Sep 25 '11 20:09 rore

Hey, thanks for your interest in the plugin.

I've been trying to track down that rename bug for a while now, it seems so obvious now that you've pointed it out. Thanks!

I had a few questions about your pull request:

  • The way that press works at the moment it assumes that either you're using a single server with press files stored on disk, or you're using multiple servers with press files stored in the cache. Would it be possible for you guys to use the second storage strategy (ie files stored in the cache)?
  • I noticed that the eTag uses the hashcode of the file. Would it be possible instead to just use the cache key?
  • It looks like there's an extra directory parameter added in a few places. Could you explain what this parameter does?

These are just some little issues I noticed:

  • When generating the script/css tag, it looks like you're checking for whether contentHostingDomain is null, but as far as I can tell that should never happen
  • I think the libs you've included are aleady included by default in play
  • I believe a previous contributer made some changes that messed up the white space, if you could replace all tabs with four spaces I think that should clean those up.

Thanks again for you interest Dirk

dirkmc avatar Sep 26 '11 14:09 dirkmc

Hi,

OK, I didn't think all my pushes will be joined automatically to the pull request... There are some things there (those libs) that where not supposed to be part of the pull, so disregard those.

Now for the other stuff - let me first explain the motivation for the changes. The changes I made come to answer a need when working on multiple servers and with a CDN.

I still want press to work on optimum - that is store the compressed files on disk. But the issue was that it uses the hash key in the JS request, and if the server that got the request for the JS did not get the HTML request it wouldn't have the key in cache and the JS request will fail. So to handle that I've added the serverFarm configuration. If this is set to true, the script tag won't use the hash as the file name, but a combination of all the combined file names + timestamps. (Similar to how press build the combined name, just more parsable). Now when a server gets the JS request it doesn't need to have the key in the cache - it can rebuild the requested file list from the key itself and generate the compressed file if it doesn't exists.

The second issue is working with a CDN. With the combined script there's no problem since it contains the timestamp. That was not true for a single script. If the JS of a single script changes I want the CDN to fetch the new one. For this we need a cache buster in the file name. That's what the cacheBuster setting is doing - it adds the file time to the JS request, so when the file time changes the CDN will fetch it again. Now I also want to use press to generate those cache buster on library JSs that are already compressed - so I've added the option to not compress on a single-script. The last dir parameter on the single script allows specifying a directory that is different than the default JS directory (for instance, if your library JSs are on a different path than other JSs).

About the ETag - it looks like that's the way play handle this, I thought it was proper to stick to this convention.

Does it make more sense now?

On Mon, Sep 26, 2011 at 5:10 PM, dirkmc < [email protected]>wrote:

Hey, thanks for your interest in the plugin.

I've been trying to track down that rename bug for a while now, it seems so obvious now that you've pointed it out. Thanks!

I had a few questions about your pull request:

  • The way that press works at the moment it assumes that either you're using a single server with press files stored on disk, or you're using multiple servers with press files stored in the cache. Would it be possible for you guys to use the second storage strategy (ie files stored in the cache)?
  • I noticed that the eTag uses the hashcode of the file. Would it be possible instead to just use the cache key?
  • It looks like there's an extra directory parameter added in a few places. Could you explain what this parameter does?

These are just some little issues I noticed:

  • When generating the script/css tag, it looks like you're checking for whether contentHostingDomain is null, but as far as I can tell that should never happen
  • I think the libs you've included are aleady included by default in play
  • I believe a previous contributer made some changes that messed up the white space, if you could replace all tabs with four spaces I think that should clean those up.

Thanks again for you interest Dirk

Reply to this email directly or view it on GitHub: https://github.com/dirkmc/press/pull/17#issuecomment-2198560

rore avatar Sep 26 '11 15:09 rore

Ah ok I think I understand what you're saying.

The reason the request key is a hash, instead of the original list of filenames is because if you have a large number of files, then the request key might become longer than the maximum limit for a GET request. That's why I use the cache to store the list of files, then retrieve it using the hashed request key.

If the request key is not found in the cache, then I think there is a problem with the cache implementation. The default cache implementation only works on a single server, but if you use memcache then it should work across multiple servers. Note that you can still store the compressed file on disk - if the compressed file is not found on the current server, it will be generated on the fly.

Regarding the single-script issue, I think if we set the last-modified/eTag signatures correctly the browser should be able to figure out if it needs to request a new copy of the file.

dirkmc avatar Sep 26 '11 15:09 dirkmc

Using memcached for this is not really a good measure, you pay the performance hit of another network request. Not to mention that you don't always want to put up a memcached cluster only for this.

I know the length of the key is a potential problem with this kind of solution, so when using it you need to make sure it doesn't go over the GET limitation. It's still better than having to use memcached.

About the single-script, setting the last modified won't do. The CDN needs to know that the file has changed in order to get a fresh copy. The last modified header does not give this information, it just enables the CDN to return the copy it has in case it fits the last modified it gets from the client. The only way the CDN will bring the file again is if you explicitly tell it to do so - and the easiest way is to add the file time on the request.

On Mon, Sep 26, 2011 at 6:20 PM, dirkmc < [email protected]>wrote:

Ah ok I think I understand what you're saying.

The reason the request key is a hash, instead of the original list of filenames is because if you have a large number of files, then the request key might become longer than the maximum limit for a GET request. That's why I use the cache to store the list of files, then retrieve it using the hashed request key.

If the request key is not found in the cache, then I think there is a problem with the cache implementation. The default cache implementation only works on a single server, but if you use memcache then it should work across multiple servers. Note that you can still store the compressed file on disk

  • if the compressed file is not found on the current server, it will be generated on the fly.

Regarding the single-script issue, I think if we set the last-modified/eTag signatures correctly the browser should be able to figure out if it needs to request a new copy of the file.

Reply to this email directly or view it on GitHub: https://github.com/dirkmc/press/pull/17#issuecomment-2199391

rore avatar Sep 26 '11 15:09 rore

I can see what you mean about not wanting to pay the performance hit of an extra network request, and not wanting to set up a memcached cluster just for storing the set of file names. However I don't really like the idea of possibly breaking the GET limit either. I think for the majority of users the way press currently works will serve them better. I can see why you need to customize it in your case though.

dirkmc avatar Sep 26 '11 16:09 dirkmc

That's why it is enabled only by setting the press.serverFarm=true. The regular way press works is like now, which is fine for anyone having a single server. But if you want more than one server you can switch to the multi server mode (and you should be aware of the limitation).

On Mon, Sep 26, 2011 at 7:20 PM, dirkmc < [email protected]>wrote:

I can see what you mean about not wanting to pay the performance hit of an extra network request, and not wanting to set up a memcached cluster just for storing the set of file names. However I don't really like the idea of possibly breaking the GET limit either. I think for the majority of users the way press currently works will serve them better. I can see why you need to customize it in your case though.

Reply to this email directly or view it on GitHub: https://github.com/dirkmc/press/pull/17#issuecomment-2200075

rore avatar Sep 26 '11 16:09 rore

Right. I'll take a look at in including it when I get a chance

dirkmc avatar Sep 26 '11 16:09 dirkmc

Thanks again for taking the time to improve the plugin

dirkmc avatar Sep 26 '11 16:09 dirkmc

Cool. I think this can be useful to a lot of people.

Thanks for supporting it!

On Mon, Sep 26, 2011 at 7:32 PM, dirkmc < [email protected]>wrote:

Thanks again for taking the time to improve the plugin

Reply to this email directly or view it on GitHub: https://github.com/dirkmc/press/pull/17#issuecomment-2200211

rore avatar Sep 26 '11 16:09 rore

Hi,

I've added some additions to the changes I made. While testing on a multi-server setup I noticed there was a problem with the order the JS files are combined. This happened when a server that did not serve the HTML got the JS request and created it on the fly. Since originally press ordered the files after the script tag was written, the combined name was not in the right order.

I did not see anyway around it but to enforce the user to explicitly specify the order in the script tag. So now, if you activate the serverFarm mode, you have to specify the pos parameter in the script tag. Then the combined name will be generated according to the ordered files (sorted by the positions you give and not the order on the HTML).

Also, since in this mode we're keeping the order like that, there's no point anymore in searching the response for the extracting tags for getting the order - so in the serverFarm mode I'm skipping that. (And it actually cuts the performance overhead of doing that when generating the HTML).

Another improvement I added deals with your rightful concern about the length of the combined key. So now if the key is too long I break it up and create several script tags (as needed). So if you try to combine many files together and the name gets long it might create two combined files with less files in each. In most cases that won't happen, but it ensures JS requests won't break if you don't pay notice.

rore avatar Sep 27 '11 20:09 rore

Yes you're right that one of the reasons that the file list is stored in the cache is so that the order of the files can be detected after the output has been generated. I should have mentioned this in our conversation yesterday. This is something that I'd really rather not change, it's part of the reason that I originally developed the plugin. There is another plugin on which press is based called Greenscript that works more like what you're talking about.

dirkmc avatar Sep 27 '11 20:09 dirkmc

I've seen Greenscript but I already dived into press so I continued with it...

I understand the motivation for keeping it in cache but it's not valid if your application is on more than one server. So my latest addition handles that, it ensures the files will always be generated in the right order. You just need to specify the order as it can not be implicitly detected. (Again

  • only in the serverFarm mode. The default mode continues to work the same as now without the need to add the position).

On Tue, Sep 27, 2011 at 11:52 PM, dirkmc < [email protected]>wrote:

Yes you're right that one of the reasons that the file list is stored in the cache is so that the order of the files can be detected after the output has been generated. I should have mentioned this in our conversation yesterday. This is something that I'd really rather not change, it's part of the reason that I originally developed the plugin. There is another plugin on which press is based called Greenscript that works more like what you're talking about.

Reply to this email directly or view it on GitHub: https://github.com/dirkmc/press/pull/17#issuecomment-2216199

rore avatar Sep 27 '11 21:09 rore

Hi,

I tried to work with this branch since we need multi-server support, but it did not work - the compressed JS is malformed (this is the copy-paste from the chrome console, sorry for the poor formatting):

Being Called
public%2Fthemes%2Fcommon%2Fstylesheets%2Freset%7C1320252712000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fgrid%2Fgrid-zemmy%7C1320752521000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fui%2Fjquery-ui-1.8.16.custom%7C1321547370000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fmodule%2Fmod%7C1321046015000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fmedia%2Fmedia%7C1321547370000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fspacing%2Fspace%7C1320752521000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fforms%2Fform-elements%7C1321547370000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fbreadcrumb%2Fbreadcrumb%7C1320752521000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Ftiptip%7C1320252712000.css:2Uncaught SyntaxError: Unexpected token var
 public%2Fthemes%2Fcommon%2Fstylesheets%2Ftables%7C1320252712000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Flinks%7C1321046015000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fplugins%2Fjquery.fileupload-ui%7C1320752521000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fpagination%2Fpagination%7C1321535613000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fplugins%2Ftoken-input%7C1320752521000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fbuttons%2Fcss3-buttons%7C1321547370000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fhelpers%7C1321547370000.css:2Uncaught SyntaxError: Unexpected token {

Could it be that scripts that should not be compressed (marked as compress: false) get compressed anyway?

manuelbernhardt avatar Nov 18 '11 11:11 manuelbernhardt

Hi,

I haven't noticed a problem with the compress:false attribute. We're using it a lot and it works OK. Is your problem with JS or CSS? We mainly use the JS features currently, so maybe there's a bug in the CSS compression on this branch that I didn't notice.

On Fri, Nov 18, 2011 at 1:27 PM, Manuel Bernhardt < [email protected]

wrote:

Hi,

I tried to work with this branch since we need multi-server support, but it did not work - the compressed JS is malformed (this is the copy-paste from the chrome console, sorry for the poor formatting):

Being Called

public%2Fthemes%2Fcommon%2Fstylesheets%2Freset%7C1320252712000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fgrid%2Fgrid-zemmy%7C1320752521000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fui%2Fjquery-ui-1.8.16.custom%7C1321547370000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fmodule%2Fmod%7C1321046015000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fmedia%2Fmedia%7C1321547370000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fspacing%2Fspace%7C1320752521000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fforms%2Fform-elements%7C1321547370000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fbreadcrumb%2Fbreadcrumb%7C1320752521000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Ftiptip%7C1320252712000.css:2Uncaught SyntaxError: Unexpected token var

public%2Fthemes%2Fcommon%2Fstylesheets%2Ftables%7C1320252712000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Flinks%7C1321046015000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fplugins%2Fjquery.fileupload-ui%7C1320752521000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fpagination%2Fpagination%7C1321535613000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fplugins%2Ftoken-input%7C1320752521000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fbuttons%2Fcss3-buttons%7C1321547370000%2Cpublic%2Fthemes%2Fcommon%2Fstylesheets%2Fhelpers%7C1321547370000.css:2Uncaught SyntaxError: Unexpected token {

Could it be that scripts that should not be compressed (marked as compress: false) get compressed anyway?


Reply to this email directly or view it on GitHub: https://github.com/dirkmc/press/pull/17#issuecomment-2788111

rore avatar Nov 18 '11 12:11 rore

Indeed it was the CSS. See the pull request on your fork.

@dirkmc getting this into a release would really be useful.

manuelbernhardt avatar Nov 18 '11 22:11 manuelbernhardt

Does this path allow being able to automatically/ transparently upload compressed assets to a CDN (say S3 / Cloudfront) as part of the Press-build process? If so, great! If not, then what IS it for?

0xgeert avatar Dec 08 '11 14:12 0xgeert

It's not about uploading content to S3 but rather about routing your static assets through a CDN provider. So, lets say you have a js file, myfile.js. You want to compress it so you use press, it will create myfile.min.js. You want this file to be served through a CDN, so you set the parameters and the CDN host and it will be rendered in your HTML as http://cdn.something.com/myfile.min.js You want to use a cache buster on the file so if it changes in your app the CDN will refresh it so you set the press parameters for it and it will be rendered as http://cdn.something.com/myfile.min.js?12345 with query string or http://cdn.something.com/myfile.min.js.12345.press with non-query string.

On Thursday, December 08, 2011 4:23:22 PM, gbrits wrote:

Does this path allow being able to automatically/ transparently upload compressed assets to a CDN (say S3 / Cloudfront) as part of the Press-build process? If so, great! If not, then what IS it for?


Reply to this email directly or view it on GitHub: https://github.com/dirkmc/press/pull/17#issuecomment-3062851

rore avatar Dec 08 '11 14:12 rore

Ok but let's take the example of Cloudfront. I assume in this situation my domain is http://something.comhttp://cdn.something.com/myfile.min.js.12345.press and I configured my subdomain http://cdn.something.comhttp://cdn.something.com/myfile.min.js.12345.press to point to a cloudfront distribution.

It doesn't support query-strings which you nicely support, so my HTML would render the URL as: http://cdn.something.com/myfile.min.js.12345.press But the file still has to be uploaded to the cloudfront-distribution to be available under http://cdn.something.comhttp://cdn.something.com/myfile.min.js.12345.press right? How do you get it there?

Thanks for painting a clear picture. Geert-Jan

Op 8 december 2011 15:42 schreef Rotem Hermon < [email protected]

het volgende:

It's not about uploading content to S3 but rather about routing your static assets through a CDN provider. So, lets say you have a js file, myfile.js. You want to compress it so you use press, it will create myfile.min.js. You want this file to be served through a CDN, so you set the parameters and the CDN host and it will be rendered in your HTML as http://cdn.something.com/myfile.min.js You want to use a cache buster on the file so if it changes in your app the CDN will refresh it so you set the press parameters for it and it will be rendered as http://cdn.something.com/myfile.min.js?12345 with query string or http://cdn.something.com/myfile.min.js.12345.press with non-query string.

On Thursday, December 08, 2011 4:23:22 PM, gbrits wrote:

Does this path allow being able to automatically/ transparently upload compressed assets to a CDN (say S3 / Cloudfront) as part of the Press-build process? If so, great! If not, then what IS it for?


Reply to this email directly or view it on GitHub: https://github.com/dirkmc/press/pull/17#issuecomment-3062851


Reply to this email directly or view it on GitHub: https://github.com/dirkmc/press/pull/17#issuecomment-3063055

0xgeert avatar Dec 08 '11 14:12 0xgeert

body p { margin-bottom: 0cm; margin-top: 0pt; } This is no longer a press question but a CloudFront question. You need to create a custom origin distribution and not an S3 distribution, and point it to your original server (or load balancer).

On Thursday, December 08, 2011 4:57:56 PM, gbrits wrote:
  Ok but let's take the example of Cloudfront.
  I assume in this situation my domain ishttp://something.com<http://cdn.something.com/myfile.min.js.12345.press>
  and
  I configured my subdomainhttp://cdn.something.com<http://cdn.something.com/myfile.min.js.12345.press>
  to
  point to a cloudfront distribution.
  It doesn't support query-strings which you nicely support, so my
  HTML would
  render the URL as:
  http://cdn.something.com/myfile.min.js.12345.press
  But the file still has to be uploaded to the
  cloudfront-distribution to be
  available underhttp://cdn.something.com<http://cdn.something.com/myfile.min.js.12345.press>
  right?
  How do you get it there?
  Thanks for painting a clear picture.
  Geert-Jan
  Op 8 december 2011 15:42 schreef Rotem Hermon<[email protected]
    het volgende:

    It's not about uploading content to S3 but rather about routing
    your
    static assets through a CDN provider.
    So, lets say you have a js file, myfile.js.
    You want to compress it so you use press, it will create
    myfile.min.js.
    You want this file to be served through a CDN, so you set the
    parameters and the CDN host and it will be rendered in your HTML
    ashttp://cdn.something.com/myfile.min.js
    You want to use a cache buster on the file so if it changes in
    your app
    the CDN will refresh it so you set the press parameters for it
    and it
    will be rendered ashttp://cdn.something.com/myfile.min.js?12345 with query string
    orhttp://cdn.something.com/myfile.min.js.12345.press with
    non-query
    string.
    On Thursday, December 08, 2011 4:23:22 PM, gbrits wrote:
      Does this path allow being able to automatically/
      transparently upload

    compressed assets to a CDN (say S3 / Cloudfront) as part of the
    Press-build
    process? If so, great! If not, then what IS it for?
      ---
      Reply to this email directly or view it on GitHub:https://github.com/dirkmc/press/pull/17#issuecomment-3062851

    ---
    Reply to this email directly or view it on GitHub:https://github.com/dirkmc/press/pull/17#issuecomment-3063055

  ---
  Reply to this email directly or view it on GitHub:https://github.com/dirkmc/press/pull/17#issuecomment-3063336

rore avatar Dec 08 '11 15:12 rore

Ahh yes that explains it. I didn't know of the custom origin feature. Many thanks!

Op 8 december 2011 16:00 schreef Rotem Hermon < [email protected]

het volgende:

body p { margin-bottom: 0cm; margin-top: 0pt; } This is no longer a press question but a CloudFront question. You need to create a custom origin distribution and not an S3 distribution, and point it to your original server (or load balancer).

On Thursday, December 08, 2011 4:57:56 PM, gbrits wrote: Ok but let's take the example of Cloudfront. I assume in this situation my domain ishttp://something.com< http://cdn.something.com/myfile.min.js.12345.press> and I configured my subdomainhttp://cdn.something.com< http://cdn.something.com/myfile.min.js.12345.press> to point to a cloudfront distribution. It doesn't support query-strings which you nicely support, so my HTML would render the URL as: http://cdn.something.com/myfile.min.js.12345.press But the file still has to be uploaded to the cloudfront-distribution to be available underhttp://cdn.something.com< http://cdn.something.com/myfile.min.js.12345.press> right? How do you get it there? Thanks for painting a clear picture. Geert-Jan Op 8 december 2011 15:42 schreef Rotem Hermon<[email protected] het volgende:

   It's not about uploading content to S3 but rather about routing
   your
   static assets through a CDN provider.
   So, lets say you have a js file, myfile.js.
   You want to compress it so you use press, it will create
   myfile.min.js.
   You want this file to be served through a CDN, so you set the
   parameters and the CDN host and it will be rendered in your HTML
    ashttp://cdn.something.com/myfile.min.js
    You want to use a cache buster on the file so if it changes in
   your app
   the CDN will refresh it so you set the press parameters for it
   and it
   will be rendered ashttp://cdn.something.com/myfile.min.js?12345with query string
    orhttp://cdn.something.com/myfile.min.js.12345.press with
    non-query
   string.
   On Thursday, December 08, 2011 4:23:22 PM, gbrits wrote:
     Does this path allow being able to automatically/
     transparently upload

   compressed assets to a CDN (say S3 / Cloudfront) as part of the
   Press-build
   process? If so, great! If not, then what IS it for?
     ---
     Reply to this email directly or view it on GitHub:

https://github.com/dirkmc/press/pull/17#issuecomment-3062851

   ---
   Reply to this email directly or view it on GitHub:

https://github.com/dirkmc/press/pull/17#issuecomment-3063055

 ---
  Reply to this email directly or view it on GitHub:

https://github.com/dirkmc/press/pull/17#issuecomment-3063336


Reply to this email directly or view it on GitHub: https://github.com/dirkmc/press/pull/17#issuecomment-3063376

0xgeert avatar Dec 08 '11 15:12 0xgeert

Hi gbrits,

Could you tell us what the status is of this pull-request? Are you going to merge it, ifso, when? Or why not?

Thanks in advance, Kamil

kamilafsar avatar Apr 10 '12 10:04 kamilafsar

Oh.. I guess I should have asked that to dirkmc! :-)

kamilafsar avatar Apr 10 '12 10:04 kamilafsar

Hello, to see why I'm not going to merge it you can see the conversation above.

dirkmc avatar Apr 10 '12 12:04 dirkmc