amber icon indicating copy to clipboard operation
amber copied to clipboard

Asset Helpers

Open jackturnbull opened this issue 6 years ago • 20 comments

Description of the Change

Fixes: https://github.com/amberframework/amber/issues/555 See previous change: https://github.com/amberframework/amber/pull/574 Builds upon @JoshJuncker's port of @paulcsmith's work of the Lucky::AssetHelpers. All credit to him for the original concept (I've kept the attribution the same as the previous PR, please let me know if you'd like this tweaking).

This adds the following changes:

  • Amber::Controller::Helpers class available in all controllers providing:
    • macro asset_path(path) to reference an asset against it's path
    • def dynamic_asset_path(path) when referencing assets by string interpolation.
  • asset_manifest.json within the config/ directory containing a key/value reference to all assets for use within Amber
  • Split webpack production and development configs to allow for hashing only on the production environment. This allows you to reference asset paths in both development and production without being concerned about their filename on disk:
    • asset_path("main.js") in development => "/dist/main.bundle.js"
    • asset_path("main.js") in production => "/dist/main-a1b2c3d4e5f67890.bundle.js"

I also like the last point as it acts as an automatic cache-buster without worrying about etag implementations (it'll reflect a new filename in the DOM).

Alternate Designs

I'd mostly tweaked Josh's original PR, which in turn is heavily inspired from Lucky's implementation so alternate designs mostly haven't been considered - much less a clean-room implementation of Lucky's work (somewhat unnecessary, I'll probably end up with something similar and less elegant). I'll mention some of the design decisions below:

config/asset_manifest.json; the placement of this file has been considered somewhat carefully. I've opted to not place this in public as I believe having it exposed to the internet could be a security risk to some people in some situations (although not myself). Keeping it within config/webpack wouldn't work for me personally as I use rollup for my asset bundling and I'd rather not be tied in to a directory for webpack as it's a tool I don't use. I'm using myself as the use-case here but I believe it to be valid.

macro load_manifest; I'd made the desicion to not port this from Lucky for a number of reasons, some more valid than others. One being that there doesn't appear to be a good place for this to live. Instructing the server to load an asset manifest is decidedly not a controller helper and placing it into the server class felt out of place, but I believe I've managed to abstract it away via the webpack usage. I believe it's purpose is to allow API application to skip the use of the manifest helper, which is perfectly valid, but if you're using an API only application and don't reference the asset helpers it won't make a difference. edit: correction: this hasn't been abstracted away. As it stands all Amber users require an asset_manifest file to be JSON parse-able. If we forsee this as an issue, my suggestion is to add a new environment config option of use_asset_manifest: true which then allows us to either require the manifest and generate the helpers, or skip it entirely.

default contents of asset_manifest; because we know the assets in advance and the hashes aren't required for a successful amber watch we can simply insert a manifest of the current built assets and allow it to run first-time.

Benefits

See above and description.

Possible Drawbacks

There are a couple I can see, as it stands that warrant some discussion.

  • A bug in the watch workflow means that:
    • The amber crystal project will build
    • Then the npm process will kick off a watch job, replacing asset_manifest.json with the same contents.
    • This in turn will cause Amber to re-build as sentry has detected a file update.

Requesting some assistance on the above, or if anyone can see a better way of handling this.

Another, albeit somewhat minor issue is communicating the change to the release process. The asset_manifest.json is compiled in to the binary so the manifest will need to be updated with the production references prior to compilation.

jackturnbull avatar Feb 10 '19 12:02 jackturnbull

This is causing some spec failures:

 crystal spec $TEST_SUITE
Error in line 1: while requiring "././spec/amber/cli/templates/migration_spec.cr"
in spec/amber/cli/templates/migration_spec.cr:1: while requiring "../../../spec_helper"
require "../../../spec_helper"
^
in spec/spec_helper.cr:21: while requiring "../src/amber"
require "../src/amber"
^
in src/amber.cr:12: while requiring "./amber/controller/**"
require "./amber/controller/**"
^
in src/amber/controller/base.cr:4: while requiring "./helpers/*"
require "./helpers/*"
^
in src/amber/controller/helpers/asset.cr:6: macro didn't expand to a valid program, it expanded to:
================================================================================
--------------------------------------------------------------------------------
   1. Manifest at /home/travis/build/amberframework/amber/config/asset_manifest.json does not exist
   2. Make sure you have compiled your assets
--------------------------------------------------------------------------------
Syntax error in expanded macro: macro_139788914770368:1: unknown token: '\e'
Manifest at /home/travis/build/amberframework/amber/config/asset_manifest.json does not exist
^
================================================================================
    {{ run "../../run_macros/generate_asset_helpers" }}
    ^
The command "crystal spec $TEST_SUITE" exited with 1.

robacarp avatar Feb 25 '19 18:02 robacarp

Sorry about that! I've just got back from a bit of a long holiday hence the slow response! I'll try and get it patched up over the next couple of days

jackturnbull avatar Mar 04 '19 07:03 jackturnbull

@jackturnbull will you be able to work on this? If not, does someone else want to take it over?

drujensen avatar May 19 '19 02:05 drujensen

@drujensen sorry about this one; don't have a much better reason other than I totally forgot about this PR. I've taken a look at it now and the fix isn't obvious, in the sense I can't see how to fix this without compromise.

What's happening? So right now we have a macro that looks for an asset manifest JSON file and then parses it so that the asset locations can be stored in a constant in memory. This works fine in a generated Amber appliaction as we should have an asset_manifest.json, and if we don't, the warning and throw is desired. This is complaining within the test suite because we're requiring in this helper (which in turn runs the macro that looks for the asset_manifest.json file) through a glob in src/amber.cr and it cannot find the asset_manifest.json.

How can we fix it? I could create a separate require list within spec_helper so this macro isn't generared within a test suite, although I suspect the main app require list and the spec require list will diverge at some point and maintaining both feels wrong. Alternatively I could also have the macro only run when not in the test suite, although I'm not massively keen on restricting code paths per-env within the code source itself. If anyone can tell me how to do this at the macro level I'm happy to implement; it appears that AMBER_ENV and ENV is not set at compilation time under crystal spec.

How have I fixed it? I've taken the Lucky approach and modified it slightly. I'm using Support as the entrypoint for asset helpers and the pattern matches the current ClientReload.new call in the main {{name}}.cr. My battery is about to die so I need to test it more thoroughly, and I'm seeing some SQLite DB errors in the test suite, which I suspect is likely due to misconfiguration my side of things.

As long as I haven't made any major mistakes in the generated output this should be a lot closer.

jackturnbull avatar May 20 '19 04:05 jackturnbull

I just want to toss some ideas around on this subject so please bear with me.

  • we only need a version number in filenames in production environment
  • when we deploy to production we have a number of commits which we can get with git rev-list master --count
  • a count of git commits (were it to appear in an asset filename) is more or less the same as a build time webpack hash in that it uniquely identifies a point in the git history but it only changes on a new commit
  • since we should commit assets to git, updated assets will update the commit count

So....

  • can we extract the git commit count at build time and save it somewhere e.g. as a value in .amber.yml ?
  • could we have a macro that uses the git commit count extracted from .amber.yml as an asset filename prefix/suffix/infix ?
  • could we add a pipe at the front of the web pipeline that strips the commit count prefix/suffix/infix from the URL path e.g. /v_198_/dist/logo.png becomes /dist/logo.png ?

I think we could do this with an Amber CLI program to build a production release. This is a simple solution - but sometimes simple solutions work.

damianham avatar May 23 '19 17:05 damianham

Hi @damianham thanks for thowing some suggestions in, and a quick thank you for your module recipe - I didn't end up using it directly but it was helpful to see it applied to amber to guide my own project layout!

I think realistically Dru or another contributor more active than myself would be able to weigh in with more precise thoughts than myself, but here are some of the question/concerns I'd have with this approach. Any that don't apply because of a misunderstanding on my part feel free to point it out;

  • What if someone isn't using git? (I know it's becoming a little unlikely in this day and age, but hear me out)
    • People might be running their amber CI pipeline from their own machine without realising it. I'm currently working on a solo project in a pre-live state (think startup before go-live) and I'm regularly pushing things up into the prod cluster without creating a commit. In this sense I'm not using git at all.
    • We'd be tying all amber users' to git. It would go from being Crystal + Amber + Webpack to Crystal + Amber + Webpack + Git, which I'd consider to be a large surface area increase.
  • Someone might not be committing their assets to the git repo. I'm not the use case here, but it wouldn't surprise me if someone wanted to reverse proxy their /dist path onto CDN, uploading the assets as build time.
  • I've not fully thought out the implications but it wouldn't surpise me if git-rebase/squash had some edge cases in this implementation (or at least we'd have to start accounting for them one by one as we spotted them).

It reads to me like it could be one of those cases where it'll be an ideal solution for a lot of people, but a blocker for everyone not in that group.

jackturnbull avatar May 24 '19 08:05 jackturnbull

Hi @jackturnbull - it's good to hear someone is using one of my recipes other than me :) glad you found it useful.

I think if we had an Amber CLI program to build and deploy to production we could generate an infix identifier at the time of build either from git or using the unix timestamp at build time. Either way if we use an infix identifier that is saved in the config then we could strip out the infix ID in a pipe in the static pipeline to get to the real asset path.

If someone wanted to reverse proxy their /dist path onto CDN then I think the infix ID would be better after the /dist/ component, e.g. /dist/v_156674896_/logo.png becomes /dist/logo.png. Rather than using a pipe in the static pipeline we could create the /dist/v_156674896 folder as a symbolic link to the real asset folder path e.g. /dist/v_156674896 -> /dist/assets. By creating a new symbolic link that is inserted into the asset path by the asset_path() macro at build time then we outdate all assets.

If new versions of assets are being uploaded to the assets folder without a new build then Etag could be used to outdate cached assets that have been modified so we probably want something in the static pipeline to check the modification time of an asset against the client supplied Etag.

damianham avatar May 24 '19 09:05 damianham

@damianham @jackturnbull My biggest concern is building a dependency between Amber and JS. We do not want to be tied to any JS framework or technology stack. We want to be agnostic to the front end to allow developers to choose their own. Any solution we provide here should be replaceable with any other solution.

@jackturnbull has a good point about adding a git dependency. We should be looking for a solution that falls in the JS camp and avoid adding any dependencies in Amber if at all possible.

drujensen avatar May 24 '19 14:05 drujensen

@drujensen I agree it is good point about refraining from adding dependencies on git or JS. I think my suggestion for an Amber CLI program to build (and possibly also deploy) an Amber project for production and at the same time generate an infix ID (using Time.to_unix) to be used in the asset_path macro keeps the solution all within Amber with no dependency on git or JS. The question is whether to us an infix ID which is then stripped from the URL path by a pipe in the static pipeline or whether to create a symbolic link to the assets folder so no pipe is required to strip the infix ID.

For deployment I envisage the CLI program would use an adapter class to deploy to various hosting scenarios or create their own adapter for bespoke deployment.

damianham avatar May 24 '19 15:05 damianham

@damianham At some point we had this. It was removed here: https://github.com/amberframework/amber/commit/4504b6aab90bef014dbe397edf7aa39865605ad7#diff-394c0b83912ae9cd9ac195beff4bbf8d

drujensen avatar May 24 '19 15:05 drujensen

The reason that amber deploy was abandoned is that it's way outside the scope of responsibility of a web framework to try and solve all the various deployment methodologies. Similar to how it's best not to couple a server side framework with a javascript library, coupling the framework with a deployment methodology (or system of methodologies) is going to weigh down the framework.

robacarp avatar May 24 '19 16:05 robacarp

I do agree that deployment is outside the scope of responsibility of a web framework however I suggest this as a simple solution to the updating cached assets problem. It doesn't necessarily need to weigh down the framework because the deployment adapters could be provided by shards just as we choose a shard for our ORM. Adapters would allow for any deployment paradigm and provide a simple (for the user) solution to deployment. In other projects I use Mina.js for deployment to a VPS and maintain a deploy.json file which contains all the deployment information so deployment is simply "mina deploy".

I suggest a deployment CLI program because by creating a symlink to the assets folder that would obviate the need for a path rewrite pipe in the static pipeline.

Maybe we could start with an infix ID for the asset_path() macro stored in .amber.yml and a corresponding symbolic link in public/dist to an assets folder. A developer could then purge cached assets by updating the infix ID in .amber.yml, creating a new sym link and rebuilding the application.

damianham avatar May 24 '19 16:05 damianham

@damianham I have seen versioning done this way in distribution tools. For example, snapcraft from Ubuntu provides a CLI to run snap refresh {app_name} which will download the latest version and update a symlink to the root product. This allows you too easily rollback to a previous version by updating the symlink.

I agree with the approach but i'm not sure this should be in the Amber CLI instead of a JS solution. Do we really want this custom solution in Amber CLI when there are npm packages that deal with this already? Will this be compatible with other JS tooling solutions?

drujensen avatar May 24 '19 21:05 drujensen

@drujensen yes using a sym link makes rollback very simple and Mina works that way. My suggestions are based on a desire to make the Amber experience as easy and productive as possible for developers. I don't really have a dogmatic viewpoint on whether it should be an Amber CLI solution or a JS solution but given that I think we would use an asset_path macro at build time to generate paths to at least the main.bundle.js and main.bundle.css assets I don't see how it can be a JS solution and the Amber solution is a single solution that would work for all asset packagers. Even if we can use webpack or another asset packager to replace the hash in the filename of the assets in application layout prior to building the application would it also work for assets named in other templates ?

damianham avatar May 25 '19 15:05 damianham

@damianham @jackturnbull

Couple questions:

  • Isn't it better to change the name of the file using a hash to make sure the cache is refreshed when assets change? I don't think a symlink will do this.

  • Should we use a caching manifest to allow for assets to be cached using service workers? https://en.wikipedia.org/wiki/Cache_manifest_in_HTML5

drujensen avatar May 26 '19 03:05 drujensen

@drujensen I think a sym link to a folder will do exactly that. The assets used by the application become new assets referenced by the application when the asset path changes. I don't think anything else is needed to effectively purge assets from a web cache. Service workers run their own cache and will request new assets from the application when an asset path changes.

damianham avatar May 28 '19 09:05 damianham

@damianham nm, brain fart. The symlink is only in development, right? The Id or Hash will be referenced in production to break the cache.

drujensen avatar May 28 '19 16:05 drujensen

@drujensen no worries about brain farts - I have 'em all the time :)

Actually I think the sym link is only in production, the idea being to change the path to the asset e.g. /dist/156789/main.bundle.js -> /dist/24567/main.bundle.js where /dist/156789 and /dist/24567 are both a symlink to /dist/assets. When the asset_path macro is used it would give /dist/156789/main.bundle.js for asset_path("main.bundle.js") given that .amber.yml has the value asset_path: /dist/156789. Subsequently in order to purge these assets from global caches one would change the asset_path value in .amber.yml to /dist/24567 and make a new symlink /dist/24567 -> /dist/assets. I don't see any difference in this scheme to /dist/main.bundle.1234567.js. As I see it both will purge caches equally.

If you had assets that are unlikely to change over time you could simply link to them directly rather than use the asset_path macro.

damianham avatar May 29 '19 09:05 damianham

Sorry everyone for the pace here, I'm no longer using Amber in my current project so finding time to get this finished has been a little more difficult.

It's a bit of a cop-out answer but I personally see the symlink solution as more confusing than it's worth, and my personal opinion is that ETag headers are also a bit of a headache too - the concept is easy enough but I've seen configurations where caches ignore them and have spotted that they can be difficult to parse for junior developers - even if they're the correct solution most of the time.

Adding a hash/fingerprint seems to be the most common/recognisable way of handling this, the original PR that predates mine was driven by a Lucky feature but I believe all of them in some way borrow from the Rails asset pipeline which still uses fingerprinted assets.

I will say though I do like the thought of having a directory for all assets, it makes managing things on the filesystem a little easier, I'd also also concerned about future Windows compatibility and possible quirks with updating them. As it stands, webpack does all the lifting for us in this PR - it assigns hashes and can spit out a manifest for us as well, all we've got to do in Crystal is read the manifest and lookup the asset key.

If I was to cast my vote it would be for what we currently have, as it feels more standard and doesn't requrie us to write our own symlink handler.

Should we use a caching manifest to allow for assets to be cached using service workers? https://en.wikipedia.org/wiki/Cache_manifest_in_HTML5

This is a good idea and I think this adds value - if anyone wants to take over this and add it I'd be happy to lend an ear, but I also think the PR could be merged without it if we get agreement on the approach.

jackturnbull avatar Sep 13 '19 03:09 jackturnbull

@jackturnbull hey was wondering if you can give me a hand to bring this to life, I would like to re-eval your work around assets. Currently there seems to be two simple tests failing and might not required a big lift on your end.

See failing tests here https://travis-ci.org/github/amberframework/amber/jobs/715929796#L1315

eliasjpr avatar Aug 07 '20 18:08 eliasjpr