module-deps

post transform file cache

Open DonutEspresso opened this issue 8 years ago • 3 comments

Hi folks, I have some tooling built on top of module-deps. Often I already have post-transform source and dependency tree information ready to go. It looks like providing opts.cache skips reading/parsing the source and applying transforms. However, it's unclear to me what a "file object" looks like. Poking through the walk() function, it looks like I need at least source, package, and deps, so would the format be something like this?

opts.cache = {
  '/User/me/foo.js': {
    source: '...', // string: post-transform source code
    package: undefined, // ? (unclear what belongs here)
    deps: {
      './fooDep1': '/User/me/fooDep1.js',
      '../fooDep2': '/User/fooDep2.js'
    }
  }
};

My questions are:

  • I see the transform stream emits other fields too (file, id, entry, etc.) but are those needed in the cache?
  • What should the value of package field be?
  • Any gotchas when it comes to providing cache for files in node_modules?

I'll start running my own tests and exploring, but any guidance would be appreciated. Thanks!

DonutEspresso, May 02 '16 22:05

Update: I captured the raw records being emitted by the transform stream on the data event, and saved those record objects. You can then feed those records into a new module-deps instance as the cache object and that seems to do the trick.

Is there any concern around this approach? AFAICT, this seems to work correctly and significantly speeds up subsequent runs of module-deps.
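For illustration, the approach described above might look roughly like the following sketch. It assumes the cache is keyed by each record's file path, as in the format guessed at earlier in this thread; `recordsToCache` and the sample records are hypothetical and not part of module-deps.

```javascript
// Sketch: turn records captured from a module-deps 'data' event into an
// object shaped like opts.cache (file path -> record).
// `recordsToCache` and the sample records are hypothetical, for illustration.
function recordsToCache(records) {
  const cache = {};
  for (const record of records) {
    // Keep every field as emitted; only the lookup key is derived.
    cache[record.file] = record;
  }
  return cache;
}

// Hand-written stand-ins for records saved from a previous run.
const savedRecords = [
  {
    id: '/User/me/foo.js',
    file: '/User/me/foo.js',
    source: "require('./fooDep1');",
    deps: { './fooDep1': '/User/me/fooDep1.js' },
    entry: true
  },
  {
    id: '/User/me/fooDep1.js',
    file: '/User/me/fooDep1.js',
    source: 'module.exports = 1;',
    deps: {}
  }
];

// Round-trip through JSON to mimic saving the records to disk between runs.
const cache = recordsToCache(JSON.parse(JSON.stringify(savedRecords)));
console.log(Object.keys(cache));

// With module-deps installed, the cache would be passed in as an option:
//   const md = require('module-deps')({ cache: cache });
```

The records are stored untouched, so whatever fields the stream emitted are what the next run sees.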

DonutEspresso, May 04 '16 01:05

@DonutEspresso I can't answer all of your questions, but I've been wanting that stuff to be better documented and more consistent for a long time. file|id are some of the worst offenders. See for example:

  • http://jmm.github.io/browserify-pipeline-docs/
  • https://github.com/substack/node-browserify/issues/1162
  • https://github.com/substack/node-browserify/issues/1203

Certain properties of those record objects are meaningful at certain phases of the pipeline (module-deps being one of the phases). I think package is the object represented by package.json. entry indicates whether the file is an "entry" file, i.e. whether it will be executed when the bundle executes. For example, a file added with b.add() is normally an entry, whereas one added with b.require() isn't.

You could also take a look at what watchify does, as I think it monkeys with that cache data, and its purpose is to speed up subsequent browserify bundling operations on mostly the same set of files.

Related: https://github.com/substack/module-deps/issues/72

jmm, May 05 '16 22:05

Thanks @jmm, appreciate the info and the links. I had a lot of trouble trying to "recreate" the records. I'd get inconsistent output from the stream when doing so, probably because I was feeding it bad data. Good to know I'm not the only one confused. :) In the end, I simply captured the emitted records "as-is" without changing them (all fields intact), then fed them back in on the next run. That gave me consistent output from run to run, so it appears to be working so far.

AFAICT, all the emitted records have id, source, file, and deps. I suspect entry: true is present for all files that are added directly to module-deps via write() or end(). entry seems to be missing only when the file is located in node_modules or is an unparseable file. For example, with require('./random.ext'), random.ext would get emitted, but without the entry value.
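Since hand-built records caused trouble, a small sanity check before reusing records as a cache could be sketched like this. The field list reflects only what was observed in this thread, not a documented module-deps contract, and `missingFields` is a hypothetical helper.

```javascript
// Sketch: report which of the fields observed above are missing from a record.
// `missingFields` is a hypothetical helper; the list below reflects only the
// fields seen on emitted records in this thread, not a documented contract.
const OBSERVED_FIELDS = ['id', 'source', 'file', 'deps'];

function missingFields(record) {
  return OBSERVED_FIELDS.filter(
    (field) => !Object.prototype.hasOwnProperty.call(record, field)
  );
}

// A record captured as-is versus one assembled by hand (both made up here).
const complete = {
  id: '/User/me/foo.js',
  file: '/User/me/foo.js',
  source: '',
  deps: {},
  entry: true
};
const handBuilt = { source: 'module.exports = 1;', deps: {} };

console.log(missingFields(complete));
console.log(missingFields(handBuilt));
```

entry is left out of the check on purpose, because it is not present on every record.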

I think it might be worthwhile to delineate which things are specific to module-deps and which only apply in the context of browserify.

DonutEspresso, May 06 '16 01:05