browserify-persist-fs icon indicating copy to clipboard operation
browserify-persist-fs copied to clipboard

Different input files with the same content get cached to the same file

Open josephwarrick opened this issue 8 years ago • 5 comments

If two different files have exactly the same content, they will both get cached to the same cache file.

This can happen most frequently when multiple packages require different versions of a given node_module. That is, foo requires [email protected], and baz requires [email protected], so the content of the file at "node_modules/foo/node_modules/bar/lib/thing.js" is the exact same as the content of the file at "node_modules/baz/node_modules/bar/lib/thing.js" even though they are different files.

josephwarrick avatar Feb 21 '17 18:02 josephwarrick

( 😨 Oh my goodness, I completely overlooked this issue, sorry for the late reply!)

To your issue, i was conscious of the fact that two files would result in the same cache because for various transforms I used, it is actually okay to not-reprocess the same file even if it changed location. For example:

module.exports = class MyClass {}

in two different places could be transformed by plugins to a backwards-compatible:

function MyClass() {}
module.exports = MyClass

is okay to used the cached version as replacement for both.

But you have an important point, there are transforms that very well could act differently depending on the location, as such it seems to me like it would be a good idea to have a flag that en/disables file-caching based based on location. (I tend to think that the default should be to use the file-path, so it would become an optimization option).

martinheidegger avatar Feb 15 '18 11:02 martinheidegger

Hey, rando popping in on the convo! Hope you don't mind.

Ran into this situation today, but the cause for concern was that the resulting cache file was sometimes getting garbled. Not sure how file locks work with fs.writeFile, but to me it seemed like two or more of my running bundlers were walking on the same/similar-content file at the same time...

Does that sound like something you could see unintentionally happening in this package?

loklaan avatar Feb 16 '18 02:02 loklaan

@loklaan Probably should be a separate issue. Never tried to have parallel bundlers running 😅

martinheidegger avatar Feb 16 '18 02:02 martinheidegger

My apologies, but my original issue wasn't very well written.

The issue @loklaan describes is the issue I was trying to fix. The reason

both [files] get cached to the same cache file

is a problem is because simultaneous writes to the same cache file results in a garbled file.

josephwarrick avatar Feb 16 '18 03:02 josephwarrick

@josephwarrick Thank you for clarifying this! The broken indexes are indeed an issue. Took the time to look into it and thought #6 might be a good idea.

martinheidegger avatar Feb 16 '18 07:02 martinheidegger