nft icon indicating copy to clipboard operation
nft copied to clipboard

Add option to allow app directory globs from packages

Open hybrist opened this issue 6 years ago • 13 comments

For things like automatic loading of translation files, it may be nice to allow globs of files even when they are not within the package boundary. Right now the following fails if it's part of a module:

const en = fs.readFileSync(path.resolve('locales/en.json');

In my specific case, I was also interested in having an installed module be able to preserve a directory by doing:

fs.statSync(path.resolve('app-data-dir'));

But there may be more examples, like automatically loaded config files from the app/working directory.

hybrist avatar Nov 20 '19 16:11 hybrist

even when they are not within the package boundary

I never really understood why the package boundary was necessary. This was something @guybedford was very keen on adding. It also matches the behavior of ncc.

However, I don't understand how your two examples are outside the package boundary? Perhaps you could add a unit test to show it fail for your use case?

styfle avatar Nov 20 '19 17:11 styfle

The actual case where I ran into it was:

// HTTP handler for dynamic asset assembly.
// It reads `.next/chunks`.
node_modules/next-dynamic-bundle/dist/handler.js

// The actual chunk data
.next/chunks/*.js

NFT recognizes the reads in handler.js but discards them because they are for a directory outside of the package.

Code for the handler where it stats the chunk dir (which should cause a glob include of all files since it's a directory): https://github.com/azukaru/progressive-fetching/blob/bcc665cbf30934d3570c20d18c7860c658fe40b2/packages/next-dynamic-bundle/src/handler.ts#L20

hybrist avatar Nov 20 '19 17:11 hybrist

The line you linked to is reading a dynamic path fs.statSync(chunkDir). Can you try changing it to fs.statSync(path.resolve('.next', 'static', 'chunks')) and see if that works?

styfle avatar Nov 20 '19 19:11 styfle

I tried a couple of variants. Debugging with the CLI directly showed that the glob is being done but then it is followed by "Ignoring X because it is outside of the package". So I'm pretty sure it's finding the assets successfully.

hybrist avatar Nov 20 '19 20:11 hybrist

Have you tried setting the base option?

You could potentially allow everything with

const { fileList } = await nodeFileTrace(files, {
  base: '/'
}

https://github.com/zeit/node-file-trace/blob/master/readme.md#base

styfle avatar Nov 20 '19 20:11 styfle

The issue is that pkgBase is dynamic and afaict isn't the same as base. The condition I'm running into is this one: https://github.com/zeit/node-file-trace/blob/94a64aae0a4be8368717852de805f3627381a8e0/src/analyze.js#L735-L741

hybrist avatar Nov 20 '19 20:11 hybrist

That condition can be changed to something more appropriate. The main thing it's trying to protect against is an invalid analysis inadvertantly globbing all of node_modules and defeating the point of tracing. So it's very much about balancing those things.

I'm not sure the best way to extend / adjust the logic here, it may come down to some trial and error on the analysis paths as well.

guybedford avatar Nov 21 '19 21:11 guybedford

Suggestion for a revised logic: If the path starts with the overall base and the remainder doesn't include /node_modules/ at all, allow it? In addition maybe: The remainder has to start with a path segment that isn't node_modules. That should prevent accidentally including huge directory trees.

hybrist avatar Nov 21 '19 22:11 hybrist

The specific case to be weary of is base/node_modules/pkg/node_modules/dep having a statement like fs.readFile(__dirname + '../../' + unknown) causing a glob of base/node_modules/pkg/.

The node_modules parsing itself is kind of a lazy way to do the package boundary - a package.json scope check could be another way to do the same thing. But the issue remains either way I think?

My concern is any logic to relax this will let the problematic case slip through.

guybedford avatar Nov 21 '19 22:11 guybedford

I think the specific hole I'd like to punch is: Allow a glob relative to the project directory that "definitely" (minus symlinks) doesn't include node_modules. Your example should be rejected because the wildcard expression would include node_modules in the glob remainder.

To elaborate on my proposal, in semi pseudo code:

if (pkgBase) {
   const nodeModulesBase = id.substr(0, id.indexOf(path.sep + 'node_modules')) + path.sep + 'node_modules' + path.sep; 
   if (!assetPath.startsWith(nodeModulesBase)) {
     // Allow after all if it's an app-level glob
     const appSubGlob = assetPath.substr(base.length + 1);
     const segments = appSubGlob.split(path.sep);
     if (
       !assetPath.startsWith(base) ||
       segments.length < 2 ||
       isGlobOrNodeModules(segments[0])
     ) {
       return false; 
     }
   } 
 } 

hybrist avatar Nov 21 '19 22:11 hybrist

Yes that sounds like it would work! If we can guarantee that whatever is being dived into is not a node_modules folder, then it must have been a folder the package knows exists below itself so is likely the correct intention.

Note the other thing to still be weary of is backtracking below the base (security risk), but that is still covered if there is a change of logic to this part since that check happens earlier I believe.

You happy to go ahead with a PR there further?

guybedford avatar Nov 21 '19 22:11 guybedford

May find some time to get a PR together for this. Thanks for the buddy-check on the overall approach!

hybrist avatar Nov 21 '19 23:11 hybrist

I also would like this feature. Blitz framework code (inside node_modules) is reading config files in the app directory at runtime. Currently that is not detected by NFT and the config files are eliminated.

flybayer avatar Jul 23 '20 13:07 flybayer