berry icon indicating copy to clipboard operation
berry copied to clipboard

[Feature] Add a hook before yarn add

Open staldera opened this issue 1 year ago • 1 comments

  • [ ] I'd be willing to implement this feature (contributing guide)
  • [x] This feature is important to have in this repository; a contrib plugin wouldn't do

Describe the user story

As a developer I want to run a vulnerability check and some extra checks BEFORE adding a package with yarn add (eg: run a tool such as https://github.com/lirantal/npq). Thus, I looked into hooks, but I could only find one that's called before add, and it's validateProject. Sadly this one is called very often and doesn't contain the original command in the parameters (so no way to know if it's called as part of yarn install or yarn add or yarn <something else>).

Describe the solution you'd like

I'd like to have a hook that is called before adding a package, and has a parameter that gives me the packages that are supposed to be added. This way I can run vulnerability checks (eg: yarn npq add <package1> <package2> --dry-run) and proceed (if no problems) or signal a problem/exit/throw.

Having a report parameter such as there is in multiple existing hooks would be nice, because that'd allow me to use it to stop the current command cleanly if a vulnerability was found.

Ideally, I also would like to add a parameter such as --force to proceed despite warnings, but that might be an issue for yarn add command itself. Willing to do some extra magic myself to handle this case (eg: using process.env vars or a not-so-nice syntax such as yarn add force <package1> <package2>)

Describe the drawbacks of your solution

User-wise, I think there's no drawbacks, as you wouldn't need to use the hook if you don't need it.

Describe alternatives you've considered

I considered other alternatives, such as

  • Replacing the add command completely by a custom one: Not possible AFAIK, not with scripts in package.json, nor with a command based on BaseCommand. If it was possible, that'd also fit my use-case though.
  • Adding a custom command (eg: addsafe) and blocking the add command (through an error message): AFAIK also not possible to block a command currently. Would be less convenient than the first option, and might not work, since in the end (if the error vulnerability report is ok) I still want to add the package.

staldera avatar Mar 05 '24 10:03 staldera

I would also like this

I'd like to check that yarn add yarn or yarn install are being run with the the package.json packageManager version that is specificied, so I don't run into immutable errors on CI when the yarn.lock ends up changing.

jonsherrard avatar Oct 11 '24 23:10 jonsherrard

There are the afterWorkspaceDependencyAddition and afterWorkspaceDependencyReplacement hooks which runs after the workspace manifest data is updated but before running project.install(). No report is passed to those hooks but you can throw to abort the addition/replacement. If you do need a report you can start one yourself inside the hook or store the issues outside the hook and report them in validateProject

clemyan avatar Mar 26 '25 14:03 clemyan

Hi, thanks a lot for the feedback! I managed to do most of what I wanted to do with those two hooks and storing the results outside the hook to report them in validateProject, but the idea to start a report inside the hook sounds a bit cleaner to me, how would I do that?

staldera avatar Apr 01 '25 11:04 staldera

You can just new one of the Report classes (e.g. StreamReport), or use something like StreamReport.start which passes a report to a callback. See installWithNewReport for an example.

clemyan avatar Apr 01 '25 18:04 clemyan

I managed this:

          async afterWorkspaceDependencyAddition(workspace, target, descriptor) {
            const report = await StreamReport.start({
              configuration: workspace.project.configuration,
              json: false,
              stdout: Cli.defaultContext.stdout,
              forceSectionAlignment: true,
              includeLogs: true,
              includeVersion: true,
            }, async report => {
              const npqIssues = checkDependency(descriptor.scope, descriptor.name, descriptor.range, report)
              if (npqIssues) {
                if(process.env.NPQ_FORCE_INSTALL) {
                  report.reportWarning(MessageName.DANGEROUS_NODE_MODULES, `Issues were found with some package(s), but force mode is set. Proceeding:\n${npqIssues}`)
                }
                else {
                  report.reportError(MessageName.DANGEROUS_NODE_MODULES, `Issues were found with some package(s), use yarn add-force [package(s)] or yarn up-force [package(s)] if you still want to proceed:\n${npqIssues}`)
                }
              }
            });
          },

However, unlike when I use reportError in validateProject, the process doesn't exit. Is there something I missed or is that intentional?

staldera avatar Apr 04 '25 08:04 staldera

Closed the issue by accident, sorry :(

staldera avatar Apr 04 '25 08:04 staldera

Yes. The "report" you get in install hooks (e.g. validateProject) is a special one that aborts the install if reportError is called. In afterWorkspaceDependencyAddition, you need to actually throw to abort. For StreamReport you can check report.hasErrors() to see if reportError has been called.

clemyan avatar Apr 07 '25 16:04 clemyan

OK, then I'll use env vars, as it allows to list issues from multiple dependencies (eg: when running yarn add dep1 dep2) rather than crashing immediately.

Thanks a lot for the support, the issue can now be closed. Is there a donation link for the project by any chance?

Also, here's the code I used if this can be helpful to anyone trying to use NPQ inside yarn hooks. It's far from perfect, but it's doing the job for now, and can easily be adapted.

module.exports = {
  name: `hooks`,
  factory: require => {
    const {MessageName} = require(`@yarnpkg/core`)
    const {spawnSync} = require(`child_process`)
    const t = require(`typanion`);

    const checkDependency = (scope, name, range) => {
      const descriptor = scope ? `@${scope}/${name}@${range}` :`${name}@${range}`
      const result = spawnSync(`yarn npq add ${descriptor} --dry-run`, {
        encoding: 'utf-8',
        shell: true,
      })
      const packageIssuesHeader = "Detected possible issues with the following packages:"
      const npqIssues = result.stdout.split(packageIssuesHeader)[1]
      if(!!npqIssues) {
        return npqIssues
      }
      return undefined
    }

    return {
      default: {
        hooks: {
          async afterWorkspaceDependencyAddition(workspace, target, descriptor) {
            const npqIssues = checkDependency(descriptor.scope, descriptor.name, descriptor.range)
            if(npqIssues) {
              if (!process.env.NPQ_ISSUES) {
                process.env.NPQ_ISSUES = ""
              }
              process.env.NPQ_ISSUES += npqIssues
            }
          },
          async afterWorkspaceDependencyReplacement(workspace, target, oldDescriptor, newDescriptor) {
            const npqIssuesNew = checkDependency(newDescriptor.scope, newDescriptor.name, newDescriptor.range)
            if(npqIssuesNew) {
              const npqIssuesOld = checkDependency(oldDescriptor.scope, oldDescriptor.name, oldDescriptor.range)
              const oldLines = npqIssuesOld.split(/\r?\n|\r|\n/g).map(line => line.trim()).filter(line => line.startsWith('-'))
              const newLines = npqIssuesNew.split(/\r?\n|\r|\n/g).map(line => line.trim()).filter(line => line.startsWith('-'))
              const hasNewIssues = new Set([...oldLines, ...newLines]).size > oldLines.length
              if(hasNewIssues) {
                if (!process.env.NPQ_ISSUES) {
                  process.env.NPQ_ISSUES = ""
                }
                process.env.NPQ_ISSUES += npqIssuesNew
              }
            }
          },
          validateProject(project, report) {
            const isTrue = t.isOneOf([
              t.isLiteral(true),
              t.isLiteral("true"),
              t.isLiteral("True"),
              t.isLiteral("1"),
              t.isLiteral(1),
            ])
            if(process.env.NPQ_ISSUES) {
              if(isTrue(process.env.NPQ_STRICT_MODE)) {
                report.reportError(MessageName.DANGEROUS_NODE_MODULES, `Detected possible issues with some package(s). Strict mode is set, change env var NPQ_STRICT_MODE if you want to proceed:\n${process.env.NPQ_ISSUES}`)
              }
              else {
                report.reportWarning(MessageName.DANGEROUS_NODE_MODULES, `Detected possible issues with some package(s). Strict mode isn't set, proceeding:\n${process.env.NPQ_ISSUES}`)
              }
            }
          }
        },
      }
    }
  }
}

staldera avatar Apr 15 '25 09:04 staldera

A few comments on your implementation:

  1. You can use a shared variable to pass values between hooks via closure instead of env vars
  2. It currently passes the check if npq is not installed
  3. It is better to use an actual configuration to control strict mode

Here is an enhanced version that takes care of most the low-hanging fruits.:

module.exports = {
  name: 'yarn-plugin-npq',
  factory(require) {
    const { MessageName, structUtils } = require('@yarnpkg/core');
    const { spawnSync } = require(`child_process`)

    const checkDependency = (descriptor) => {
      const result = spawnSync(`yarn run -B npq add ${structUtils.stringifyDescriptor(descriptor)} --dry-run`, {
        encoding: 'utf-8',
        shell: true,
      });
      if (result.status !== 0) {
        return "  Unable to invoke npq. Please make sure it is installed";
      }
      const packageIssuesHeader = "Detected possible issues with the following packages:";
      return result.stdout.split(packageIssuesHeader)[1] || undefined;
    };

    let issues = '';
    const plugin = {
      configuration: {
        npqStrictMode: {
          type: `boolean`,
          default: false,
          description: `Abort install if npq detects issues`,
        },
      },
      hooks: {
        async afterWorkspaceDependencyAddition(workspace, target, descriptor) {
          issues += checkDependency(descriptor) ?? '';
        },
        async afterWorkspaceDependencyReplacement(workspace, target, oldDescriptor, newDescriptor) {
          const npqIssuesNew = checkDependency(newDescriptor)
          if (npqIssuesNew) {
            const npqIssuesOld = checkDependency(oldDescriptor);
            const oldLines = npqIssuesOld?.split(/\r?\n|\r|\n/g).map(line => line.trim()).filter(line => line.startsWith('-')) ?? [];
            const newLines = npqIssuesNew.split(/\r?\n|\r|\n/g).map(line => line.trim()).filter(line => line.startsWith('-'));
            const hasNewIssues = new Set([...oldLines, ...newLines]).size > oldLines.length;
            if (hasNewIssues) {
              issues += npqIssuesNew;
            }
          }
        },
        validateProject(project, report) {
          if (issues) {
            if (project.configuration.get(`npqStrictMode`)) {
              report.reportError(MessageName.UNNAMED, `Detected possible issues with some package(s). Strict mode is set, aborting:`);
              issues.split('\n').forEach(line => report.reportError(MessageName.UNNAMED, line));
            } else {
              report.reportWarning(MessageName.UNNAMED, `Detected possible issues with some package(s). Strict mode isn't set, proceeding:`);
              issues.split('\n').forEach(line => report.reportWarning(MessageName.UNNAMED, line));
            }
          }
        }
      },
    }

    return plugin;
  }
}

There are some more enhancement opportunities if you or anyone wants to pursue them:

  1. npq being invoked synchronously means adding many packages can cause the process to appear to hang without any output. Maybe we can spawn the process in the dependency hooks and then wait for them in validateProject?
  2. Is there a better way to ensure npq is installed than this?
  3. Is there a way to invoke npq without the overhead of going through the shell each time? Maybe only collect descriptors in dependency hooks and only call npq once in validateProject? Or maybe contribute a PR to npq to enable programmatic using and bundle it with the plugin?
  4. npq doesn't understand all types of descriptor that Yarn understands, presumably only npm: descriptors. Looks like it does not even fully understand all types of semver ranges. It is probably nice if we can limit what dependencies we pass to npq.

clemyan avatar Apr 16 '25 13:04 clemyan