turbo icon indicating copy to clipboard operation
turbo copied to clipboard

Negative filter not respected for transitive dependencies

Open maschwenk opened this issue 3 years ago • 2 comments

What version of Turborepo are you using?

1.2.14

What package manager are you using / does the bug impact?

pnpm

What operating system are you using?

Linux

Describe the Bug

Turbo's filter does not seem to respect the same semantics as pnpm's. For instance, with a DAG like:

package_a 
   |
    -----> package_i_dont_want_to_build

This command works and brings no packages into scope

turbo --filter=package_i_dont_want_to_build --filter='!package_i_dont_want_to_build`

This command

turbo --filter=package_a --filter='!package_i_dont_want_to_build`

builds package_a and package_i_dont_want_to_build.

Expected Behavior

A negative filter should be absolute and --filter='!package_i_dont_want_to_build' should mean that that package is never built, regardless of the DAG. For reference, this does work when using pnpm's filters.

To Reproduce

Described in the bug

maschwenk avatar Jun 01 '22 18:06 maschwenk

turbo's filter syntax is used for selecting entrypoints into the dependency graph. You are indeed selecting the same set of packages as entrypoints as with pnpm, but turbo has an extra layer of dependency on top of what pnpm does: task dependency.

If you've said (via turbo.json) that building a package depends on building its dependencies, then when your combination of filters selects package_a as an entrypoint, turbo goes to see what is required to do that. It finds the dependency (expressed via package_a's package.json) on package_i_dont_want_to_build, and combines that with the request to build dependencies when building a package to reach the conclusion that it should build package_i_dont_want_to_build.

If you don't mind, can you tell me more about your use case? It appears that you have a dependency that you don't want built. Is there a particular reason to avoid building it? Is it an actual dependency? Perhaps it could be a peerDependency instead?

There is also the --only flag to explicitly not traverse the task dependencies. In that case, turbo will build package_a, but none of package_a's dependencies.

gsoltis avatar Jun 02 '22 21:06 gsoltis

@gsoltis I can elaborate on the use-case, I'm facing: We have package-a which depends on package-b. On our local machines we want package-b to build when building package-a but on CI, we want to issue a slightly different command to build package-b (we want to prebuilt our binary without debug symbols). So we want to issue the build command separately and simply skip building package-b to avoid it overriding the output.

I tried your suggestion to use --only but I can't get that to work:

// packages/a/package.json
{
  "name": "package-a",
  "scripts": {
    "build": "echo 'Building A'"
  },
  "dependencies": {
    "package-b": "*"
  }
}
// packages/b/package.json
{
  "name": "package-b",
  "scripts": {
    "build": "echo 'Building B'"
  }
}
// turbo.json
{
  "$schema": "https://turborepo.org/schema.json",
  "pipeline": {
    "build": {
      "dependsOn": [
        "^build"
      ]
    }
  }
}

Running:

npx turbo run build --only --filter=package-a --filter='!package-b' --dry-run

Yields:

Packages in Scope
Name      Path       
package-a packages/a 

Tasks to Run
package-b#build
  Task          = build                             
  Package       = package-b                         
  Hash          = 327ee10c7cd6abb7                  
  Directory     = packages/b                        
  Command       = echo 'Building B'                 
  Outputs       =                                   
  Log File      = packages/b/.turbo/turbo-build.log 
  Dependencies  =                                   
  Dependendents = package-a#build                   
package-a#build
  Task          = build                             
  Package       = package-a                         
  Hash          = c02824b4e8a12fa3                  
  Directory     = packages/a                        
  Command       = echo 'Building A'                 
  Outputs       =                                   
  Log File      = packages/a/.turbo/turbo-build.log 
  Dependencies  = package-b#build                   
  Dependendents =       

I wouldn't expect "package-b#build" in "Tasks to Run".

kraenhansen avatar Jun 29 '22 11:06 kraenhansen

@kraenhansen yeah only I believe is just for ignoring dependencies of pipeline commands, so if you had test depending on build you could skip running build when you call test for a one off.

@gsoltis The use case I'm looking at is developing a dev server. Right now we boot up a file watcher process that spawns a child process of the actual app server. We enumerate the dependencies of the app with pnpm, and then watch those file paths for changes. When there's a change to those paths, we tell pnpm to build the packages that have changed. We then restart the child process app server to pickup that new code (in the case of code changed in the app code itself, we do a hot reload instead of a restart, but that's besides the point). pnpm is strict and if you pass a package with a filter, it'll just build that package, and not care about its dependencies unless you actually add the ... e.g. --filter=blah.... Because almost all turbo workspaces will have:

 "build": {
    "dependsOn": ["^build"]
  }

that means there's really no way to prevent it from doing those dependent builds, even if they are cached. I guess because we're caching the dependent rebuilds shouldn't matter, because turbo should realize that if I'm asking it to rebuild a package, it shouldn't rebuild its children unless they've changed, it just seems like it'd be nice to not have to even evaluate the cache in those scenarios. Maybe we should just change our config to:

 "build": {
    "dependsOn": ["build"]
  }

and then when we really want to build dependents we can explicitly call that. Would love to jump in a call/zoom to go over this dev server use case, wondering if you have better options. We do some nonstandard stuff in our dev server like hot reloads and lazy-require patching that wouldn't jive well with just replacing it with a nodemon setup like I've seen in the docs. Just curious if there are better paved paths here for that use case.

maschwenk avatar Dec 23 '22 18:12 maschwenk

Ok looks like one way I'm able to achieve what I want is:

"pipeline": {
    "build": {},
    "buildAllWithDependents": {
      "dependsOn": ["^build"]
    }
},

and then I can choose when I want to have the flowdown builds happening vs not. Unsure of the implications of this, but I think should work for my usecase. Still curious if there are better recommendations for the dev-server use case.


One other use case we had was where we an E2E project depending on an app (our only app-level dependency) and this would mean if we called turbo build --filter=e2e it would build our app code, which we didn't want because we wanted it to run in a dev server mode, but that's sorta a secondary and more niche use-case. We've since decided to ban dependencies on apps.

maschwenk avatar Dec 23 '22 18:12 maschwenk

@kraenhansen noting that the above doesn't actually work, because calling turbo build as defined above will mean that the packages won't build their dependencies, but they also won't build in topological order. If you truly want to run things ignoring topological order, you can achieve this all a bit simpler by just using the parallel flag. Still trying to think through better options here.

maschwenk avatar Jan 06 '23 15:01 maschwenk

We would typically recommend building a separate dependency graph, possibly using synthetic tasks to accomplish this. Filtering at the command line is not an easily-shareable configuration mechanism, and it's way better to maintain within a declarative turbo.json.

nathanhammond avatar Aug 21 '23 16:08 nathanhammond