Docs: Clarify behaviour of `--affected` flag, and whether it takes `inputs` into account
What is the improvement or update you wish to see?
I've just run into an issue where the --affected flag does not take a task's inputs into account, which I think could be made clearer in the documentation. In my case, a GraphQL schema in a parent directory is one of the inputs of a codegen task, so a change to it should cause cache misses in that task and in every task that depends on it. After adding --affected to my CI runs, I noticed that a change to the schema file doesn't cause the codegen tasks to run, because they are not deemed to be affected by the change (nor are any other tasks). As a result, CI passes even though the change introduces a type error.
Looking more closely at the documentation, I see a mention of "packages with code changes", which could suggest that only changes within the packages, and not their inputs, are considered? If this is the case, I think the docs could be made a lot clearer on this point.
Is there any context that might help us understand?
I considered whether this should be a bug report or a documentation request, but given that the language is a little unclear, I decided on the latter.
Does the docs page already exist? Please link to it.
https://turbo.build/repo/docs/crafting-your-repository/constructing-ci#running-only-affected-tasks
Thanks for the issue! To clarify, this is a mismatch between hashing and package filtering, two different parts of the Turborepo execution model.
When Turborepo receives a --filter flag (--affected is just a filter flag under the hood), it uses that filter to compute the set of packages that it needs to build. For a git range, that is the set of packages that changed within that range. Note that this is a set of packages, so we cannot take inputs into account at all, since those are defined at the task level. Once that set of packages is computed, we build the package dependency graph, and then the task dependency graph. When it comes to actually executing the tasks, we compute a task hash based on several pieces of information, including the hashes of the task's inputs. We then use that hash to check whether the task has already been cached, and restore from cache if so.
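To make the ordering concrete, here is a minimal sketch of the package-level step described above. It is a simplified model and not Turborepo's actual code: the `Pkg` type, the `isUnder` and `changedPackages` helpers, and the package names and paths are all hypothetical, and the expansion to dependents is left out for brevity.

```typescript
// Hypothetical model of package-level filtering; not Turborepo's actual code.
interface Pkg {
  name: string;
  dir: string; // package root, relative to the repository root
}

// True when `file` lives inside the directory `dir`.
function isUnder(file: string, dir: string): boolean {
  return file.slice(0, dir.length + 1) === dir + "/";
}

// Package-level step: select packages that contain at least one changed file.
// Task-level `inputs` are not consulted at this stage.
function changedPackages(pkgs: Pkg[], changedFiles: string[]): string[] {
  return pkgs
    .filter((p) => changedFiles.some((f) => isUnder(f, p.dir)))
    .map((p) => p.name);
}

const pkgs: Pkg[] = [
  { name: "web", dir: "apps/web" },
  { name: "api", dir: "apps/api" },
];

// Suppose web's codegen task lists the root-level schema.graphql as an input.
// The file belongs to no package, so nothing is selected and no task hash
// is ever computed for it.
const schemaOnly = changedPackages(pkgs, ["schema.graphql"]);
const webChange = changedPackages(pkgs, ["apps/web/src/index.ts"]);

console.log(schemaOnly); // empty array
console.log(webChange); // contains only "web"
```

In this model the schema change is dropped before hashing is reached, which matches the behaviour reported in the issue.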
In short, we can't take inputs into account because those are task inputs, and --filter/--affected works at the package level. As a workaround, you could try putting the schema in a package, say @foo/schema, and have the other packages depend on that package. When the schema changes, those packages will all be invalidated with --affected.
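The workaround can be sketched with the same kind of simplified model. This assumes, as the suggestion above implies, that the affected set also pulls in dependents of changed packages. The `Pkg` type, the `affectedPackages` helper, and every package name and path other than @foo/schema are hypothetical.

```typescript
// Hypothetical model of the workaround; names and paths are illustrative.
interface Pkg {
  name: string;
  dir: string; // package root, relative to the repository root
  deps: string[]; // names of workspace packages this package depends on
}

// True when `file` lives inside the directory `dir`.
function isUnder(file: string, dir: string): boolean {
  return file.slice(0, dir.length + 1) === dir + "/";
}

// Select packages containing a changed file, then keep adding dependents
// of selected packages until the set stops growing.
function affectedPackages(pkgs: Pkg[], changedFiles: string[]): string[] {
  const selected = pkgs
    .filter((p) => changedFiles.some((f) => isUnder(f, p.dir)))
    .map((p) => p.name);
  let grew = true;
  while (grew) {
    grew = false;
    for (const p of pkgs) {
      const alreadyIn = selected.indexOf(p.name) !== -1;
      const dependsOnSelected = p.deps.some((d) => selected.indexOf(d) !== -1);
      if (!alreadyIn && dependsOnSelected) {
        selected.push(p.name);
        grew = true;
      }
    }
  }
  return selected.sort();
}

const workspace: Pkg[] = [
  { name: "@foo/schema", dir: "packages/schema", deps: [] },
  { name: "web", dir: "apps/web", deps: ["@foo/schema"] },
  { name: "docs", dir: "apps/docs", deps: [] },
];

// The schema now lives inside a package, so changing it selects that package
// and, through the dependency edge, "web" as well. "docs" is left out.
const result = affectedPackages(workspace, ["packages/schema/schema.graphql"]);
console.log(result);
```

Because the schema file now belongs to a package, the package-level filter can see the change without knowing anything about task inputs.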
Thanks for the clarification. I see the documentation mostly uses terminology around packages, but I think it would also help if the CLI option description in turbo run --help were updated to reflect the distinction between task-level and package-level filtering.
-Run only tasks that are affected by changes between the current branch and `main`
+Filter to only packages that are affected by changes on the current branch and `main`
Seems the root cause here is already tracked in #4678.