Implement native terraform test code coverage generation
Terraform Version
Terraform v1.8.4
on linux_amd64
Use Cases
There doesn't seem to be a way to generate unit test coverage from a Terraform test. None of the current output options address coverage.
As HCL isn't an executable language where lines of code are run sequentially, I can see how this might be a hard-ish problem. However, it seems possible to mark a resource and/or attribute as "covered" if a test case evaluated it in the dependency graph of the test.
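For illustration, here is a minimal test file in the current format (all resource and variable names are invented for this sketch). It reports pass/fail per `run` block, but says nothing about which parts of the module under test the assertion actually exercised:

```hcl
# bucket.tftest.hcl -- hypothetical example
variables {
  bucket_name = "example-logs"
}

run "bucket_name_is_applied" {
  command = plan

  assert {
    condition     = aws_s3_bucket.logs.bucket == var.bucket_name
    error_message = "bucket name was not applied to aws_s3_bucket.logs"
  }
}
```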
Attempted Solutions
The output options only report "passed, skipped, or failed" at the test case level. There's no concept of coverage.
Proposal
No response
References
No response
Hi @PT-GD,
Can you expand a little on what you imagine code coverage would entail? Because of the declarative nature of the configuration, Terraform effectively evaluates the entire configuration every time. In that sense everything (with certain minor exceptions) is covered, so we would need a new metric for what "code coverage" means in this context.
@jbardin : That's a great question, and I hope I have at least an acceptable response. For context, I know that Terraform creates a Directed Acyclic Graph of dependencies in order to create an executable plan. Coverage, then, is notionally reversing that graph from test assertions to the underlying HCL.
This could be quite coarse in an initial ("alpha") implementation to validate the approach: "A resource block is covered if it appears in the DAG ancestors of the elements of a test assertion."
It could become more granular as the capability matures: "An argument of a resource is covered if its value is used directly or indirectly in a test assertion."
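As a sketch of the coarse rule (all names here are hypothetical): given a module with two resources where only one feeds the tested value, the assertion's DAG ancestors include `random_pet.covered` but not `random_pet.uncovered`, so only the former would be marked covered:

```hcl
# main.tf -- hypothetical module under test
resource "random_pet" "covered" {
  length = var.pet_length
}

resource "random_pet" "uncovered" {
  length = 2
}

output "pet_name" {
  value = random_pet.covered.id
}

# coverage.tftest.hcl -- hypothetical test file
run "name_has_expected_word_count" {
  command = apply

  assert {
    condition     = length(split("-", output.pet_name)) == var.pet_length
    error_message = "unexpected word count in pet name"
  }
}
```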
Thanks @PT-GD, that's a fair way to think about it. That notion, however, gets more complicated as configuration complexity grows, which means the granularity one might want from the coverage is currently infeasible. In order to produce a metric one can use as a measure of quality, we need to state what the minimum guarantees are, and at best we can only guarantee that some attribute of maybe one resource instance somehow contributed, at least indirectly, to a value used by a test assertion. Maybe that minimum is useful to some? While not a useful way to benchmark test quality, it could still prove useful while writing tests, just to flag resources you may have missed.
We already have this same problem with other analyses -- resources are referenced and composed within expressions as objects, and we cannot determine from the expression value how the result relates to the attributes of the original resource instances. This data provenance issue prevents a number of similar use cases, but we don't have any methods to deal with it yet, at least not without huge overheads in evaluation. While we already have a notion of "contributing attributes" for tracking the influence of external changes, it falls short in the same ways. The contributing attributes feature however isn't meant to be a reliable metric, but rather a hint of where to look if the data is of concern at all.
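A small illustration of the provenance problem (hypothetical names): once resource attributes are folded into a derived value, the result no longer records which attributes produced it:

```hcl
# Hypothetical: attributes from two resources flattened into one string.
locals {
  instance_summary = join(",", [aws_instance.a.id, aws_instance.b.ami])
}

# A test assertion on local.instance_summary depends on both resources
# as whole objects; attribute-level provenance is lost in the joined
# string, so attribute-granular coverage cannot be derived from the
# resulting value alone.
```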
While a stepwise release of new functionality often works well, I think we need a clear plan for how to achieve the end result, rather than ending up with a solution that can only take us partway there.
@jbardin I agree that only certain minor exceptions of configuration code get evaluated conditionally, but that can add up to a considerable amount of code, especially in complex modules with many conditional expressions, count/for_each meta-arguments, and dynamic blocks.
A code coverage report should include line and conditional/branch coverage to track whether tests have led to the evaluation of specific code.
Line
- Nested block with `dynamic` block type
- Top-level block constructs with `count`/`for_each` meta-argument
Conditional/Branch
- Conditional expression
- Boolean operator short-circuiting
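For concreteness, a hypothetical resource fragment touching each of the categories above (all names invented). A coverage tool would need to track which of these branches and generated blocks a test run actually caused to be evaluated:

```hcl
# Hypothetical module fragment illustrating line and branch coverage targets.
resource "aws_instance" "worker" {
  # count meta-argument guarded by a conditional expression:
  # branch coverage would track whether tests exercised both arms.
  count = var.enable_workers ? var.worker_count : 0

  # Conditional expression selecting an argument value.
  instance_type = var.env == "prod" ? "m5.large" : "t3.micro"

  # Dynamic nested block: zero or more copies depending on input,
  # so "line" coverage of the content block varies per test run.
  dynamic "ebs_block_device" {
    for_each = var.extra_disks
    content {
      device_name = ebs_block_device.value.name
      volume_size = ebs_block_device.value.size
    }
  }
}

locals {
  # Boolean operators: branch coverage would track which operand
  # combinations the test inputs actually produced.
  has_disks = var.enable_workers && length(var.extra_disks) > 0
}
```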