vacuum icon indicating copy to clipboard operation
vacuum copied to clipboard

Output 'true' node path in report

Open obfu5c8 opened this issue 2 years ago • 8 comments

Given a custom rule such as

 common-responses-server-error:
  description: "Missing 500 response"
  severity: error
  category:
    id: operations
  given: $.paths.*.*
  then:
    field: responses.500
    function: truthy

When the report is output to the terminal, each infraction is listed correctly, however the path shown in the output is just the verbatim given from the rule - this makes it difficult to quickly grok which operation is the issue.

Line / Column | Severity | Message                                           | Path
(23:7)        | error    | Missing 500 response: `responses.500` must be set | $.paths.*.*
(34:7)        | error    | Missing 500 response: `responses.500` must be set | $.paths.*.*
(45:7)        | error    | Missing 500 response: `responses.500` must be set | $.paths.*.*
(56:7)        | error    | Missing 500 response: `responses.500` must be set | $.paths.*.*
(78:7)        | error    | Missing 500 response: `responses.500` must be set | $.paths.*.*
(100:7)       | error    | Missing 500 response: `responses.500` must be set | $.paths.*.*

Is it possible to have the output show the 'true' path to the offending node? (e.g. $.paths./foo.get instead of $.paths.*.*)

obfu5c8 avatar May 12 '23 06:05 obfu5c8

Hi,

Unfortunately there isn't currently a way for vacuum to know the 'true' path when using some core functions, the way the engine works is the nodes are extracted according to the json path lookup, and then evaluated, the path provided is the path the lookup happens on.

Once those nodes come in to the function, they are evaluated.

I think some better docs by help explain this concept a bit better.

Trying to capture the path of the evaluated could be added as a feature request however.

daveshanley avatar May 12 '23 12:05 daveshanley

@daveshanley This is very much necessary for my use-case - spectral does this so its a feature gap that is preventing me from switching over to Vacuum

AdrianMachado avatar Jul 04 '23 20:07 AdrianMachado

I am in the process of upgrading all the built in rules, to ensure paths are correct. This is being powered by a new product I am developing.

This use case however cannot be satisfied yet however, even with a much more powerful model underneath that knows the path to every single object, custom rules do NOT use this model (it's programatic).

This means that there is no way to know where in time and space the result is coming from, because custom functions use the JSON Path provided by the rule.

Custom rules have no concept of paths, because the code extracts raw nodes from the underlying document, based on the query provided.

The only way to extract the paths correctly, is to use the datamodel programmatically.

The only way to make this feature possible with custom rules, is to build a custom JSON Path parser and crawl the model looking for those nodes. This is a huge undertaking and well out of scope for the time I have spare for free work.

daveshanley avatar Dec 29 '23 12:12 daveshanley

@daveshanley Understood. I wonder how Spectral achieves this (maybe this is why they are so much slower haha)

AdrianMachado avatar Jan 31 '24 05:01 AdrianMachado

wondering given the fact that you have the line number if this could not be then back tracked ? ie going from the leaf and get back to the root (and so the path ) ...

LasneF avatar Jan 31 '24 08:01 LasneF

@daveshanley Understood. I wonder how Spectral achieves this (maybe this is why they are so much slower haha)

The library they have may or may not allow for reverse yaml traversal (you can walk backwards from any point in the AST). However the one we use does not, it's a one way walk down only, there is no parent reference to allow us to walk back up the tree.

https://pkg.go.dev/gopkg.in/yaml.v3#Node

There is no Parent property available for us to walk back up.

Once we drop down into a node after the search using the path.. we can only build from that point down, we have no idea where we came from. It's a one way journey.

The solution would be to update the library to add a Parent property to allow us to walk backward. We have done that with our programatic model for OpenAPI in libopenapi which is why the OWASP rules all now have accurate paths.

But, for this, we have no way forward, without modifying the underlying yaml library to support a two way walk.

daveshanley avatar Feb 03 '24 21:02 daveshanley

wondering given the fact that you have the line number if this could not be then back tracked ? ie going from the leaf and get back to the root (and so the path ) ...

As mentioned in the above comment, we can only walk one way, it's a one way tree, you cannot walk back up it.

https://pkg.go.dev/gopkg.in/yaml.v3#Node

daveshanley avatar Feb 03 '24 21:02 daveshanley

@daveshanley Understood. I wonder how Spectral achieves this (maybe this is why they are so much slower haha)

The library they have may or may not allow for reverse yaml traversal (you can walk backwards from any point in the AST). However the one we use does not, it's a one way walk down only, there is no parent reference to allow us to walk back up the tree.

https://pkg.go.dev/gopkg.in/yaml.v3#Node

There is no Parent property available for us to walk back up.

Once we drop down into a node after the search using the path.. we can only build from that point down, we have no idea where we came from. It's a one way journey.

The solution would be to update the library to add a Parent property to allow us to walk backward. We have done that with our programatic model for OpenAPI in libopenapi which is why the OWASP rules all now have accurate paths.

But, for this, we have no way forward, without modifying the underlying yaml library to support a two way walk.

Yeah that makes sense. Would be nice to have eventually, but I understand this would probably be a big undertaking and is out of scope

AdrianMachado avatar Feb 05 '24 16:02 AdrianMachado

There is a solution to this, inbound.

daveshanley avatar Jul 31 '24 22:07 daveshanley

There is a solution to this, inbound.

OMG amazing!

AdrianMachado avatar Aug 05 '24 16:08 AdrianMachado

Fixed in v0.12 https://github.com/daveshanley/vacuum/releases/tag/v0.12.0

daveshanley avatar Aug 05 '24 22:08 daveshanley