simplecov icon indicating copy to clipboard operation
simplecov copied to clipboard

Adapt serialization/resultset mechanism (problems recovering branch coverage data)

Open PragTob opened this issue 6 years ago • 3 comments

JSON serialization/parsing works fine for line coverage, however for branch coverage the structure we have is that of an array that is the key pointing to values. Sadly, that breaks when dumping and parsing from JSON:

irb(main):031:1* map = {
irb(main):032:1*   [:if, 0, 3, 11, 3, 33] =>
irb(main):033:1*   {[:then, 1, 3, 23, 3, 27] => 1, [:else, 2, 3, 30, 3, 33] => 0
},
irb(main):034:1*   [:if, 3, 6, 6, 10, 9] =>
irb(main):035:1*   {[:then, 4, 7, 8, 7, 12] => 0, [:else, 5, 9, 8, 9, 11] => 1}
irb(main):036:0> }
irb(main):037:0> JSON.parse JSON.dump map
=> {"[:if, 0, 3, 11, 3, 33]"=>{"[:then, 1, 3, 23, 3, 27]"=>1, "[:else, 2, 3, 30, 3, 33]"=>0}, "[:if, 3, 6, 6, 10, 9]"=>{"[:then, 4, 7, 8, 7, 12]"=>0, "[:else, 5, 9, 8, 9, 11]"=>1}}

Which needs to be solved via either regexp based parsing or evaling strings (currently on my branch) which is medium fine as we read our own output but doesn't sound quite great.

This particula scenario does work when using YAML though:

irb(main):038:0> YAML.load YAML.dump map
=> {[:if, 0, 3, 11, 3, 33]=>{[:then, 1, 3, 23, 3, 27]=>1, [:else, 2, 3, 30, 3, 33]=>0}, [:if, 3, 6, 6, 10, 9]=>{[:then, 4, 7, 8, 7, 12]=>0, [:else, 5, 9, 8, 9, 11]=>1}}

However, YAML support was removed way back in 0.5.0. I haven't been around then so I don't know about the errors caused there and if that's still relevant.

Right now there is a third problem, namely we parse symbolizing keys which creates tons of symbols just to transform them to strings again for evaling/parsing.

Anyhow, it's a problem. Possible solutions with different levels of "solving it":

  • stop symbolizing keys (minimum requirement)
  • evaluate storing YAML again instead of JSON
  • store a format that is further away from the original coverage data but that is easier to parse again

@colszowka input especially on the YAML thing rather welcome :)

#781

PragTob avatar Dec 29 '19 14:12 PragTob

I solved it by just casting hashes to arrays in my fork (https://github.com/umbrellio/simplecov/blob/add-branch-and-method-coverage/lib/simplecov/result_serialization.rb).

So basically stuff like { [:x, 1, 2, 3] => [:y, 4, 5, 6] } gets converted to [[:x, 1, 2, 3], [:y, 4, 5, 6]], which can be safely dumped to JSON.

tycooon avatar Dec 30 '19 11:12 tycooon

@tycooon ah that's a nice idea, thank you very much! :tada:

PragTob avatar Dec 30 '19 19:12 PragTob

For reference, we stopped symbolizing all keys. I'm unsure what to do regarding YAML vs JSON and the tip to store them as arrays instead of evaling them.

I guess/think that all of the storage/noramlization is up for some work as way too many parts just do JSON.parse(JSON.dump(thing)) which is overhead we should be able to reduce through some normalization.

(Albeit in the end it could end up as being slower with smaller normalization as JSON might be C-backed while our normlaization won't be ;) )

PragTob avatar Jan 23 '20 14:01 PragTob