h3 icon indicating copy to clipboard operation
h3 copied to clipboard

Add range checks

Open alexey-milovidov opened this issue 4 years ago • 5 comments

Implement https://github.com/uber/h3/issues/392

Not sure if everything is correct, I did not dig into the algorithms. But at least it fixes https://github.com/ClickHouse/ClickHouse/issues/19219

There should not be significant performance degradation for two reasons:

  • the branch is easy predictable and it can be executed in parallel due to instruction level parallelism;
  • the algorithms are already heavy enough, the difference in a few clock ticks should be neglible.

alexey-milovidov avatar Jan 21 '21 18:01 alexey-milovidov

Coverage Status

Coverage decreased (-0.7%) to 98.516% when pulling af8068a99f3193ee050e3da81cc40f30a92e9d2e on ClickHouse-Extras:clickhouse-range-checks into dd0b1e554fec63fc4698fec98445fddcbbecc903 on uber:master.

coveralls avatar Jan 21 '21 18:01 coveralls

I see there are some issues with formatting. I don't think I'm ready to finish this PR, feel free to apply any edits or close and resubmit another PR...

alexey-milovidov avatar Jan 21 '21 19:01 alexey-milovidov

Thanks for submitting this! You should be able to fix the test failures with make format. The bigger issue here is probably test coverage - we'll need to add tests for the new branches introduced here.

@isaacbrodsky - thoughts on bringing this into 3.x? This is potentially a security issue.

nrabinowitz avatar Jan 22 '21 00:01 nrabinowitz

I haven't been following the issues daily so sorry for being late to the conversation again...

The lack of range checks on arrays throughout H3 core (including the recent baseCells array merge) was intentional (as it is in the C language itself). Personally I think that performance degradation in H3 core should be avoided at all costs. The hit might be small for each change, but such things add up. Rather than changing the core functions, I would strongly encourage you to add "safe" wrapper functions that can be used in situations where the user doesn't want the onus of checking their own array indexes when necessary.

sahrk avatar Jan 22 '21 01:01 sahrk

To @sahrk' point, another suggestion posted in the issues is to expand the base cells array and ensure we never calculate an index into that array outside of 0-127 inclusive.

I should point-out that the baseCells and the faces in faceIjkToBaseCell are not the only arrays that are accessed without bounds checking in the core. There are a whole bunch of arrays in the core, many with multiple dimensions. If I ever did bounds checking on any of them it was by mistake :).

The core was designed to be as tight as possible, which is why so much of it is table lookups, which are as cheap as it gets in C. In the case of the baseCells we're talking about a function that's just a wrapper for an array access. However you slice it, if you add bounds checking you're at least doubling the cost of the function.

I would like to ensure that H3 is resilient to dereferencing out of bounds. Beyond the security concerns, this type of crash is often very difficult to debug, especially when it happens in a library loaded into a manageed framework. I would be willing to create a new 3.x release to incorporate this kind of change.

That is all reasonable of course. Earlier I suggested making safe versions of table lookup functions that are exposed to external users (like baseCells), but what if you made the API functions safe and created internal functions without bounds checking for use just in the core?

sahrk avatar Jan 22 '21 05:01 sahrk