Finalize Design around Data Safety
I hosted an unconference on data safety at RustConf (notes here).
I would like to write a document establishing the following:
- Whereas it is rare to be 100% confident about the safety of your data. As a thought experiment, if you were 100% confident, you could use `get_unchecked` and other unsafe operations. If you are not confident enough to use unsafe code, then you are not 100% confident; and
- Whereas validating data invariants can be expensive, and we would like to minimize validation overhead. Moving validation into the algorithm (post-deserialization) often reduces the overall overhead of validation; and
- Whereas it is useful to be able to learn whether code traverses unexpected code paths; and
- Whereas end users of ICU4X algorithms are not in a position to reason about invalid data in otherwise-infallible terminal functions; and
- Whereas if an algorithm processing data uses GIGO (garbage in, garbage out), it does not increase the space that malicious actors could leverage. An attacker could construct data to produce a result they desire, regardless of whether unexpected operations end in GIGO; and
- Whereas algorithms that panic on malformed data actually increase the vulnerability space of an application, because they could enable attackers to perform denial-of-service attacks; be it
- Resolved that data structs should reduce the number of internal invariants by utilizing public fields; and be it
- Resolved that code should never panic at runtime based on invalid data; and be it
- Resolved that code paths only reachable by invalid data should use debug assertions.
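As a rough sketch of how the last two resolutions might look in practice: the `Table` struct and `lookup` function below are hypothetical and not taken from ICU4X. The data struct has public fields and no internal invariants, lookups use checked `get` rather than `get_unchecked`, and the branch that only invalid data can reach carries a debug assertion before falling back to a GIGO value.

```rust
/// Hypothetical data struct (not from ICU4X): public fields, no invariants
/// enforced at deserialization time.
pub struct Table {
    /// Maps a key (the index) to a position in `values`.
    pub offsets: Vec<usize>,
    pub values: Vec<u32>,
}

/// Looks up the value for `key`.
///
/// Returns 0 for unknown keys. If the data is malformed (an offset points
/// outside `values`), this also returns 0 in release builds (GIGO) and
/// panics only when debug assertions are enabled.
pub fn lookup(table: &Table, key: usize) -> u32 {
    let offset = match table.offsets.get(key) {
        Some(&offset) => offset,
        // An unknown key is an expected outcome, not a data error.
        None => return 0,
    };
    match table.values.get(offset) {
        Some(&value) => value,
        None => {
            // Only reachable with invalid data: flag it in debug builds,
            // return a default value otherwise.
            debug_assert!(false, "offset {} is out of range", offset);
            0
        }
    }
}
```

With this shape, validation happens lazily inside the algorithm, and a release build never panics on bad data.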
This could be paired with a crate that adds a `debug_unwrap_or` function to `Option` and `Result` (via a trait) that panics in debug mode and returns a default value in non-debug mode.
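A minimal sketch of what such a trait could look like. No such crate is named in this issue, so the trait name `DebugUnwrapOr`, the signature, and the `Debug` bound on the error type are all assumptions made for illustration.

```rust
/// Hypothetical extension trait; name and signature are assumptions.
pub trait DebugUnwrapOr<T> {
    /// Returns the contained value if there is one. Otherwise panics when
    /// debug assertions are enabled, and returns `default` when they are not.
    fn debug_unwrap_or(self, default: T) -> T;
}

impl<T> DebugUnwrapOr<T> for Option<T> {
    fn debug_unwrap_or(self, default: T) -> T {
        match self {
            Some(value) => value,
            None => {
                // Compiled to a no-op check when debug assertions are off.
                debug_assert!(false, "debug_unwrap_or called on None");
                default
            }
        }
    }
}

impl<T, E: core::fmt::Debug> DebugUnwrapOr<T> for Result<T, E> {
    fn debug_unwrap_or(self, default: T) -> T {
        match self {
            Ok(value) => value,
            Err(error) => {
                debug_assert!(false, "debug_unwrap_or called on Err: {:?}", error);
                default
            }
        }
    }
}
```

A call site could then read something like `values.get(offset).copied().debug_unwrap_or(0)`, keeping the fallback value and the debug-only check in a single expression.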
Agreed?
- [x] @Manishearth
(I will add more later after Manish approves)
Agreed.