Language vs. implementation threat models and implications for TypeId collision resistance
From #129014 and #10389
If you instead want to argue that the lang team should change its mind, please open a new issue and gather arguments in favor of that position, so that a summary of all the arguments for either option can be brought to the lang team for discussion.
I think we should take a step back and look at the bigger picture. I have tried to get people to spell out their threat model so that we can use it as a razor: close or postpone all bug reports that fall outside that model, rather than making incoherent attempts to fix some things while leaving similar or more easily exploitable things open for the foreseeable future.
Previously the lang team decided that it would be sufficient to use a 256-bit cryptographic hash. If we assume, as #129014 does, that this statement was also meant as a lower bound required to satisfy T-lang's threat model, then it has implications for what kinds of things a language implementation should defend against. But afaict the rationale was not documented, so it remains unclear exactly which threats an implementation should defend against.
On the other hand it has been said in several places – e.g. in the context of build script sandboxing ¹ ², I-unsound issues, LLVM bugs and possibly incremental compilation – that rustc currently does not and cannot promise to be robust against malicious inputs.
So it seems like things are underspecified and there's tension between what the language should ideally offer and what implementations can offer. This in turn raises the question of how the gap between the upper bound of what the spec allows/wants and what implementations offer should be handled.
It's also unclear to me to what extent this is a lang vs. a compiler decision.
Now to the concrete case of TypeId:
At first blush it seems that as long as the compiler does not try to resist malicious inputs on many fronts (proc macros, build.rs, compiler bugs, language bugs, LLVM bugs), it does not make much sense for an actual TypeId implementation to spend much effort defending against them. And given the de-facto security level that the compiler can offer, a 128-bit hash that only accounts for non-malicious inputs should remain sufficient for the moment, which would also be consistent with previous T-compiler decisions.
There might be some arguments that TypeId is different from other holes, e.g. @briansmith tried here to gesture at the difficulty of detecting exploits, but there were counterpoints regarding the reliability and the possibility of such a scenario, and we failed to reach agreement in subsequent discussion.
Imo the question "What makes TypeId sit on one side of the security fence while other things sit on the other side?" remains unanswered.
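To make the "non-malicious inputs" reasoning a bit more concrete, here is a rough sketch of the usual birthday-bound estimate for accidental collisions. It assumes hashes behave like uniformly random values and says nothing about an attacker deliberately searching for collisions. The function name `collision_probability` and the figure of one billion distinct types are made up for illustration and are not taken from rustc.

```rust
/// Approximate probability that at least two of `n` uniformly random
/// `bits`-bit hashes collide, using the birthday bound p ≈ n² / 2^(bits+1).
/// The approximation is only meaningful while p is small, so the result
/// is clamped to 1.0.
fn collision_probability(n: f64, bits: i32) -> f64 {
    let p = n * n / 2.0_f64.powi(bits + 1);
    p.min(1.0)
}

fn main() {
    // Hypothetical, deliberately generous number of distinct types.
    let types = 1e9;
    for bits in [64, 128, 256] {
        println!("{bits}-bit: p ≈ {:e}", collision_probability(types, bits));
    }
}
```

Under these assumptions the accidental-collision probability at 128 bits is already far below anything that matters in practice, so under this model the extra bits of a 256-bit hash only start to matter once the inputs are assumed to be adversarial.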
Perhaps it would be better if, on the language level, TypeId were specified as a globally unique value with exact comparison, without mandating a particular implementation. That would leave the probabilistic reasoning, choice of comparison method etc. to the compiler. It could then decide to switch to a stronger implementation when it makes sense, along with other hardening efforts.
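As a small illustration of what such a language-level contract looks like from the user's side, the sketch below relies only on "equal TypeId if and only if same type" and never on how the value is computed. The helper `same_type` is a hypothetical name introduced here; `TypeId::of` and `downcast_ref` on `dyn Any` are the existing std APIs.

```rust
use std::any::{Any, TypeId};

/// Hypothetical helper: true iff `A` and `B` are the same type,
/// judged purely by TypeId equality.
fn same_type<A: 'static, B: 'static>() -> bool {
    TypeId::of::<A>() == TypeId::of::<B>()
}

fn main() {
    assert!(same_type::<u32, u32>());
    assert!(!same_type::<u32, i32>());
    assert!(!same_type::<Vec<u8>, Vec<u16>>());

    // `downcast_ref` is built on the same comparison, so its soundness
    // depends on distinct types never sharing a TypeId.
    let value: Box<dyn Any> = Box::new(42u32);
    assert!(value.downcast_ref::<u32>().is_some());
    assert!(value.downcast_ref::<i32>().is_none());
}
```

Nothing in this code changes if the compiler swaps a 128-bit hash for a wider one or for some exact comparison scheme, which is the point of keeping the implementation choice out of the language-level definition.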