safety-dance
safety-dance copied to clipboard
Audit url
cargo-geiger
reports numerous usages of unsafe, though only 6 seem to be local to the url
crate itself.
Example unsafe usage in decode_utf8_lossy
A majority of the remaining unsafe usages consist of, e.g. unicode-normalization and smallvec (#9) Absolutely no idea if any of these unsafe cases are appropriate or not.
The url crate seems to be widely used enough to check, at least.
Just looked at the example unsafe block you linked. I think it is safe, and potentially appropriate, but I believe there is a way it can be re-written in safe code. I'll review the rest tomorrow and submit a PR if needed.
Submitted the PR - https://github.com/servo/rust-url/pull/560
The comments look very nice and clear to me. Thanks for doing this!
I think one more unsafe block could be trivially rewritten as safe code, I've left a comment on the PR.
This leaves only two iterator implementations that return Some(&str)
and I think those can be converted to returning Some(char)
instead to avoid the unsafe conversions, with something like this:
mystring.chars().flat_map(|c| if should_percent_encode(c) {percent_encode(c).chars()} else {c})
@Shnatsel thanks for the feedback. I've removed the unsafe block as you suggested.
This leaves only two iterator implementations that return Some(&str) and I think those can be converted to returning Some(char) instead to avoid the unsafe conversions, with something like this..
If I'm understanding correctly, these would be changes which change the public interface?
Those structs seem to be private, so I think that wouldn't be a breaking change.
https://github.com/servo/rust-url/blob/master/src/form_urlencoded.rs#L125
This is one example of an iterator with an unsafe block. But ByteSerialize
is marked as public in the form_urlencoded
module, then src/lib.rs
has pub mod form_urlencoded
. I believe the other case of this is similar. Are you referring to some other structs? Or am I misinterpreting the Rust public/private rules?
Nah, it's probably me who misinterpreted the rules. Those structs indeed seem to be public, so changing the return type would indeed be a breaking change. Apologies for the confusion.
In addition, I figured we might want to check idna
and percent_encoding
as well, but wasn't sure how those should be handled.
i.e., I'm assuming we're doing a PR per project (e.g. rust-url/url
, rust-url/idna
), vs a PR per repo (e.g. rust-url/*
)?
PRs are easier per project. Tracking issues... dunno, per repo probably, because that's what I'd grep for unsafe
For what its worth, my PR covers the full rust-url repo (but not dependencies).
For what its worth, my PR covers the full rust-url repo (but not dependencies).
Yup. Just looked through it a minute ago and realized it was comprehensive.
Regardless, I'll keep this in mind moving forward, and include checklists for bigger projects in case we need to break things up.
Thanks @Shnatsel and @JoshMcguigan!