Possible to cause a panic with gsub
use lua_patterns::LuaPattern;
fn main() {
let mut m = LuaPattern::from_bytes(&[160]);
let s = "⚠";
s.bytes().for_each(|b| println!("{}" , b));
m.gsub(s , "");
}
The ⚠ is made up of the bytes [226 , 154 , 160] removing the 160 causes the string to become invalid unicode
Thanks for the report - yes, this kind of thing is always possible with arbitrary byte patterns. Not sure what the policy here should be.
I think we need gsub_bytes - there's already gsub_bytes_with.
Although what the format of the second replacement argument is - not yet sure!
I think that any function that handles strings should just be a wrapper around the one which handles bytes but then calls vec.into(). such as
gsub(&mut self , text: &str , repl: &str) -> String{
self.gsub_bytes(text.as_bytes() , repl.as_bytes).into()
}
this is a zero copy operation and only needs to scan for invalid bytes
also does the functions passed to gsub_*_with really need to return an owned type could they not return a &[u8] or &str
also does the functions passed to gsub_*_with really need to return an owned type could they not return a &[u8] or &str
We can do what the regex crate does, and return an appropriate Cow to optimize the case where no substitution took place.
I think that any function that handles strings should just be a wrapper around the one which handles bytes but then calls
vec.into(). such as
This is pretty much how gsub is implemented now - only really have to do the bytes case!
There is also gsub_checked which returns a Result, to avoid the panicking problem.