
Best way to generate float values in range

Open killercup opened this issue 3 years ago • 7 comments

Hi folks, in https://github.com/technocreatives/dbc-codegen/issues/12 we're generating implementations of Arbitrary for structs where we know some values have to be in a specific range. For ints, this is not a problem as we just use Unstructured::int_in_range, but I'm unsure how to proceed for floats. (We currently just go with always setting the float field to the lower bound.)

My initial idea was to keep generating floats until we get one in our range, but that seems awfully slow. Is there a good strategy for generating floats that are in a specific range? Anything I should look up? Everything I can find assumes an existing random number library is in place… which I'm hesitant to add just for this.

killercup avatar Mar 20 '21 11:03 killercup

I imagine the most straightforward way would probably be to generate unsigned integers and use one of the distributions to make them into a float (maybe taken from the rand crate?)
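A minimal sketch of that idea, assuming the raw integer comes from something like `u.arbitrary::<u64>()?`. The function name `float_in_range_scaled` is hypothetical and not part of the `arbitrary` crate; the top-53-bits conversion is similar in spirit to what the rand crate does for `f64`, but this is not copied from it. It assumes finite bounds whose difference does not overflow.

```rust
/// Map 64 raw bits to an `f64` in `[min, max]` by scaling.
/// Hypothetical helper: `bits` would come from `u.arbitrary::<u64>()?`.
fn float_in_range_scaled(bits: u64, min: f64, max: f64) -> f64 {
    // Assumption: finite bounds, and `max - min` does not overflow to infinity.
    assert!(min.is_finite() && max.is_finite() && min <= max);
    assert!((max - min).is_finite());
    // Keep the top 53 bits (the precision of an f64) and scale them to [0, 1).
    let unit = (bits >> 11) as f64 * (1.0 / (1u64 << 53) as f64);
    // Linear interpolation; the clamp guards against rounding past a bound.
    (min + unit * (max - min)).clamp(min, max)
}
```

The results are evenly spaced across the range, so many representable floats inside the range can never be produced.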

nagisa avatar Mar 20 '21 12:03 nagisa

I think it would be nice to have Unstructured::float_in_range as a sibling to Unstructured::int_in_range but I am not super confident in my ability to implement this function without bias. Or -- if I am being more forgiving of myself :) -- I would have to spend a bunch of time learning more about how to do this well, and I don't have those cycles right now :)

But I agree that this functionality is desirable!

fitzgen avatar Mar 22 '21 21:03 fitzgen

In the meantime, you can do something like this

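let x: f64 = u.arbitrary()?;
// Clamp `x` to `MY_MIN..=MY_MAX`. `f64` is not `Ord`, so use the inherent
// `f64::min`/`f64::max` methods; they also map a NaN `x` onto a bound.
let x = x.min(MY_MAX);
let x = x.max(MY_MIN);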

and ignore bias for now, instead hoping that libfuzzer will figure things out alright via its coverage feedback.

fitzgen avatar Mar 22 '21 21:03 fitzgen

Thanks for the replies, Nick! I might get back to this -- right now though I've implemented it by getting a random int and scaling it. This surely has a bias but it was done in 10min and suffices for my usage for now :)

killercup avatar Mar 23 '21 18:03 killercup

What does 'unbiased' mean in this case? That all floating point bit patterns between the min and max are equally likely? Or something like @killercup's (and rand's) approach, where the distance between each of the possible values is the same but some bit patterns are not covered?

jrmuizel avatar Apr 07 '21 01:04 jrmuizel

To generate all the float bit patterns you could just to_bits() the bounds, generate random ints between those integer bounds and then from_bits() back to float.
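A rough sketch of this bit-pattern approach. Using `to_bits()` on the bounds directly only preserves ordering when both bounds are non-negative, so the sketch adds an order-preserving key mapping (an assumption beyond what the comment describes). The names `to_ordered`, `from_ordered` and `float_in_range_bits` are hypothetical, and the modulo stands in for `u.int_in_range(lo..=hi)`.

```rust
/// Map an f64 to a u64 key whose integer order matches the float order.
fn to_ordered(x: f64) -> u64 {
    let b = x.to_bits();
    // Negative floats: flip all bits. Non-negative floats: set the top bit.
    if b >> 63 == 1 { !b } else { b | (1 << 63) }
}

/// Inverse of `to_ordered`.
fn from_ordered(k: u64) -> f64 {
    let b = if k >> 63 == 1 { k & !(1 << 63) } else { !k };
    f64::from_bits(b)
}

/// Pick one of the representable floats in `[min, max]`, each bit pattern
/// (roughly) equally likely. `raw` would come from `u.arbitrary::<u64>()?`.
fn float_in_range_bits(raw: u64, min: f64, max: f64) -> f64 {
    assert!(min.is_finite() && max.is_finite());
    let (lo, hi) = (to_ordered(min), to_ordered(max));
    assert!(lo <= hi);
    // Stand-in for `u.int_in_range(lo..=hi)`; the modulo has a slight bias.
    let offset = raw % (hi - lo + 1);
    from_ordered(lo + offset)
}
```

Because floats are much denser near zero, a range that includes zero yields mostly tiny magnitudes under this scheme, which is the trade-off against evenly spaced values.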

jrmuizel avatar Apr 07 '21 02:04 jrmuizel

> What does 'unbiased' mean in this case? That all floating point bit patterns between the min and max are equally likely? Or something like @killercup's (and rand's) approach, where the distance between each of the possible values is the same but some bit patterns are not covered?

This is a good question, and I'm not sure which we should aim for!

fitzgen avatar Apr 08 '21 00:04 fitzgen