herbie
herbie copied to clipboard
Better Cost Estimates & Free Optimizations
I was looking at the output of herbie for the example listed on the website, sqrt(x+1) - sqrt(x)
, and for the input range -1000 to +1000, the website test page output (x + (1 - x)) / (sqrt(x+1) + sqrt(x)
Herbie chose not to pursue the free optimization of (x + (1 - x)
to 1 because the input range was too small, which does seem reasonable for a trial based algorithm. However herbie also lists alternatives, which sacrifice accuracy for speed. I do feel that herbie should have pursued optimizing the numerator to 1 in the alternatives.
Also, I think the cost numbers for expressions are a bit off. Looking at the code in cost.rkt, it says that mul and add are a cost of 1, div is a cost of 1, and transcendentals are a cost of 100. From my experience and consulting the intel intrinsics guide (check entries for mul_sd, sqrt_sd, and div_sd), I think that a more reasonable estimate would be to have div's cost increased from 1 to 8 and transcendentals like sqrt and sin decreased from 100 to 15.
These estimates aren't strictly tied x86 in particular: floating point division is in general significantly more expensive than multiplication. On many platforms, and fairly commonly on gpus, x / y is approximated as x * rcp(y) for speed
Thanks for the issue report. There are basically two issues here, let me address them separately.
The (x + (1 - x))
: This is probably a bug. I'm not sure why Herbie doesn't pursue the simplification here but it could be due to e-graph soundness bugs that we've been working on a comprehensive solution for.
Cost estimates: Yeah, the cost estimate is pretty bad. I wouldn't mind tweaking them, and your suggestions look good. (I'm not sure about sin, because the intrinsics aren't the whole story there, but decreasing sqrt seems reasonable. Bumping division is also plausible.) I'd like to have a process for maintaining or evaluating these over time—I'll find a time to discuss this in our next team meeting sometime in the new year, but it sounds like something @bksaiki might have thoughts on.
I tried a whole lot of seeds and the x + (1 - x)
problem hasn't re-appeared. The better cost estimates are now available via the "beta" Platforms feature, so I'm going to close this PR.