plot/plotter: NewHist(binPoints) panic when a small negative number is used
What are you trying to do?
I am trying to analyze the distribution of a series of numbers by plotter.NewHist
What did you do?
Here is a minimal example of the code I used:
func TestA(t *testing.T) {
data := []float64{203.4, -9.223372036854776e+17, 220.4}
_, err := plotter.NewHist(plotter.Values(data), 10)
if err != nil {
fmt.Println(err)
}
}
What did you expect to happen?
some bin are returned and there is no error or panic
What actually happened?
funcion binPoints panics with following message panic: 203.4, xmin=-9.223372036854776e+17, xmax=220.4, w=9.223372036854779e+16, bin=10, n=10
What version of Go and Gonum/plot are you using?
Go 1.23 gonum.org/v1/plot v0.15.0
Also reproduced in Go 1.20 gonum.org/v1/plot v0.0.0-20190515093506-e2840ee46a6b
Annotation
When debugging, I find the bin is set to 10 wrongly. I doubt it is the computing accuracy problem for float64. Maybe if x == xmax (line 213) should be changed to if bin >=n?
This is working as intended; bins cannot hold negative values. You will need to sanitize the data you provide to the call.
This is working as intended; bins cannot hold negative values. You will need to sanitize the data you provide to the call.
Sorry for my imprecise description. The fact is I observed a continuous and significant number of panics in production. At the current stage, I suspect bin := int((x - xmin) / w) may cause floating-point calculation error even for positive data. And if x == xmax (line 213) cannot make bin < n in all cases. I will try to collect more logs and provide more details in the future.
On another note, could you please list relevant document or comment for the fact that bins cannot hold negative values? I have read https://pkg.go.dev/gonum.org/v1/plot/plotter?utm_source=godoc#NewHistogram but it seems there is no limitation description for negative values. Thanks!
The documentation for NewHistogram says "Each y value is assumed to be the frequency count for the corresponding x." It is not meaningful for a value to be negative here.
Sorry, I've taken a closer look and essentially you've hit a limit of float64 precision. I'll see if there is anything we can do about it.
The issue is that (x-xmin)/w becomes n for your range. The bin calculation depends on all values lower than the max to be some epsilon smaller than the max, but this does not happen here. We could alter the max bin allocation logic at L213-L215, but that would be brittle. The reality here is that you are expecting too much of floats and should try to live within the 1e-16—1e16 range.
The issue is that
(x-xmin)/wbecomesnfor your range. The bin calculation depends on all values lower than the max to be some epsilon smaller than the max, but this does not happen here. We could alter the max bin allocation logic at L213-L215, but that would be brittle. The reality here is that you are expecting too much of floats and should try to live within the 1e-16—1e16 range.
Thanks for your reply! I fully agree it is due to the precision limits of float64. Staying within 1e-16—1e16 is rational and it works in more cases. I think it might be better if there is any description or tip about it in documentation like NaN, especially if there are no plans to change it.
I think the best we can do is improve the information in the panic. Fully describing the limitations of IEEE 754 is out of scope.