plot
plot copied to clipboard
Replace plotter.XYs with slices
In my experience, most data exists as x, y []float64, and not as []struct{X,Y float64}. plotter should replace these types
See discussion https://groups.google.com/d/msg/gonum-dev/FrGpgA-pZQs/FCy8op2JCwAJ
Now if we had tables...
wrt tables, as far as gonum/plot
is concerned, everything is a float64
.
so, perhaps we could get away with a float64
-based ndim array?
but then, is it really the best option to couple gonum/plot
with that specific implementation?
should we give @Kunde21's numgo.Array64
a go?
relying on []float64
seems simpler. row-wise data isn't very cache friendly anyways...
That was not a serious suggestion.
Why not just add the code:
type XYArray struct {
X, Y []float64
}
func (xy XYArray) Len() int {
return len(xy.X)
}
func (xy XYArray) XY(i int) (x,y float64) {
return xy.X[i], xy.Y[i]
}
Then one could just do
vals := XYArray{X: x, Y: y}
and directly add vals to a plotter.
https://godoc.org/github.com/btracey/myplot#VecXY :)
Still, I think the slice representation is more natural for gonum/plot. I think it's more common for users of the code, and it's easier to add in more information (say, z data). Frequently the data does need to exist as a slice even internal to plot, see for example the code for NewErrorPoints
So is the question, then, whether to replace the XYs type with VecXY or XYArray, or whether to replace the XYer interface with a concrete type in function arguments? i.e.
NewScatter(xys XYer)
would become
NewScatter(x, y []float64)
To me, the first option seems like a good idea. I see a tradeoff with the second option, where making the change would make things easier for a lot use cases (i.e., where the data to be plotted is already in x and y arrays), but it would make things more difficult for some use cases (e.g., plotting geographic locations, which typically are in the form struct{X, Y float64}).
I would lean toward keeping the current signature for the function arguments but replacing XYs with VecXY or XYVector as that seems to be most in line with the 'do less, enable more' philosophy, and allows the user to make decisions about their data format rather than having the decisions made for them by the plotting package.
I think your suggestion is reasonable, but just a bit of devil's advocate: Is it actually true that geographic data usually come as struct{X, Y float64})? Doesn't it usually come as struct{Lat, Long, Altitude, {other data} float64})? It doesn't affect your argument, but it does mean that some kind of type munging is always necessary for that kind of data.
Yes, although I think that supports my point rather than refutes it. Having an interface function argument means that the user can just implement the XYer interface for whatever complicated data type they have (which, as shown above, takes 6 lines of code), rather have having to mung. So, for example:
type Point {
Lat,Lon,Altitude, other, variables float64
}
type Points []Point
func (p Points) Len() int {
return len(p)
}
func (p Points) XY(i int) (x,y float64) {
return p[i].Lon, p[i].Lat
}
Yes, I agree it supports your point.
Maybe I'm confused, because I don't understand the problem. Doesn't everything take the XYer interface? You can implement that however you like. That's why I made it an interface.
On Wed, Jun 1, 2016, 12:24 Brendan Tracey [email protected] wrote:
Yes, I agree it supports your point.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/gonum/plot/issues/284#issuecomment-223047609, or mute the thread https://github.com/notifications/unsubscribe/AAOXB2z8efGhNToSNjq8M8VeklCO56Fqks5qHbJZgaJpZM4Iq_th .
I think this ticket should be about providing an easy way to create XYer
from X and Y.
This can be done by adding XYArray
type as proposed by @ctessum (the name requires some thought though).
Or via function:
func ZipXY(x []float64, y []float64) XYer
Adding some convenience functions like:
NewScatterXY(x []float64, y []float64) (*Scatter, error)
might be good idea too.
@kortschak wrt tables and what not: I was browsing the interwebs the other day and stumbled upon @aclements' playground for providing a gg-inspired toolkit for Go: https://godoc.org/github.com/aclements/go-gg
especially: https://godoc.org/github.com/aclements/go-gg/table#Table
It's an interesting read :)
That is something that looks like what we would be likely to implement as part of a data frame package, but not what I was talking about here.
I'm really not sure this is worth it. The code for it is just
type zip struct { x, y []float64 }
func (z zip) Len() int { return len(z.x) }
func (z zip) XY(i int) (x, y float64) { return z.x[i], z.y[i] }
This is less than implementing sort.Interface.
you could flip it around: "sort"
provides implementations for sort.Strings
, sort.Float64s
and sort.Ints
and it's in the stdlib
(so the bar for inclusion is higher.)
adding just one exported function (func ZipXY(x, y []float64) XYer
) is IMHO hitting the sweet spot.
I have to agree with @sbinet about the usefulness of a ZipXY(x,y []float64) XYer)
function. A portion of the users will have existing slices of structs that can easily implement XYer, but there's a lot of data that starts in the form of one X vector and many Y's. That ends up requiring a zip-style function in each and every project that doesn't have a clean slice of structs which can implement XYer directly. Yes, it's a helper function, but it's not even close to an esoteric use.
Edit: By comparison sort only needs an interface implementations, whereas a set of Y slices requires a new constructor or closure before implementing the XYer interface.
What does the function give you? If the type is exported you get exactly the same functionality with a smaller API surface, unless there is a proposal that ZipXY
panics when len(x) != len(y)
, which the implementation may also do (though later)* and can be made to explicitly via the Len method.
* This is another issue I'd like to address again - that all data are copied by plot.
The copying was a conscious decision. If we don't copy the data, then it can change out from under the plotter — safer to copy it. Is copying really causing a problem for anyone? Either the plotter does not copy and stores only a summary of the data (boxplot), or the plotter copies the data and plots all of it (lines, scatter, etc). In the first case there is no copy, so it's not a problem. In the second case, if you are passing so much data that you just can't copy it, then you will certainly will have an issue when you go to plot that many points.
Yeah; we've had this discussion in the past. All our (gonum) data structures allow things to change underfoot unless they are created by the type/func. Because of this I think it's something worth looking at again. My interest is not size/allocations (which is our drive elsewhere) but flexibility.
As a new user of plot and plotter, I would definitely not want to see this change. The current XYer approach is elegant and easy to use with a myriad of data types. I am plotting a lot of things, and in all cases I have a slice or set of much more complicated data structures form which I generate an 'X' and 'Y' for plot. Data definitely does not exist as X, Y []float64. I could construct that, but it would be much less user friendly that the current XYer interfaces. As it is with XYer, I am even using cases with different interfaces pointing the same underlying data structures with different XY methods returning different 'X' and 'Y' values for different plots based on the same underlying data structures. If there is anything new, try to keep the XYer approach alongside it.
I don't think anyone here is suggesting to remove the XYer interface approach.
A ZipXY(x,y []float64) XYer)
would definitely be appreciated, that's actually something that I was just searching for as my Data tends to be in slices of X /y data (one slice for each)
Some convenience functions wouldn't hurt anyone, especially ones you would need to write again and again (or write an own package, which defeats the point of using a plotting package in the first place)