Plotting
Plotting functionalities for exploratory data analysis
Basically a wrapper around JFreeChart, with methods that make it easy to create renderable datasets from GeoTools objects.
Hey Victor,
So I got a chance to look over this in detail. In general I think it's great. I have thought of some ways to reorganize things, though, and I am interested in what you think.
First thing is a slight reorganization of the plotting functions and modules. What I am thinking is this.
plot.bar.xy
plot.bar.interval
plot.bar.category
plot.regression.linear
plot.regression.power
plot.scatterplot
plot.pie
plot.box
Second would be to try and make the inputs to the functions as consistent as possible. Some plotting functions take a list of (x,y) tuples (like bar.xy, for example), while others take separate x/y parameters (like scatterplot). I think it would make sense to make this as uniform as possible. I personally like just a single input argument, mostly because it's easy in Python to zip up separate lists into a single list of tuples.
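To illustrate the zipping point, here is a minimal sketch in plain Python. The sample values and variable names are made up for illustration, and no geoscript functions are called:

```python
# Hypothetical sample data, only for illustration.
xs = [1, 2, 3]
ys = [10.0, 20.0, 30.0]

# Combine separate x/y lists into a single list of (x, y) tuples,
# the shape a tuple-based plotting function would accept.
points = list(zip(xs, ys))

# The reverse is just as short: split the tuples back into
# separate x and y sequences.
xs_again, ys_again = zip(*points)
```

So callers holding separate lists would only pay one extra `zip` call if every plotting function took a single list of tuples.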
And finally would be a reorganization of the functions module. We already have some methods on Layer for generating histograms, extrema, etc. I was thinking (and am actually going to post to the geoscript list about this) about creating a separate Stats class to encapsulate all this, moving those existing functions from Layer to Stats. This is what I had in mind:
l = Shapefile(...)
s = l.stats()
s.summary()
s.histogram()
...
Stats could be created with an implicit filter as well.
s = l.stats('MALE_POP > FEMALE_POP')
Then every function on stats would take that into account without having to explicitly supply a filter. However, we could allow for an explicit filter that would be joined with the implicit one (via AND). So something like:
s = l.stats('MALE_POP > FEMALE_POP')
s.summary('MALE_POP > 0.5')
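A rough sketch of how the AND-joining could work, just to make the idea concrete. The `Stats` class, its constructor arguments and the `_combine` helper are all hypothetical here, not the existing geoscript API, and the filters are treated as plain strings:

```python
class Stats(object):
    """Hypothetical sketch of the proposed Stats object (not real geoscript API)."""

    def __init__(self, layer, filter=None):
        self.layer = layer
        self.filter = filter  # implicit filter, kept as a plain string

    def _combine(self, explicit=None):
        # Join the implicit and explicit filters with AND, skipping
        # whichever of the two was not supplied.
        parts = [f for f in (self.filter, explicit) if f]
        if not parts:
            return None
        return ' AND '.join('(%s)' % f for f in parts)

    def summary(self, filter=None):
        # A real implementation would compute statistics over the
        # features matching the combined filter; this sketch only
        # returns the filter string that would be applied.
        return self._combine(filter)
```

With this, `Stats(l, 'MALE_POP > FEMALE_POP').summary('MALE_POP > 0.5')` would work against the filter `(MALE_POP > FEMALE_POP) AND (MALE_POP > 0.5)`.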
What do you think?
It all sounds good to me.
The different input styles were just to make it easier to adapt to the outputs of other functions, but I guess it is better to work on that a bit more and have something more homogeneous.
And I clearly support moving stats out of the layer classes. It makes much more sense IMHO, plus those functions can be used for other types of data, so you can pass a raster layer instead of a vector one, or any other thing that can be used to get a histogram (or whatever statistical function it is).
Should I start working on that? It doesn't look like a lot of work to me; all the changes should be easy to make.
Cool. I actually started playing around with the merge and some of these ideas on a separate branch, if it makes sense to start from there. What I currently have is here:
https://github.com/jdeolive/geoscript-py/tree/plots
If you wanted to pick it up from there that would be great.