StatsPlots.jl

Move `@df` to separate package?

Open • ssfrr opened this issue 5 years ago • 12 comments

I often use the `@df` macro but rarely use any of the fancy plot types. What would you think about splitting it out into its own package that StatsPlots can re-export? Every time I pull in StatsPlots and its dependencies and then wait for it to precompile, I wish I could just get `@df` by itself. In my mind it's pretty orthogonal to plotting: I use it any time I want to conveniently refer to columns of my data and treat them like Vectors.

Feb 06 '20 19:02 ssfrr
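
To make the request concrete, here is a minimal sketch of what a plotting-free `@df`-style macro could look like. The names `@cols` and `replace_syms` are hypothetical and not part of StatsPlots, and a NamedTuple of vectors stands in for a table. The sketch does no automatic labelling and simply errors if a quoted symbol is not a column.

```julia
# Hypothetical sketch, not the StatsPlots implementation.
# Walk an expression and turn every quoted symbol (e.g. `:a`) into a column lookup.
replace_syms(x, tbl) = x  # literals, line numbers, plain identifiers: leave untouched
replace_syms(q::QuoteNode, tbl) = q.value isa Symbol ? :(getproperty($tbl, $q)) : q
replace_syms(ex::Expr, tbl) = Expr(ex.head, map(a -> replace_syms(a, tbl), ex.args)...)

macro cols(tbl, ex)
    t = gensym(:tbl)  # unique name, so the table expression is evaluated only once
    esc(quote
        let $t = $tbl
            $(replace_syms(ex, t))
        end
    end)
end

# A NamedTuple of vectors used as a stand-in table (an assumption for this example).
tbl = (a = [1, 2, 3], b = [4, 5, 6])

total = @cols tbl sum(:a .* :b)   # expands to sum(tbl.a .* tbl.b)
```

Because nothing here depends on a plotting package, the macro works with any function call, which is the use case described above.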

Fine with me. @piever ?

Feb 06 '20 19:02 mkborregaard

I also think it does not really belong here, but I am not sure what would be the correct package.

Feb 06 '20 20:02 piever

Can't it be a standalone package?

Feb 06 '20 20:02 daschw

I think that would be odd, especially because there is already `DataFramesMeta.@with`, which does something very similar (but for DataFrames rather than arbitrary tables). Maybe it'd make sense to put it in some TablesMeta package. The only thing I'm afraid of is that it also has some code for automatic labelling that does not really make sense in general.

Feb 06 '20 20:02 piever

So the take-home is that if @ssfrr can come up with a good design we wouldn't be opposed in principle, but it isn't easy to see what that good design would be?

Feb 06 '20 20:02 mkborregaard

> The only thing I'm afraid of is that it also has some code for automatic labelling that does not really make sense in general.

Ah, I didn't know there was plotting-specific stuff in there. I probably can't set aside much time in the near future to see if I can extract @df into something that makes sense standalone, but it's good to know you're relatively open to it, so I'll look into it when I can.

Feb 06 '20 21:02 ssfrr

I think it should work with arbitrary functions, but there is a try-catch in there, which will kill performance in tight loops: https://github.com/JuliaPlots/StatsPlots.jl/blob/dd80e7627daf00ffd227ed0cd026f076a4c6bdcd/src/df.jl#L151-L167

Feb 18 '20 03:02 asinghvi17
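
As an illustration of the performance concern, here is a small sketch contrasting an exception-based column lookup with an explicit membership check. It assumes the try-catch acts as a fallback for symbols that do not name a column; the linked lines are not reproduced here, so that is a guess. The helper names `lookup_trycatch` and `lookup_checked` are hypothetical, and a NamedTuple of vectors stands in for a table.

```julia
# Hypothetical helpers, not the StatsPlots implementation.

# Exception-based fallback: return the column if `s` names one, otherwise the symbol itself.
# Throwing and catching an exception on every miss is costly inside a hot loop.
function lookup_trycatch(tbl::NamedTuple, s::Symbol)
    try
        return getproperty(tbl, s)
    catch
        return s
    end
end

# Same behaviour without exceptions: check for the column before accessing it.
lookup_checked(tbl::NamedTuple, s::Symbol) = haskey(tbl, s) ? tbl[s] : s

tbl = (a = [1, 2, 3], b = [4.0, 5.0, 6.0])

hit_try,  hit_chk  = lookup_trycatch(tbl, :a),   lookup_checked(tbl, :a)
miss_try, miss_chk = lookup_trycatch(tbl, :red), lookup_checked(tbl, :red)
```

Both helpers return the same values; only the second avoids the exception machinery on a miss.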

Here's a radical (?) idea - why not upstream @df and some other stuff (like boxplots) to Plots itself? The DataFrames support was separated from Plots back when DataFrames would import half the ecosystem. With the present Tables approach, that no longer seems necessary?

Mar 09 '20 10:03 mkborregaard

I like it. So StatsPlots would only keep the recipes that depend on StatsBase, Distributions, etc.?

Mar 09 '20 11:03 daschw

Yes, that's the idea (though Plots already has a StatsBase dep).

Mar 09 '20 11:03 mkborregaard

> Here's a radical (?) idea - why not upstream @df and some other stuff (like boxplots) to Plots itself?

Is this still a plan everyone likes? Also, I wonder if `@df` is misleading, since it now supports anything that implements the Tables interface. Perhaps `@df` should be deprecated in favor of `@table`?

Jul 08 '21 21:07 sethaxen

I'm still in favour. I like the `df` though, as it's nice to type (which is essentially the whole idea of the macro), and `@table` feels like we're stealing a word that shouldn't belong here.

Jul 08 '21 21:07 mkborregaard