DataFrames.jl
In-memory tabular data in Julia
There are two natural cases in `describe` that are currently hard: 1. get the number of rows; 2. get the number of non-missing rows. @nalimilan - would you have any...
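For reference, the two quantities mentioned above are easy to compute on a single column outside of `describe`. The snippet below is a minimal sketch in plain Base Julia on a toy vector (the data are made up for illustration); it does not show how `describe` itself would expose them.

```julia
# Toy column with missing values; plain Base Julia, no DataFrames needed.
col = [1, missing, 3, missing, 5]

n_rows = length(col)                   # total number of rows
n_nonmissing = count(!ismissing, col)  # rows that hold an actual value

println((n_rows, n_nonmissing))
```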
`filter` currently broadcasts the predicate function over the input column(s) to get a logical index. This means that even if you filter on an `AcceleratedArray` column, the optimized `findall` method...
We could use `findall` instead of the current approach, which would be faster with, e.g., https://github.com/andyferris/AcceleratedArrays.jl
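The difference between the two approaches can be sketched on a plain vector. This is an illustration in Base Julia only, with a made-up column and predicate; it is not the actual `filter` implementation in DataFrames.jl.

```julia
col = [3, 8, 1, 9, 4]
pred = x -> x > 3

# Broadcasting approach: materializes a logical mask with one entry per row,
# then converts the mask to row indices.
mask = pred.(col)
rows_from_mask = findall(mask)

# Direct approach: `findall(pred, col)` receives the column itself, so an
# array type that defines its own `findall` method gets a chance to use it.
rows_direct = findall(pred, col)

println(rows_from_mask == rows_direct)
```

For an ordinary `Vector` both paths return the same indices; the second form only matters when the column type provides a specialized `findall`.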
The implementation would be:
```
maximum(df::AbstractDataFrame, col::ColumnIndex) = df[argmax(df[!, col]), :]
minimum(df::AbstractDataFrame, col::ColumnIndex) = df[argmin(df[!, col]), :]
maximum(gdf::GroupedDataFrame, col::ColumnIndex) = combine(gdf, sdf -> maximum(sdf, col))
minimum(gdf::GroupedDataFrame, col::ColumnIndex) = combine(gdf, sdf -> minimum(sdf,...
```
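The idea of "return the whole row where a column attains its extreme" can be tried out without DataFrames. The sketch below uses a `NamedTuple` of vectors as a stand-in for a data frame; `row_at`, `row_of_max`, and `row_of_min` are hypothetical helper names introduced here for illustration only. Note that the column is looked up with `getproperty(t, col)`, because `t.col` would look for a column literally named `col`.

```julia
# Stand-in for a data frame: a NamedTuple of equal-length column vectors.
tbl = (name = ["a", "b", "c"], score = [2.5, 9.0, 4.0])

# Hypothetical helper: pull row `i` out of the column table as a NamedTuple.
row_at(t, i) = map(c -> c[i], t)

# Hypothetical helpers mirroring the proposed maximum/minimum semantics:
# find the index of the extreme value in one column, then return that row.
row_of_max(t, col::Symbol) = row_at(t, argmax(getproperty(t, col)))
row_of_min(t, col::Symbol) = row_at(t, argmin(getproperty(t, col)))

best = row_of_max(tbl, :score)
worst = row_of_min(tbl, :score)
println(best)
println(worst)
```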
I'm wrapping up https://github.com/JuliaParallel/Dagger.jl/pull/344 and I managed to narrow down the list of things I need to import from DataFrames to get the DataFrames-style `select` somewhat working. Here's the list:...
Keep the default to a single thread until we find a reliable way of predicting a reasonably optimal number of threads.
The infrastructure is ready for this. We just need to decide when it is worth doing, then make and document the changes.
This comes up as a question too often. Given an ordered *N* sequence of, say, DateTimes and data vectors each of *N* data values, DataFrames makes forming a frame of...
This is a speculative idea. Maybe we could define `GroupedDataFrame` to be callable like this:
```
(gdf::GroupedDataFrame)(idxs...) = gdf[idxs]
```
In this way instead of writing:
```
gdf[("val",)]
```
users...
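The callable-object pattern behind this idea can be sketched with a toy type. `ToyGroups` below is a hypothetical stand-in for a grouped data frame (a dictionary from key tuples to row indices), not the real `GroupedDataFrame`; it only shows how the call syntax forwards to tuple indexing.

```julia
# Hypothetical stand-in for a grouped data frame: groups keyed by a tuple.
struct ToyGroups
    groups::Dict{Tuple, Vector{Int}}
end

# Indexing with a key tuple, analogous to gdf[("val",)].
Base.getindex(g::ToyGroups, key::Tuple) = g.groups[key]

# Make instances callable: the call arguments are collected into a tuple
# and forwarded to the indexing method above.
(g::ToyGroups)(idxs...) = g[idxs]

g = ToyGroups(Dict{Tuple, Vector{Int}}(("val",) => [1, 4], ("other",) => [2, 3]))

println(g[("val",)])  # tuple-indexing form
println(g("val"))     # call form, same lookup
```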