DataFrames.jl

In-memory tabular data in Julia

Results 170 DataFrames.jl issues

There are two natural cases in `describe` that are currently hard: 1. get the number of rows; 2. get the number of non-missing rows. @nalimilan - would you have any...

feature

`filter` currently broadcasts the predicate function over the input column(s) to get a logical index. This means that even if you filter on an `AcceleratedArray` column, the optimized `findall` method...

feature

We can use `findall` instead of the current approach, which will be faster with e.g. https://github.com/andyferris/AcceleratedArrays.jl

performance

The implementation would be:
```
maximum(df::AbstractDataFrame, col::ColumnIndex) = df[argmax(df.col), :]
minimum(df::AbstractDataFrame, col::ColumnIndex) = df[argmin(df.col), :]
maximum(gdf::GroupedDataFrame, col::ColumnIndex) = combine(gdf, sdf -> maximum(sdf, col))
minimum(gdf::GroupedDataFrame, col::ColumnIndex) = combine(gdf, sdf -> maximum(sdf,...
```

feature

I'm wrapping up https://github.com/JuliaParallel/Dagger.jl/pull/344 and I managed to narrow down the list of things I need to import from DataFrames to get the DataFrames-style select somewhat working. Here's the list:...

ecosystem

Keep the default to a single thread until we find a reliable way of predicting a reasonably optimal number of threads.

performance
grouping

The infrastructure is ready for this. We just need to decide when it is worth doing, then make and document the changes.

performance

This comes up as a question too often. Given an ordered *N* sequence of, say, DateTimes and data vectors each of *N* data values, DataFrames makes forming a frame of...

question

This is a speculative idea. Maybe we could define `GroupedDataFrame` to be callable like this:
```
(gdf::GroupedDataFrame)(idxs...) = gdf[idxs]
```
In this way, instead of writing:
```
gdf[("val",)]
```
users...

decision