DataFrames.jl issues

add length (or nrow) to describe

2

There are two natural cases in `describe` that are currently hard: 1. get the number of rows; 2. get the number of non-missing rows. @nalimilan - would you have any...

bkamins

feature

filter compatibility with AcceleratedArrays optimized methods

7

`filter` currently broadcasts the predicate function over the input column(s) to get a logical index. This means that even if you filter on an `AcceleratedArray` column, the optimized `findall` method...

nalimilan

feature

Improve filter performance for single column source

1

We can use `findall` instead of the current approach, which will be faster with e.g. https://github.com/andyferris/AcceleratedArrays.jl

bkamins

performance

Consider adding maximum and minimum

3

The implementation would be:
```
maximum(df::AbstractDataFrame, col::ColumnIndex) = df[argmax(df[!, col]), :]
minimum(df::AbstractDataFrame, col::ColumnIndex) = df[argmin(df[!, col]), :]
maximum(gdf::GroupedDataFrame, col::ColumnIndex) = combine(gdf, sdf -> maximum(sdf, col))
minimum(gdf::GroupedDataFrame, col::ColumnIndex) = combine(gdf, sdf -> minimum(sdf,...
```

bkamins

feature

Functions/structs to move out of DataFrames to DataAPI (DataFrames syntax for DTable)

10

I'm wrapping up https://github.com/JuliaParallel/Dagger.jl/pull/344 and I managed to narrow down the list of things I need to import from DataFrames to get the DataFrames-style select somewhat working. Here's the list:...

krynju

ecosystem

Support multithreading in groupreduce

30

Keep the default to a single thread until we find a reliable way of predicting a reasonably optimal number of threads.

nalimilan

performance

grouping

Add multithreading to transformations of AbstractDataFrame

2

The infrastructure is ready for this. We just need to decide when it is worth doing, then make and document the changes.

bkamins

performance

update precompilation for 1.4 release

2

bkamins

ecosystem

time sequences with holes --> no holes in times, missings for data in introduced rows

7

This comes up as a question too often. Given an ordered *N* sequence of, say, DateTimes and data vectors each of *N* data values, DataFrames makes forming a frame of...

JeffreySarnoff

question

Make row lookup easier

37

This is a speculative idea. Maybe we could define `GroupedDataFrame` to be callable like this:
```
(gdf::GroupedDataFrame)(idxs...) = gdf[idxs]
```
In this way instead of writing:
```
gdf[("val",)]
```
users...

bkamins

decision
