
filtering DataFrame loaded from feather file triggers `deleteat!` error

Open markmbaum opened this issue 2 years ago • 0 comments

I'm loading a table from a feather file in a straightforward way

df = filename |> Arrow.Table |> DataFrame

and attempting to filter it based on the values of one column

filter!(row -> row.x < 7, df)

but this throws the following error


ERROR: MethodError: no method matching deleteat!(::Arrow.Primitive{Float32, Vector{Float32}}, ::Vector{Int64})
Closest candidates are:
  deleteat!(::DataValues.DataValueVector, ::Any) at C:\Users\markm\.julia\packages\DataValues\N7oeL\src\array\datavaluevector.jl:168
  deleteat!(::SubDataFrame, ::Any) at C:\Users\markm\.julia\packages\DataFrames\6xBiG\src\subdataframe\subdataframe.jl:293
  deleteat!(::PooledArrays.PooledVector, ::Any) at C:\Users\markm\.julia\packages\PooledArrays\tQueO\src\PooledArrays.jl:627
  ...
Stacktrace:
 [1] _deleteat!_helper(df::DataFrame, drop::Vector{Int64})
   @ DataFrames C:\Users\markm\.julia\packages\DataFrames\6xBiG\src\dataframe\dataframe.jl:1065
 [2] deleteat!(df::DataFrame, inds::Vector{Int64})
   @ DataFrames C:\Users\markm\.julia\packages\DataFrames\6xBiG\src\dataframe\dataframe.jl:1042
 [3] filter!(f::Function, df::DataFrame)
   @ DataFrames C:\Users\markm\.julia\packages\DataFrames\6xBiG\src\abstractdataframe\abstractdataframe.jl:1211
 [4] top-level scope
   @ REPL[6]:1

The problem can be fixed by reallocating the columns as plain `Vector`s

df = mapcols(Vector, df)
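As a minimal, self-contained sketch of the workaround (the temporary file and the sample columns `x` and `y` are made up for illustration and stand in for the real feather file), the following writes a small table, loads it the same way, and then filters it. It also shows the non-mutating `filter`, which builds new columns rather than calling `deleteat!` on the existing ones, so it should not need the reallocation step.

```julia
using Arrow, DataFrames

# Hypothetical sample data standing in for the original feather file.
path = tempname() * ".feather"
Arrow.write(path, (x = Float32[1, 5, 9, 12], y = Int64[10, 20, 30, 40]))

df = path |> Arrow.Table |> DataFrame

# Inspect the column types. With the versions listed in this issue they are
# Arrow array types (e.g. Arrow.Primitive) rather than plain Vectors.
println(typeof.(eachcol(df)))

# Option 1: non-mutating filter. It indexes into the columns and allocates
# new ones, leaving df untouched.
kept = filter(row -> row.x < 7, df)

# Option 2 (the workaround from this issue): copy every column into a
# plain Vector, after which in-place filtering works.
df = mapcols(Vector, df)
filter!(row -> row.x < 7, df)
```

Option 1 avoids the extra copy of rows that are about to be dropped; Option 2 is the better fit when the data frame will be mutated again later.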

But it seems like filtering (and potentially other operations) should work on a data frame after loading from feather without further manipulation. Not a huge deal, and perhaps this issue is better directed to DataFrames.jl, but I thought it was worth pointing out.

Also my versions are

Arrow v2.3.0
DataFrames v1.3.4

markmbaum • Jun 01 '22 15:06