Error when opening files with Dates.Time containing MICROSECONDs

Open MrHenning opened this issue 1 year ago • 0 comments

When opening a table that contains a Time column with MICROSECOND resolution (which I think is the default with Python/pandas), I get an error.

Example:

  1. Create a Python/pandas DataFrame and write it to a Feather (Arrow) file:
    import datetime
    import pandas as pd

    pd.DataFrame(dict(
      i=range(0,10),
      time=[datetime.time(hour=i) for i in range(0,10)]
    )).to_feather('~/python_time_df.arrow')
    
  2. Read the file in Julia:
    using Arrow
    # Julia does not expand "~" in paths by itself, hence expanduser
    tb = Arrow.Table(expanduser("~/python_time_df.arrow"), convert=true)
    tb.time
    
    yields the error
Failed to show value:
MethodError: no method matching Int64(::Arrow.Time{Arrow.Flatbuf.TimeUnits.MICROSECOND, Int64})
Closest candidates are:
(::Type{T})(!Matched::AbstractChar) where T<:Union{Int32, Int64} at char.jl:51
(::Type{T})(!Matched::AbstractChar) where T<:Union{AbstractChar, Number} at char.jl:50
(::Type{T})(!Matched::BigInt) where T<:Union{Int128, Int16, Int32, Int64, Int8} at gmp.jl:359
...

    - Dates.Time(::Arrow.Time{Arrow.Flatbuf.TimeUnits.MICROSECOND, Int64}, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64, ::Dates.AMPM)@types.jl:412
    - fromarrow(::Type{Dates.Time}, ::Arrow.Time{Arrow.Flatbuf.TimeUnits.MICROSECOND, Int64})@ArrowTypes.jl:157
    - fromarrow(::Type{Union{Missing, Dates.Time}}, ::Arrow.Time{Arrow.Flatbuf.TimeUnits.MICROSECOND, Int64})@ArrowTypes.jl:161
    - [email protected]:46[inlined]
    - [email protected]:1274[inlined]
    - [email protected]:1241[inlined]
    - isassigned(::Arrow.Primitive{Union{Missing, Dates.Time}, Vector{Arrow.Time{Arrow.Flatbuf.TimeUnits.MICROSECOND, Int64}}}, ::Int64, ::Int64)@abstractarray.jl:565
    - alignment(::IOContext{IOBuffer}, ::AbstractVecOrMat, ::Vector{Int64}, ::Vector{Int64}, ::Int64, ::Int64, ::Int64, ::Int64)@arrayshow.jl:68
    - _print_matrix(::IOContext{IOBuffer}, ::AbstractVecOrMat, ::String, ::String, ::String, ::String, ::String, ::String, ::Int64, ::Int64, ::UnitRange{Int64}, ::UnitRange{Int64})@arrayshow.jl:207
    - print_matrix(::IOContext{IOBuffer}, ::Arrow.Primitive{Union{Missing, Dates.Time}, Vector{Arrow.Time{Arrow.Flatbuf.TimeUnits.MICROSECOND, Int64}}}, ::String, ::String, ::String, ::String, ::String, ::String, ::Int64, ::Int64)@arrayshow.jl:171
    - [email protected]:358[inlined]
    - show(::IOContext{IOBuffer}, ::MIME{Symbol("text/plain")}, ::Arrow.Primitive{Union{Missing, Dates.Time}, Vector{Arrow.Time{Arrow.Flatbuf.TimeUnits.MICROSECOND, Int64}}})@arrayshow.jl:399
    - show_richest(::IOContext{IOBuffer}, ::Any)@PlutoRunner.jl:1157
    - [email protected]:1095[inlined]
    - format_output_default(::Any, ::Any)@PlutoRunner.jl:995
    - var"#format_output#60"(::IOContext{Base.DevNull}, ::typeof(Main.PlutoRunner.format_output), ::Any)@PlutoRunner.jl:1012
    - formatted_result_of(::Base.UUID, ::Base.UUID, ::Bool, ::Vector{String}, ::Nothing, ::Module)@PlutoRunner.jl:905
    - top-level [email protected]:476

Defining

ArrowTypes.fromarrow(::Type{Dates.Time}, x::Arrow.Time{Arrow.Flatbuf.TimeUnits.MICROSECOND, Int64}) = convert(Dates.Time, x)

seems to fix the error.
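
For context, the Arrow format stores a Time value with MICROSECOND unit as a 64-bit count of microseconds since midnight, so any fix has to turn that integer into hour/minute/second/microsecond fields. Below is a minimal Python sketch of that conversion using only the standard library. The helper name `time_from_microseconds` is hypothetical and not part of Arrow.jl, pandas, or pyarrow, and this is only an illustration of the arithmetic, not how Arrow.jl implements `convert`.

```python
import datetime

MICROSECONDS_PER_DAY = 24 * 60 * 60 * 1_000_000


def time_from_microseconds(us: int) -> datetime.time:
    """Hypothetical helper: microseconds since midnight -> datetime.time."""
    if not 0 <= us < MICROSECONDS_PER_DAY:
        raise ValueError(f"not a valid time of day: {us} microseconds")
    # Peel off the fields from smallest to largest unit.
    seconds, microsecond = divmod(us, 1_000_000)
    minutes, second = divmod(seconds, 60)
    hour, minute = divmod(minutes, 60)
    return datetime.time(hour, minute, second, microsecond)


# Raw values corresponding to the example column: one time per full hour, 0 to 9.
raw = [h * 3_600_000_000 for h in range(10)]
times = [time_from_microseconds(us) for us in raw]
```

Here `times` holds the same ten `datetime.time` values that the pandas example writes to the file.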

MrHenning, Nov 24 '22 09:11