Inconsistent handling of eltype Decimals.Decimal (with silent errors?)
First of all, thank you for the amazing package! I have noticed unexpected behaviour that I wanted to point out.
Expected behaviour: decimal numbers like 1.0 and 0.1 are represented as Float64 and can be saved and loaded again unchanged.
Actual behaviour:
When writing a column with eltype Decimals.Decimal, Arrow.write(filename, df) throws a MethodError (see the stack trace below), whereas Arrow.write(filename, df; compress=:lz4) completes without an error but produces a table that is wrong when read back (see the MWE below).
I had a quick look at the code base and could not see any type checks. Are those left to the user, or expected to surface as MethodErrors?
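To illustrate what a user-side check could look like, here is a minimal sketch. It is not part of Arrow.jl: `nonbits_columns` and its `allowed` keyword are hypothetical names, and the rule it applies (accept only bits types plus an explicit allowlist) is an assumption suggested by the `write(::IO, ::Any)` MethodError, not Arrow's actual notion of a supported type. `BigInt` stands in for a non-bits element type so the example runs without Decimals installed.

```julia
# Hypothetical helper (not an Arrow.jl API): return the names of columns whose
# element type, ignoring Missing, is neither a bits type nor explicitly allowed.
function nonbits_columns(tbl; allowed = (AbstractString,))
    bad = Symbol[]
    for (name, col) in pairs(tbl)
        T = Base.nonmissingtype(eltype(col))
        ok = isbitstype(T) || any(A -> T <: A, allowed)
        ok || push!(bad, name)
    end
    return bad
end

# A NamedTuple of vectors used as a simple column table.
tbl = (
    a = [1.0, 0.1],                       # Float64: bits type
    b = BigInt[2],                        # BigInt: not a bits type
    c = Union{Missing, Int}[1, missing],  # Int after dropping Missing
    d = ["x", "y"],                       # String: allowed explicitly
)

flagged = nonbits_columns(tbl)
println(flagged)
```

For a DataFrame, the same function could be called as `nonbits_columns(pairs(eachcol(df)))`. The check is deliberately conservative and would also flag custom types that have a proper Arrow serialization defined.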
MWE:
using Decimals
using DataFrames, Arrow
df=DataFrame(:a=>[Decimal(2.0)])
# this will fail with error that Decimal cannot be saved
Arrow.write("test.feather", df)
# nested task error: MethodError: no method matching write(::IOBuffer, ::Decimals.Decimal)
# this will succeed
Arrow.write("test.feather", df;compress=:lz4)
# but the loaded dataframe will be rubbish
df2=Arrow.Table("test.feather")|>DataFrame
# 1×1 DataFrame
#  Row │ a
#      │ Float64
# ─────┼─────────────
#    1 │ 2.1509e-314
Error stack trace from Arrow.write() without a keyword argument:
ERROR: TaskFailedException
Stacktrace:
 [1] wait
   @ ./task.jl:345 [inlined]
 [2] close(writer::Arrow.Writer{IOStream})
   @ Arrow ~/.julia/packages/Arrow/ZlMFU/src/write.jl:230
 [3] open(::Arrow.var"#120#121"{DataFrame}, ::Type, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:file,), Tuple{Bool}}})
   @ Base ./io.jl:386
 [4] #write#119
   @ ~/.julia/packages/Arrow/ZlMFU/src/write.jl:57 [inlined]
 [5] write(file_path::String, tbl::DataFrame)
   @ Arrow ~/.julia/packages/Arrow/ZlMFU/src/write.jl:56
 [6] top-level scope
   @ REPL[14]:1

    nested task error: MethodError: no method matching write(::IOBuffer, ::Decimals.Decimal)
    Closest candidates are:
      write(::IO, ::Any) at io.jl:672
      write(::IO, ::Any, ::Any...) at io.jl:673
      write(::Base.GenericIOBuffer, ::UInt8) at iobuffer.jl:442
      ...
    Stacktrace:
     [1] write(io::IOBuffer, x::Decimals.Decimal)
       @ Base ./io.jl:672
     [2] writearray(io::IOStream, #unused#::Type{Decimals.Decimal}, col::Vector{Union{Missing, Decimals.Decimal}})
       @ Arrow ~/.julia/packages/Arrow/ZlMFU/src/utils.jl:50
     [3] writebuffer(io::IOStream, col::Arrow.Primitive{Union{Missing, Decimals.Decimal}, Vector{Union{Missing, Decimals.Decimal}}}, alignment::Int64)
       @ Arrow ~/.julia/packages/Arrow/ZlMFU/src/arraytypes/primitive.jl:102
     [4] write(io::IOStream, msg::Arrow.Message, blocks::Tuple{Vector{Arrow.Block}, Vector{Arrow.Block}}, sch::Base.RefValue{Tables.Schema}, alignment::Int64)
       @ Arrow ~/.julia/packages/Arrow/ZlMFU/src/write.jl:365
     [5] macro expansion
       @ ~/.julia/packages/Arrow/ZlMFU/src/write.jl:149 [inlined]
     [6] (::Arrow.var"#122#124"{IOStream, Int64, Tuple{Vector{Arrow.Block}, Vector{Arrow.Block}}, Base.RefValue{Tables.Schema}, Arrow.OrderedChannel{Arrow.Message}})()
       @ Arrow ./threadingconstructs.jl:258
Package versions:
[69666777] Arrow v2.3.0
[a93c6f00] DataFrames v1.3.4
[194296ae] LibPQ v1.14.0
versioninfo() (but it was the same on 1.7):
Julia Version 1.8.0
Commit 5544a0fab76 (2022-08-17 13:38 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin21.3.0)
  CPU: 8 × Apple M1 Pro
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, apple-m1)
  Threads: 6 on 6 virtual cores