
Serializing `Dict{String,Real}` results in garbage values

Open omus opened this issue 3 years ago • 2 comments

Serializing a Dict that contains both Bool and Float64 values results in Arrow generating garbage values:

julia> d = Dict("is_valid" => true,"probability" => 0.53495216)
Dict{String, Real} with 2 entries:
  "is_valid"    => true
  "probability" => 0.534952

julia> t = Arrow.Table(Arrow.tobuffer((; value=[d])))
Arrow.Table with 1 rows, 1 columns, and schema:
 :value  Dict{String, Float64}

julia> t.value
1-element Arrow.Map{Dict{String, Float64}, Int32, Arrow.Struct{NamedTuple{(:key, :value), Tuple{String, Float64}}, Tuple{Arrow.List{String, Int32, Vector{UInt8}}, Arrow.Primitive{Float64, Vector{Float64}}}}}:
 Dict("is_valid" => -6.6622794774424345e159, "probability" => 3.1e-322)

Note that pre-converting the values to Float64 doesn't result in this behaviour:

julia> d = Dict{String,Float64}("is_valid" => true,"probability" => 0.53495216)
Dict{String, Float64} with 2 entries:
  "is_valid"    => 1.0
  "probability" => 0.534952

julia> t = Arrow.Table(Arrow.tobuffer((; value=[d])))
Arrow.Table with 1 rows, 1 columns, and schema:
 :value  Dict{String, Float64}

julia> t.value
1-element Arrow.Map{Dict{String, Float64}, Int32, Arrow.Struct{NamedTuple{(:key, :value), Tuple{String, Float64}}, Tuple{Arrow.List{String, Int32, Vector{UInt8}}, Arrow.Primitive{Float64, Vector{Float64}}}}}:
 Dict("is_valid" => 1.0, "probability" => 0.53495216)
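For a Dict that already exists with an abstract value type, the same workaround can be applied by converting it before serializing. This is a minimal sketch using only Base Julia; the variable name `converted` is illustrative, and it assumes every value can be converted to Float64 without loss that matters to you:

```julia
# Mixed Bool/Float64 values make Julia infer the abstract value type Real.
d = Dict("is_valid" => true, "probability" => 0.53495216)

# Rebuild the Dict with a concrete value type; each value is passed through
# convert(Float64, v), so `true` becomes 1.0.
converted = Dict{String,Float64}(d)

# `converted` can then be serialized as in the example above, e.g.
# Arrow.tobuffer((; value=[converted]))
```

The conversion throws an error if a value cannot be represented as a Float64, which is arguably preferable to silently writing garbage.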

omus · Aug 12 '21 14:08

Yeah, similar to what we do w/ arrays, we should probably try to enforce the Dict valtype with the concrete_or_concreteunion machinery in ArrowTypes.jl. Or we at least need a check in map.jl that it's a concrete type/union when serializing.
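The kind of check suggested here could look roughly like the following. This is only a sketch of the idea in Base Julia: `is_concrete_or_concrete_union` and `check_dict_valtype` are hypothetical names, not the actual functions in ArrowTypes.jl or map.jl, and the real fix may be implemented differently.

```julia
# Hypothetical helper: true for a concrete type, or for a Union whose
# members are all concrete (e.g. Union{Missing, Float64}).
function is_concrete_or_concrete_union(T::Type)
    if T isa Union
        return all(isconcretetype, Base.uniontypes(T))
    end
    return isconcretetype(T)
end

# Hypothetical guard: reject a Dict whose value type is abstract
# (such as Real) instead of serializing it incorrectly.
function check_dict_valtype(d::AbstractDict)
    V = valtype(d)
    if !is_concrete_or_concrete_union(V)
        throw(ArgumentError("Dict value type $V is not concrete; convert the values to a concrete type first"))
    end
    return d
end
```

With such a guard, a `Dict{String,Real}` would raise an `ArgumentError` at serialization time, while a `Dict{String,Float64}` would pass through unchanged.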

quinnj · Aug 12 '21 22:08

I noticed this has been fixed by the PR above. Will it be included in the release any time soon?

FuZhiyu · Aug 31 '22 02:08

The fix from #305 was included in Arrow.jl v2.3+. I'll close this issue then, unless folks think there is more that should be done in this case.

ericphanson · Nov 28 '22 14:11