JSONTables.jl

Heterogeneous data sometimes leads to wrong column type detection

Open attdona opened this issue 3 years ago • 2 comments

There are combinations of heterogeneous data for which the wrong column types are inferred.

For example:

using JSONTables
using Tables

nonhomogenous = """
[
    {"a": 1, "b": 2, "c": 3},
    {"b": 4, "c": 8, "d": 5}
]
"""

jt = JSONTables.jsontable(nonhomogenous)

This returns:

JSONTables.Table{false, JSON3.Array{JSON3.Object, Base.CodeUnits{UInt8, String}, Vector{UInt64}}}([:a, :b, :c, :d], Dict{Symbol, Type}(:a => Union{Missing, Int64}, :b => Int64, :d => Int64, :c => Int64), JSON3.Object[{
   "a": 1,
   "b": 2,
   "c": 3
}, {
   "b": 4,
   "c": 8,
   "d": 5
}])

The type of column d is detected as Int64 instead of Union{Missing, Int64}, even though the first row has no d. This throws an error when building a column table:

ct = Tables.columntable(jt)
ERROR: MethodError: Cannot `convert` an object of type Missing to an object of type Int64

attdona, Nov 29 '21 17:11

If I understand correctly, the culprit is:

https://github.com/JuliaData/JSONTables.jl/blob/272624bbe7e3594efe96c8278ae4f3a3b4e73f15/src/JSONTables.jl#L65

This fix should work:

types[k] = Union{Missing, missT(typeof(v))}
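To illustrate why the Union with Missing is needed, here is a minimal standalone sketch of the inference logic. It is not the package's actual code: it works on plain Dicts rather than JSON3 objects, `infer_column_types` is a hypothetical name, and the `missT` defined here is only an assumed stand-in for the package helper of the same name. The point is that a column first seen after the first row was necessarily absent from every earlier row, so its type has to admit missing.

```julia
# Assumed stand-in for the package's missT helper: treat Nothing (JSON null) as Missing.
missT(::Type{Nothing}) = Missing
missT(::Type{T}) where {T} = T

# Hypothetical illustration: infer per-column types over rows given as Dicts.
function infer_column_types(rows)
    colnames = Symbol[]
    coltypes = Dict{Symbol, Type}()
    for (i, row) in enumerate(rows)
        # Columns already known: widen the type, or admit missing if absent from this row.
        for nm in colnames
            if haskey(row, nm)
                coltypes[nm] = Union{coltypes[nm], missT(typeof(row[nm]))}
            else
                coltypes[nm] = Union{Missing, coltypes[nm]}
            end
        end
        # Columns first seen in this row.
        for (k, v) in row
            if !haskey(coltypes, k)
                push!(colnames, k)
                T = missT(typeof(v))
                # Every earlier row lacked this column, so it must admit missing.
                coltypes[k] = i == 1 ? T : Union{Missing, T}
            end
        end
    end
    return colnames, coltypes
end

rows = [
    Dict(:a => 1, :b => 2, :c => 3),
    Dict(:b => 4, :c => 8, :d => 5),
]
colnames, coltypes = infer_column_types(rows)
coltypes[:d]  # Union{Missing, Int} here, so a missing first entry can be stored
```

Without the `Union{Missing, T}` branch, `coltypes[:d]` would be a plain integer type, which matches the behaviour reported above.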

attdona, Nov 29 '21 17:11

Yes, I think that's correct; mind making a PR, @attdona?

quinnj, Nov 30 '21 04:11