Tables.jl
Tables.jl copied to clipboard
Is `NamedTuple[(a=1,), (b=1,)]` a table?
Table 1.0 and 0.2 say yes:
julia> Tables.istable(NamedTuple[(a=1,), (b=1,)])
true
But if I understand the interface specification correctly, I think the desired behavior is:
julia> Tables.istable(Vector{NamedTuple})
false
julia> Tables.istable(Vector{NamedTuple{<:Any,Tuple{Int}}})
false
julia> Tables.istable(Vector{NamedTuple{(:a,)}})
true
julia> Tables.istable(Vector{NamedTuple{(:a,),Tuple{Int}}})
true
However, all return true
in Tables 1.0 and 0.2.
A PR to cleanup the const RowTable{T} = AbstractVector{T} where {T <: NamedTuple}
definition would be welcome; I will point out however that Tables.istable
is already not 100% bulletproof in that there are lots of valid tables for which Tables.istable
returns false
. It'd obviously be nice to avoid false positives in code that Tables.jl does control, but I'm just pointing out that people relying on Tables.istable
should hopefully be aware of its qualifications.
I think I understand the design choice to allow false negative of Tables.istable(::Type)
(as you'd want to decide it at run-time sometimes). But isn't it better to treat false positive of Tables.istable(::Type)
as a bug? Is there a case where it's useful to define possibly false positive Tables.istable(::Type)
?
(Though I guess defining the meaning of the correctness of "positive" is a bit hard. Presumably, throwing from iterate(Tables.rows(table))
is OK even when Tables.istable(typeof(table))
is true if table
is used in a "wrong way" (e.g., DB connection is closed). But I think all table interface should work for an object table
if Tables.istable(typeof(table))
and table
is a "valid" object by its own criteria. I think it is kind of like you can create invalid SubArray
by mutating the index array; the live object does not satisfy the array interface even though its type is AbstractArray
.)
It would be nice to have Tables.istable
return false for something like
julia> [(a=1, b=2), (c=3, d=4)]
2-element Vector{NamedTuple{names, Tuple{Int64, Int64}} where names}:
(a = 1, b = 2)
(c = 3, d = 4)
but what about the following?
julia> [(a=1, b=2), (b=3, a=4)]
2-element Vector{NamedTuple{names, Tuple{Int64, Int64}} where names}:
(a = 1, b = 2)
(b = 3, a = 4)
Currently this works as a table (e.g. I can do DataFrame([(a=1, b=2), (b=3, a=4)])
). And both examples have the exact same type...
This is unrelated to DataFrames.jl but how Tables.jl works:
julia> Tables.rowtable([(a=1, b=2), (b=3, a=4)])
2-element Vector{NamedTuple{names, Tuple{Int64, Int64}} where names}:
(a = 1, b = 2)
(b = 3, a = 4)
julia> Tables.columntable([(a=1, b=2), (b=3, a=4)])
(a = [1, 4], b = [2, 3])
hi all is the issue resolved , if not I can work on it :)
Here you have an explanation of the current design https://bkamins.github.io/julialang/2023/09/01/tables.html
However, I am not sure what the design changes should be if we made them (see @quinnj comment above)