Tables.jl
Is `NamedTuple[(a=1,), (b=1,)]` a table?
Tables 1.0 and 0.2 say yes:
julia> Tables.istable(NamedTuple[(a=1,), (b=1,)])
true
But if I understand the interface specification correctly, I think the desired behavior is:
julia> Tables.istable(Vector{NamedTuple})
false
julia> Tables.istable(Vector{NamedTuple{<:Any,Tuple{Int}}})
false
julia> Tables.istable(Vector{NamedTuple{(:a,)}})
true
julia> Tables.istable(Vector{NamedTuple{(:a,),Tuple{Int}}})
true
However, all return true in Tables 1.0 and 0.2.
A PR to clean up the const RowTable{T} = AbstractVector{T} where {T <: NamedTuple} definition would be welcome. I will point out, however, that Tables.istable is already not 100% bulletproof, in that there are lots of valid tables for which Tables.istable returns false. It'd obviously be nice to avoid false positives in code that Tables.jl does control, but people relying on Tables.istable should be aware of its limitations.
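To make the false positives concrete, here is a minimal sketch in plain Julia that does not load Tables.jl. `RowTableLike` is a local stand-in for the alias quoted above, and `has_fixed_names` and `strict_isrowtable` are hypothetical helpers invented for illustration; they are not part of Tables.jl and only show one way a stricter type-level check might behave.

```julia
# Local stand-in for the alias quoted above (an assumption, not imported from Tables.jl).
const RowTableLike = AbstractVector{T} where {T <: NamedTuple}

# Hypothetical helper: true only when the NamedTuple type pins down its field names.
function has_fixed_names(::Type{T}) where {T}
    T <: NamedTuple || return false
    U = Base.unwrap_unionall(T)        # strip `where` clauses to inspect the parameters
    U isa DataType || return false
    return U.parameters[1] isa Tuple   # a TypeVar here means the names are unconstrained
end

# Hypothetical stricter check on a vector type.
strict_isrowtable(::Type{V}) where {V <: AbstractVector} = has_fixed_names(eltype(V))
strict_isrowtable(::Type) = false

candidates = (
    Vector{NamedTuple},
    Vector{NamedTuple{<:Any,Tuple{Int}}},
    Vector{NamedTuple{(:a,)}},
    Vector{NamedTuple{(:a,),Tuple{Int}}},
)

for V in candidates
    # The alias accepts every candidate; the stricter check accepts only fixed names.
    println(V, "  alias: ", V <: RowTableLike, "  strict: ", strict_isrowtable(V))
end
```

Every candidate is a subtype of the alias because the alias only constrains the element type to be some `NamedTuple`, while the stricter check accepts only the last two, matching the desired behavior listed in the issue. It relies on the unexported `Base.unwrap_unionall`, so treat it as a sketch rather than a proposed implementation.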
I think I understand the design choice to allow false negatives from Tables.istable(::Type) (since you sometimes want to decide at run time). But isn't it better to treat a false positive from Tables.istable(::Type) as a bug? Is there a case where it's useful to define a Tables.istable(::Type) that may return false positives?
(Though I guess defining what a correct "positive" means is a bit hard. Presumably, throwing from iterate(Tables.rows(table)) is OK even when Tables.istable(typeof(table)) is true if table is used in a "wrong way" (e.g., the DB connection is closed). But I think the whole table interface should work for an object table if Tables.istable(typeof(table)) is true and table is a "valid" object by its own criteria. It is a bit like how you can create an invalid SubArray by mutating the index array; the live object does not satisfy the array interface even though its type is an AbstractArray.)
It would be nice to have Tables.istable return false for something like
julia> [(a=1, b=2), (c=3, d=4)]
2-element Vector{NamedTuple{names, Tuple{Int64, Int64}} where names}:
(a = 1, b = 2)
(c = 3, d = 4)
but what about the following?
julia> [(a=1, b=2), (b=3, a=4)]
2-element Vector{NamedTuple{names, Tuple{Int64, Int64}} where names}:
(a = 1, b = 2)
(b = 3, a = 4)
Currently this works as a table (e.g. I can do DataFrame([(a=1, b=2), (b=3, a=4)])). And both examples have the exact same type...
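Since the two vectors share a type, any check that tells them apart has to look at the values. The sketch below uses only Base Julia; `same_names` and `same_name_set` are hypothetical helpers made up for this illustration and are not Tables.jl functions.

```julia
rows_disjoint = [(a=1, b=2), (c=3, d=4)]   # different column names per row
rows_permuted = [(a=1, b=2), (b=3, a=4)]   # same names, different order

# Hypothetical helper: do all rows have identical names in identical order?
same_names(rows) = length(unique(keys.(rows))) <= 1

# Hypothetical helper: do all rows have the same set of names, ignoring order?
same_name_set(rows) = length(unique(Set.(keys.(rows)))) <= 1

# The static types cannot tell the two vectors apart.
println(typeof(rows_disjoint) == typeof(rows_permuted))

# A run-time check on the names can.
println(same_names(rows_disjoint), " ", same_name_set(rows_disjoint))
println(same_names(rows_permuted), " ", same_name_set(rows_permuted))
```

Only the order-insensitive check accepts the permuted rows and rejects the disjoint ones, which is the distinction a type-level Tables.istable(::Type) has no way to make.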
This is not specific to DataFrames.jl; it is how Tables.jl works:
julia> Tables.rowtable([(a=1, b=2), (b=3, a=4)])
2-element Vector{NamedTuple{names, Tuple{Int64, Int64}} where names}:
(a = 1, b = 2)
(b = 3, a = 4)
julia> Tables.columntable([(a=1, b=2), (b=3, a=4)])
(a = [1, 4], b = [2, 3])
Hi all, is this issue resolved? If not, I can work on it :)
Here you have an explanation of the current design: https://bkamins.github.io/julialang/2023/09/01/tables.html
However, I am not sure what the design changes should be if we were to make them (see @quinnj's comment above).