AxisIndices.jl
Nested NamedAxisArrays Are Difficult to Read
Hi @Tokazama - love this package for the work going on in NeuriViz!
One issue I have on the user interface side is that the output of a nested NamedAxisArray can be quite difficult to parse. To illustrate what I mean, here is the code I use to create a nested NamedAxisArray:
subject_data = NamedAxisArray(
    [NamedAxisArray(
        [NamedAxisArray(
            [
                DataFrame(eeg_data, copycols=false),
                DataFrame(electrodes_data, copycols=false),
                DataFrame(event_data, copycols=false),
                nosedir,
                times,
                sampling_freq,
            ],
            information = [
                :data,
                :electrodes,
                :events,
                :nosedir,
                :times,
                :sampling_freq,
            ],
        )],
        session = [1],
    )],
    subject = [1],
)
This structure is desirable because accessing information becomes as easy as subject_data[subject = 1][session = 1][information = :electrodes].
Deepest Nested NamedAxisArray
Starting from the furthest nested NamedAxisArray, it is not terribly hard to read:
julia> subject_data[1][1]
6-element NamedDimsArray(AxisArray(::Array{Any,1}
• axes:
information = [:data, :electrodes, :events, :nosedir, :times, :sampling_freq]
))
1
:data 206440×31 DataFrame. Omitted printing of 25 columns
│ Row │ x1 │ x2 │ x3 │ x4 │ x5 │ x6 │
│ │ Float32 │ Float32 │ Float32 │ Float32 │ Float32 │ Float32 │
├────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┤
│ 1 │ -19.9952 │ -46.3864 │ -28.3298 │ 81.1222 │ -67.0026 │ 21.798 │
│ 2 │ -19.5624 │ -42.738 │ -28.3298 │ 80.7722 │ -66.5652 │ 21.7112 │
│ 3 │ -19.1296 │ -39.6108 │ -28.1576 │ 80.3346 │ -66.2153 │ 21.6243 │
│ 4 │ -18.9564 │ -38.5685 │ -27.7271 │ 79.5471 │ -65.9529 │ 21.6243 │
│ 5 │ -19.1296 │ -39.8714 │ -27.3826 │ 78.7595 │ -65.603 │ 21.6243 │
│ 6 │ -19.1296 │ -42.5643 │ -27.2104 │ 78.4969 │ -65.3406 │ 21.4507 │
│ 7 │ -18.9564 │ -45.1703 │ -27.2965 │ 78.4094 │ -64.9907 │ 21.1901 │
⋮
│ 206433 │ 200.038 │ 1.73732 │ 135.708 │ -102.65 │ -18.7187 │ -110.988 │
│ 206434 │ 198.22 │ -3.3009 │ 133.469 │ -102.475 │ -20.2057 │ -111.074 │
│ 206435 │ 196.403 │ -7.81793 │ 131.919 │ -101.862 │ -21.4303 │ -110.467 │
│ 206436 │ 194.498 │ -12.6824 │ 131.402 │ -100.987 │ -22.0426 │ -109.338 │
│ 206437 │ 192.334 │ -18.5024 │ 132.005 │ -100.987 │ -21.7802 │ -108.122 │
│ 206438 │ 189.824 │ -25.3648 │ 132.952 │ -102.387 │ -21.0804 │ -107.861 │
│ 206439 │ 187.314 │ -32.401 │ 133.469 │ -104.313 │ -20.6431 │ -108.903 │
│ 206440 │ 185.496 │ -38.3947 │ 133.297 │ -105.538 │ -20.818 │ -110.467 │
:electrodes 31×4 DataFrame
│ Row │ name │ x │ y │ z │
│ │ String │ Float64 │ Float64 │ Float64 │
├─────┼────────┼─────────┼─────────┼─────────┤
│ 1 │ FP1 │ 0.83 │ 0.27 │ 0.48 │
│ 2 │ FP2 │ 0.83 │ -0.27 │ 0.48 │
│ 3 │ F3 │ 0.5 │ 0.4 │ 0.77 │
│ 4 │ F4 │ 0.5 │ -0.4 │ 0.77 │
│ 5 │ C3 │ 0.0 │ 0.51 │ 0.86 │
│ 6 │ C4 │ 0.0 │ -0.51 │ 0.86 │
│ 7 │ P3 │ -0.5 │ 0.4 │ 0.77 │
⋮
│ 24 │ P4" │ -0.66 │ -0.37 │ 0.65 │
│ 25 │ PZ" │ -0.72 │ -0.0 │ 0.69 │
│ 26 │ OZ │ -0.88 │ -0.0 │ 0.48 │
│ 27 │ I │ -0.97 │ -0.0 │ 0.23 │
│ 28 │ CB1" │ -0.93 │ 0.3 │ 0.23 │
│ 29 │ CB2" │ -0.93 │ -0.3 │ 0.23 │
│ 30 │ CB1 │ -0.79 │ 0.57 │ 0.23 │
│ 31 │ CB2 │ -0.79 │ -0.57 │ 0.23 │
:events 151×8 DataFrame. Omitted printing of 2 columns
│ Row │ onset │ duration │ sample │ trial_type │ response_time │ stim_file │
│ │ Float64 │ String │ String │ String │ String │ String │
├─────┼─────────┼──────────┼────────┼────────────┼───────────────┼────────────┤
│ 1 │ 5.035 │ n/a │ n/a │ stimulus │ 335 │ 105064.jpg │
│ 2 │ 5.37 │ n/a │ n/a │ response │ n/a │ n/a │
│ 3 │ 6.837 │ n/a │ n/a │ stimulus │ n/a │ 38068.jpg │
│ 4 │ 8.651 │ n/a │ n/a │ stimulus │ 289 │ 136095.jpg │
│ 5 │ 8.94 │ n/a │ n/a │ response │ n/a │ n/a │
│ 6 │ 10.801 │ n/a │ n/a │ stimulus │ n/a │ 38014.jpg │
│ 7 │ 12.684 │ n/a │ n/a │ stimulus │ n/a │ 82063.jpg │
⋮
│ 144 │ 193.182 │ n/a │ n/a │ stimulus │ n/a │ 63093.jpg │
│ 145 │ 195.219 │ n/a │ n/a │ stimulus │ n/a │ 307043.jpg │
│ 146 │ 197.224 │ n/a │ n/a │ stimulus │ 482 │ 194061.jpg │
│ 147 │ 197.706 │ n/a │ n/a │ response │ n/a │ n/a │
│ 148 │ 199.145 │ n/a │ n/a │ stimulus │ 325 │ 49069.jpg │
│ 149 │ 199.47 │ n/a │ n/a │ response │ n/a │ n/a │
│ 150 │ 201.014 │ n/a │ n/a │ stimulus │ n/a │ 83070.jpg │
│ 151 │ 203.063 │ n/a │ n/a │ stimulus │ n/a │ 166026.jpg │
:nosedir "+X"
:times 1:206440
:sampling_freq 1000
There are a few things that would be nice to have displayed better. First:
julia> subject_data[1][1]
6-element NamedDimsArray(AxisArray(::Array{Any,1}
• axes:
information = [:data, :electrodes, :events, :nosedir, :times, :sampling_freq]
))
could possibly be better displayed as:
julia> subject_data[1][1]
6-element Axis
• axes:
information = [:data, :electrodes, :events, :nosedir, :times, :sampling_freq]
I think this looks clearer and is easier to read. Second, rather than printing the full value stored under each key, it would be nice to have something like:
...
:events 151×8 DataFrame
:nosedir String
:times UnitRange
:sampling_freq Int
...
This could come with optional verbosity levels (i.e., either show the values stored under each key directly, or show only their types and dimensions).
Third, when printing DataFrames, it might be nice to add a blank line after each data frame for cleaner output.
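As a rough sketch of the second and third suggestions (Base Julia only; `print_entries` and its `verbose` keyword are made-up names for illustration, not part of AxisIndices.jl), the compact mode could lean on `summary`, which reports sizes for arrays and falls back to the type name for everything else:

```julia
# Hypothetical helper, not part of AxisIndices.jl.
# Prints one line per entry: the key, then either a short summary or the full value.
function print_entries(io::IO, labels, items; verbose::Bool = false)
    isempty(labels) && return nothing
    # Pad every key to the width of the longest one so the second column lines up.
    width = maximum(length(repr(l)) for l in labels)
    for (label, item) in zip(labels, items)
        print(io, rpad(repr(label), width), "  ")
        if verbose
            # Full printout, followed by a blank line to separate entries.
            show(io, MIME("text/plain"), item)
            println(io)
        else
            # Size and type for arrays, just the type name otherwise.
            summary(io, item)
        end
        println(io)
    end
    return nothing
end

# Example with stand-in values for three of the entries above.
print_entries(stdout, [:nosedir, :times, :sampling_freq], Any["+X", 1:206440, 1000])
```

Calling it with `verbose = true` prints the values themselves, so the same function covers both verbosity levels.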
First Nested NamedAxisArray
Moving up one level in the nesting, things start to get very messy:
julia> subject_data[1]
1-element NamedDimsArray(AxisArray(::Array{NamedDims.NamedDimsArray{(:information,),Any,1,AxisArray{Any,1,Array{Any,1},Tuple{Axis{Symbol,Int64,Array{Symbol,1},SimpleAxis{Int64,StaticRanges.OneToMRange{Int64}}}}}},1}
• axes:
session = [1]
))
1
1 [206440×31 DataFrame. Omitted printing of 25 columns
│ Row │ x1 │ x2 │ x3 │ x4 │ x5 │ x6 │
│ │ Float32 │ Float32 │ Float32 │ Float32 │ Float32 │ Float32 │
├────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┤
│ 1 │ -19.9952 │ -46.3864 │ -28.3298 │ 81.1222 │ -67.0026 │ 21.798 │
│ 2 │ -19.5624 │ -42.738 │ -28.3298 │ 80.7722 │ -66.5652 │ 21.7112 │
│ 3 │ -19.1296 │ -39.6108 │ -28.1576 │ 80.3346 │ -66.2153 │ 21.6243 │
│ 4 │ -18.9564 │ -38.5685 │ -27.7271 │ 79.5471 │ -65.9529 │ 21.6243 │
│ 5 │ -19.1296 │ -39.8714 │ -27.3826 │ 78.7595 │ -65.603 │ 21.6243 │
│ 6 │ -19.1296 │ -42.5643 │ -27.2104 │ 78.4969 │ -65.3406 │ 21.4507 │
│ 7 │ -18.9564 │ -45.1703 │ -27.2965 │ 78.4094 │ -64.9907 │ 21.1901 │
⋮
│ 206433 │ 200.038 │ 1.73732 │ 135.708 │ -102.65 │ -18.7187 │ -110.988 │
│ 206434 │ 198.22 │ -3.3009 │ 133.469 │ -102.475 │ -20.2057 │ -111.074 │
│ 206435 │ 196.403 │ -7.81793 │ 131.919 │ -101.862 │ -21.4303 │ -110.467 │
│ 206436 │ 194.498 │ -12.6824 │ 131.402 │ -100.987 │ -22.0426 │ -109.338 │
│ 206437 │ 192.334 │ -18.5024 │ 132.005 │ -100.987 │ -21.7802 │ -108.122 │
│ 206438 │ 189.824 │ -25.3648 │ 132.952 │ -102.387 │ -21.0804 │ -107.861 │
│ 206439 │ 187.314 │ -32.401 │ 133.469 │ -104.313 │ -20.6431 │ -108.903 │
│ 206440 │ 185.496 │ -38.3947 │ 133.297 │ -105.538 │ -20.818 │ -110.467 │, 31×4 DataFrame
│ Row │ name │ x │ y │ z │
│ │ String │ Float64 │ Float64 │ Float64 │
├─────┼────────┼─────────┼─────────┼─────────┤
│ 1 │ FP1 │ 0.83 │ 0.27 │ 0.48 │
│ 2 │ FP2 │ 0.83 │ -0.27 │ 0.48 │
│ 3 │ F3 │ 0.5 │ 0.4 │ 0.77 │
│ 4 │ F4 │ 0.5 │ -0.4 │ 0.77 │
│ 5 │ C3 │ 0.0 │ 0.51 │ 0.86 │
│ 6 │ C4 │ 0.0 │ -0.51 │ 0.86 │
│ 7 │ P3 │ -0.5 │ 0.4 │ 0.77 │
⋮
│ 24 │ P4" │ -0.66 │ -0.37 │ 0.65 │
│ 25 │ PZ" │ -0.72 │ -0.0 │ 0.69 │
│ 26 │ OZ │ -0.88 │ -0.0 │ 0.48 │
│ 27 │ I │ -0.97 │ -0.0 │ 0.23 │
│ 28 │ CB1" │ -0.93 │ 0.3 │ 0.23 │
│ 29 │ CB2" │ -0.93 │ -0.3 │ 0.23 │
│ 30 │ CB1 │ -0.79 │ 0.57 │ 0.23 │
│ 31 │ CB2 │ -0.79 │ -0.57 │ 0.23 │, 151×8 DataFrame. Omitted printing of 2 columns
│ Row │ onset │ duration │ sample │ trial_type │ response_time │ stim_file │
│ │ Float64 │ String │ String │ String │ String │ String │
├─────┼─────────┼──────────┼────────┼────────────┼───────────────┼────────────┤
│ 1 │ 5.035 │ n/a │ n/a │ stimulus │ 335 │ 105064.jpg │
│ 2 │ 5.37 │ n/a │ n/a │ response │ n/a │ n/a │
│ 3 │ 6.837 │ n/a │ n/a │ stimulus │ n/a │ 38068.jpg │
│ 4 │ 8.651 │ n/a │ n/a │ stimulus │ 289 │ 136095.jpg │
│ 5 │ 8.94 │ n/a │ n/a │ response │ n/a │ n/a │
│ 6 │ 10.801 │ n/a │ n/a │ stimulus │ n/a │ 38014.jpg │
│ 7 │ 12.684 │ n/a │ n/a │ stimulus │ n/a │ 82063.jpg │
⋮
│ 144 │ 193.182 │ n/a │ n/a │ stimulus │ n/a │ 63093.jpg │
│ 145 │ 195.219 │ n/a │ n/a │ stimulus │ n/a │ 307043.jpg │
│ 146 │ 197.224 │ n/a │ n/a │ stimulus │ 482 │ 194061.jpg │
│ 147 │ 197.706 │ n/a │ n/a │ response │ n/a │ n/a │
│ 148 │ 199.145 │ n/a │ n/a │ stimulus │ 325 │ 49069.jpg │
│ 149 │ 199.47 │ n/a │ n/a │ response │ n/a │ n/a │
│ 150 │ 201.014 │ n/a │ n/a │ stimulus │ n/a │ 83070.jpg │
│ 151 │ 203.063 │ n/a │ n/a │ stimulus │ n/a │ 166026.jpg │, "+X", 1:206440, 1000]
The following seems quite messy in my opinion:
julia> subject_data[1]
1-element NamedDimsArray(AxisArray(::Array{NamedDims.NamedDimsArray{(:information,),Any,1,AxisArray{Any,1,Array{Any,1},Tuple{Axis{Symbol,Int64,Array{Symbol,1},SimpleAxis{Int64,StaticRanges.OneToMRange{Int64}}}}}},1}
• axes:
session = [1]
))
1
It would be nice if it could be more like:
julia> subject_data[1]
1-element Axis
• axes:
session = [1]
Furthermore, I am not sure what is happening, but the display recursively descends and prints the values of the deepest nested NamedAxisArray, with no explanation of, or information about, that nested NamedAxisArray's fields.
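For what it's worth, the header line that the REPL prints for an array comes from `summary(io, x)`, so a shorter header like the one suggested above mostly comes down to overloading that one method. A minimal sketch with a toy type (`LabeledVector` is hypothetical and only stands in for the real wrapper types, whose internals are not shown here):

```julia
# Toy stand-in for a named, keyed vector; NOT the AxisIndices.jl implementation.
struct LabeledVector{T,P<:AbstractVector{T}} <: AbstractVector{T}
    parent::P
    name::Symbol
    labels::Vector{Symbol}
end

# Minimal AbstractArray interface so the type can be indexed and displayed.
Base.size(v::LabeledVector) = size(v.parent)
Base.IndexStyle(::Type{<:LabeledVector}) = IndexLinear()
Base.getindex(v::LabeledVector, i::Int) = v.parent[i]

# Short header: element count, a readable name, and the axis, with no nested type parameters.
function Base.summary(io::IO, v::LabeledVector)
    print(io, length(v), "-element LabeledVector(", v.name, " = ", v.labels, ")")
end

v = LabeledVector([10, 20], :session, [:a, :b])
println(summary(v))
```

The full parametric type is still available through `typeof(v)` when it is needed for debugging, so nothing is lost by keeping the header short.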
Highest Level
The highest level is where things become the most obfuscated:
julia> subject_data
1-element NamedDimsArray(AxisArray(::Array{NamedDims.NamedDimsArray{(:session,),NamedDims.NamedDimsArray{(:information,),Any,1,AxisArray{Any,1,Array{Any,1},Tuple{Axis{Symbol,Int64,Array{Symbol,1},SimpleAxis{Int64,StaticRanges.OneToMRange{Int64}}}}}},1,AxisArray{NamedDims.NamedDimsArray{(:information,),Any,1,AxisArray{Any,1,Array{Any,1},Tuple{Axis{Symbol,Int64,Array{Symbol,1},SimpleAxis{Int64,StaticRanges.OneToMRange{Int64}}}}}},1,Array{NamedDims.NamedDimsArray{(:information,),Any,1,AxisArray{Any,1,Array{Any,1},Tuple{Axis{Symbol,Int64,Array{Symbol,1},SimpleAxis{Int64,StaticRanges.OneToMRange{Int64}}}}}},1},Tuple{Axis{Int64,Int64,Array{Int64,1},SimpleAxis{Int64,StaticRanges.OneToMRange{Int64}}}}}},1}
• axes:
subject = [1]
))
Conclusion
In short, it would be nice to somehow adjust the verbosity of the output and maybe instead show something like this for the nested NamedAxisArray:
julia> subject_data
┌1-element Axis
│ • axes:
│ subject = [1]
│ ┌1-element Axis
│ │ • axes:
│ │ session = [1]
│ │ ┌6-element Axis
│ │ │ • axes:
│ │ │ information = [:data, :electrodes, :events, :nosedir, :times, :sampling_freq]
│ │ └ NamedDimsArray(AxisArray(::Array{Any,1}))
│ └ NamedDimsArray(AxisArray(::Array{Any,1}))
└ NamedDimsArray(AxisArray(::Array{Any,1}))
Of course, it's not perfect, but I like it a bit better than what is currently displayed.
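To make the idea a bit more concrete, here is a minimal sketch of that kind of recursive, indented printing using only Base Julia. The `Level` struct and `print_outline` function are hypothetical stand-ins invented for this example; they do not reflect how NamedAxisArray is implemented:

```julia
# Hypothetical stand-in for one nesting level: an axis name, its keys,
# and one child per key (either another Level or a plain value).
struct Level{K}
    name::Symbol
    keys::Vector{K}
    children::Vector{Any}
end

# Leaves: print only a short summary (type, plus size for arrays).
print_outline(io::IO, x, indent::Int = 0) = println(io, "  "^indent, summary(x))

# Nested levels: print a one-line header, then recurse one step deeper.
function print_outline(io::IO, x::Level, indent::Int = 0)
    println(io, "  "^indent, length(x.keys), "-element axis: ", x.name, " = ", x.keys)
    for child in x.children
        print_outline(io, child, indent + 1)
    end
end

# Example mirroring the subject/session/information nesting with stand-in values.
info = Level(:information, [:nosedir, :times, :sampling_freq], Any["+X", 1:206440, 1000])
tree = Level(:subject, [1], Any[Level(:session, [1], Any[info])])
print_outline(stdout, tree)
```

Using dispatch for the leaf and nested cases keeps the recursion short, and the indentation depth could be swapped for box-drawing characters like the ones in the mock-up above.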
What do you think, @Tokazama? I feel like this would actually lead to more readable tracebacks and make debugging issues easier.
I wonder if AbstractTrees.jl would help with this... Seems promising!
I think this is probably a better direction to go as a graph/tree could solve a lot of this. I've contemplated doing more with this through AxisGraphs.jl, but I'm still incubating ideas on the specifics of how to approach this (I have a fair amount of scratch code thinking through this, but it's not organized or tested yet).
Did you have any specific ideas for what you'd like to get out of something like AbstractTrees.jl in terms of interface, or is printing your main concern?