DataFrames.jl
Unsigned Int displayed as Int
I'm not sure if it is by design, but I was slightly surprised by the way UInt values are displayed in a DataFrame compared to regular vectors, i.e. decimal vs. hexadecimal notation:
julia> using DataFrames
julia> v = UInt8[1,125,253]
3-element Vector{UInt8}:
 0x01
 0x7d
 0xfd
julia> DataFrame(:v => v)
3×1 DataFrame
 Row │ v
     │ UInt8
─────┼───────
   1 │     1
   2 │   125
   3 │   253
Yes - we have more such inconsistencies - especially for the Bool type. The question is - do you think it is problematic? Note that we display the eltype on top.
CC @ronisbr
Certainly not a big deal and mostly about aesthetics, yet I had to do a double take at the header to make sure no unexpected conversion had happened behind the scenes.
Also, I'd argue that when dealing with bit patterns or masks, a lot is lost with decimal notation:
julia> v = [0x00ff, 0xff00]
2-element Vector{UInt16}:
 0x00ff
 0xff00
julia> DataFrame(:v => v)
2×1 DataFrame
 Row │ v
     │ UInt16
─────┼────────
   1 │    255
   2 │  65280
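For readers who need the bit patterns visible regardless of how the table is rendered, one possible workaround is to keep a text column next to the numeric one. This is only a sketch using Base functions; the variable names `masks`, `hex`, and `bits` are made up for illustration.

```julia
# Bit masks stored as unsigned integers.
masks = UInt16[0x00ff, 0xff00]

# Hexadecimal text, zero-padded to the full width of the type
# (two hex digits per byte).
hex = string.(masks; base = 16, pad = 2 * sizeof(eltype(masks)))

# Binary text; bitstring always pads to the full bit width of the type.
bits = bitstring.(masks)
```

Such a vector could then be added as an extra column, for example `DataFrame(:v => masks, :hex => hex)`, at the cost of storing the values twice.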
@ronisbr - do you remember why we decided to go this way (apart from handling Bool as a special case, which I think we can keep as is - i.e. printing true and false)?
I think we are not handling any special cases. After a very long discussion with me, you, and @nalimilan, we decided to be consistent with print. What we have is exactly what is obtained from print, in all cases.
I think the only changes were on nothing and missing.
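The difference between the two conventions can be seen in plain Julia, without any table involved. The sketch below uses `string`, which yields the text that `print` writes, and `repr`, which yields the text that `show` writes; the variable names are only illustrative.

```julia
x = 0x7d                      # a UInt8 literal

printed = string(x)           # text written by print(x): decimal
shown   = repr(x)             # text written by show(x): hexadecimal

# Strings differ too: print omits the quotes, show keeps them.
printed_str = string("abc")
shown_str   = repr("abc")

# Bool looks the same under both conventions.
printed_bool = string(true)
shown_bool   = repr(true)
```

Here `printed` is "125" while `shown` is "0x7d", which matches the decimal output in the DataFrame and the hexadecimal output for the plain vector.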
I forgot to mention something! If you want something close to what Julia uses by default in the REPL, you can change the renderer to show using:
julia> show(df, renderer = :show)
3×1 DataFrame
 Row │ v
     │ UInt8
─────┼───────
   1 │  0x01
   2 │  0x7d
   3 │  0xfd
@nalimilan - I guess, especially given the last comment by @ronisbr, we can close this. OK?
Well, yeah, at least it works as intended. IIRC I advocated using show for most types except a few special common types (like strings) during the long discussion we had when moving to PrettyTables. Anyway we can change this after 1.0 if we want.
Instead of hardcoding either one, I was wondering if it has been considered to put some of these settings in some global variable that users can easily customize to their taste (similar to how options() works in R).
Besides renderer, users might also have preferences regarding nosubheader, show_row_number, and hlines, to name a few.
I guess the advanced user can already achieve this by carefully overriding Base.show, but it sounds rather cumbersome.
I would prefer to have it documented how to do it. We tried very hard for years to avoid global state in DataFrames.jl, as having such state is error-prone and does not play well with multi-threading.
The very first PrettyTables.jl implementation I tried to do here in DataFrames.jl had a global state that you could modify. However, this caused a large slowdown in the time to print the first table (it was almost 3x slower).
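A user-side alternative that needs no global state inside the package might look like the sketch below. It assumes that keyword arguments such as renderer are accepted by show for a DataFrame, as in the call shown earlier in this thread, which may depend on the installed DataFrames.jl and PrettyTables.jl versions. The names `MY_TABLE_DEFAULTS`, `table_options`, and `showtable` are hypothetical and not part of any package.

```julia
# Hypothetical personal defaults, kept in the user's own code.
# `renderer = :show` is the keyword used earlier in this thread.
const MY_TABLE_DEFAULTS = (renderer = :show,)

# Combine the defaults with per-call overrides; keys given later win.
table_options(; overrides...) = (; MY_TABLE_DEFAULTS..., overrides...)

# Hypothetical helper: forwards the merged options to `show`.
# Whether each keyword is accepted depends on the installed versions.
showtable(df; overrides...) = show(stdout, df; table_options(; overrides...)...)
```

With this in a startup file, `showtable(df)` would use the personal defaults and `showtable(df; allrows = true)` would add a per-call option, while the package itself stays free of mutable settings.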