StatsBase.jl

describe() for Dates

Open jtrakk opened this issue 5 years ago • 2 comments

describe() doesn't give much information about Dates.

DataFrames.describe(Dates.Date.(2000:2005))
Summary Stats:
Length:         6
Type:           Dates.Date
Number Unique:  6

whereas describe(1:4) gives more info.

Summary Stats:
Length:         4
Missing Count:  0
Mean:           2.500000
Minimum:        1.000000
1st Quartile:   1.750000
Median:         2.500000
3rd Quartile:   3.250000
Maximum:        4.000000
Type:           Int64

jtrakk, Oct 13 '20 18:10

Currently we only compute these statistics for numbers. What we could do instead is try to compute them and, if that fails, fall back to the output for categorical data.

nalimilan, Oct 15 '20 13:10
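
The "try, then fall back" idea in the comment above could look roughly like the sketch below. It is only an illustration, not StatsBase's implementation: `summarize_or_fallback` is a hypothetical helper name, and the sketch assumes that unsupported arithmetic (such as taking the mean of `Date` values) surfaces as a `MethodError`.

```julia
using Dates, Statistics

# Hypothetical helper, not part of StatsBase or DataFrames.
# Try the numeric summary first; if the element type does not support the
# required arithmetic, fall back to a categorical-style summary.
function summarize_or_fallback(x::AbstractVector)
    try
        return (kind = :numeric,
                length = length(x),
                mean = mean(x),
                min = minimum(x),
                median = median(x),
                max = maximum(x),
                eltype = eltype(x))
    catch err
        # Assumption: unsupported operations show up as a MethodError.
        # Anything else is a genuine error and is rethrown.
        err isa MethodError || rethrow()
        return (kind = :categorical,
                length = length(x),
                nunique = length(unique(x)),
                eltype = eltype(x))
    end
end

summarize_or_fallback(1:4)                 # takes the numeric branch
summarize_or_fallback(Date.(2000:2005))    # takes the categorical fallback
```

A more granular variant could try each statistic separately, so that types such as `Date`, which are ordered but do not support a mean, would still report a minimum and maximum.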

When the dates are embedded in a DataFrame, describe() does give min/max:

describe(DataFrame(a=Date.(2000:2005)))
1×8 DataFrame
│ Row │ variable │ mean    │ min        │ median  │ max        │ nunique │ nmissing │ eltype   │
│     │ Symbol   │ Nothing │ Date       │ Nothing │ Date       │ Int64   │ Nothing  │ DataType │
├─────┼──────────┼─────────┼────────────┼─────────┼────────────┼─────────┼──────────┼──────────┤
│ 1   │ a        │         │ 2000-01-01 │         │ 2005-01-01 │ 6       │          │ Date     │

but it's missing the other quantiles.

jtrakk, Oct 15 '20 18:10
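
As for the missing quantiles, one possible workaround is to compute them on day offsets and map the result back to a `Date`. This is a sketch under stated assumptions, not something either package provides: `date_quantile` is a hypothetical name, and rounding the interpolated offset to the nearest whole day is an arbitrary choice made here.

```julia
using Dates, Statistics

# Hypothetical helper: quantile of a vector of Dates.
# Dates are converted to whole-day offsets from the earliest date, the
# quantile is taken on those integers, and the result is mapped back.
function date_quantile(dates::AbstractVector{Date}, p::Real)
    origin = minimum(dates)
    offsets = Dates.value.(dates .- origin)   # Date - Date gives a Day period
    # Assumption: round the interpolated offset to the nearest whole day.
    return origin + Day(round(Int, quantile(offsets, p)))
end

d = Date.(2000:2005)
quartiles = [date_quantile(d, p) for p in (0.25, 0.5, 0.75)]
```

The quantiles at `p = 0` and `p = 1` coincide with `minimum(d)` and `maximum(d)`, which matches the `min` and `max` columns that the DataFrame version of `describe` already reports.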