distributions3
Rounding/nicer formatting for print methods
Down the road we might want to consider how to format the numbers coming out of print methods for the distributions. In this example (https://github.com/alexpghayes/distributions/pull/5#issuecomment-493255829) we get something like
fit(normal(), rnorm(100))
#> normal distribution (mu = -0.0962394013206253, sigma = 1.09850673011021)
fit(exponential(), rexp(100))
#> Exponential distribution (rate = 0.827082214289625)
where I feel like displaying around 15 significant digits might be a little overwhelming, especially if we intend this to be accessible to new users.
I was going to mention this myself. Is there a system/user significant digits variable that can be honoured?
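For reference, base R does have such a setting: the global `digits` option, which `print()` and `format()` consult by default. The following is a minimal sketch of how it behaves, using one of the estimates shown above; it illustrates the option itself, not anything the package currently does.

```r
# One of the fitted values from the example above
x <- -0.0962394013206253

# The session-wide setting for significant digits (7 unless the user changed it)
getOption("digits")

# format() uses that option by default ...
format(x)

# ... and accepts an explicit override
format(x, digits = 3)

# A user can change the option for the whole session and restore it afterwards
old <- options(digits = 4)
format(x)
options(old)
```

A print method that reads `getOption("digits")` would therefore pick up whatever the user has configured.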
There's round() but I think that's not ideal. I know the tidyverse crew has dealt with this for tibble printing, so maybe we can root around in there?
I had a quick look there and it looks like the tidyverse function lives in r-lib/pillar. I think it's the file with the format_decimal function. It's quite involved. round() might be a good solution for the time being.
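One possible reason round() is not ideal (an assumption here, since the thread does not spell it out) is that it fixes the number of decimal places rather than the number of significant digits, so small parameter values can lose most of their information. A small sketch with a made-up rate value:

```r
# Hypothetical small rate parameter, chosen only for illustration
rate <- 0.000827082214289625

# round() keeps a fixed number of decimal places
round(rate, 3)

# signif() keeps a fixed number of significant digits
signif(rate, 3)

# format() with `digits` also works in significant digits and returns a string
format(rate, digits = 3)
```

With three decimal places the rate collapses to 0.001, while the significant-digit versions keep 0.000827.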
Why not provide a format() method that formats the parameters prior to including them in a string? And then the print() method could simply print the formatted string. For example for the normal distribution:
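format.Normal <- function(x, digits = max(1L, getOption("digits") - 3L), ...) {
  # Format each parameter vector separately, then build one label per distribution
  sprintf(
    "Normal(mu = %s, sigma = %s)",
    format(x$mu, digits = digits, ...),
    format(x$sigma, digits = digits, ...)
  )
}
print.Normal <- function(x, digits = max(1L, getOption("digits") - 3L), quote = FALSE, ...) {
  # Pass the caller's `quote` through rather than hard-coding FALSE
  print(format(x, digits = digits, ...), quote = quote)
  # Return the object invisibly, as base R print() methods do
  invisible(x)
}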
(Note: This follows some base R conventions regarding getOption("digits") and print() methods returning their first argument unmodified invisibly.)
With this you can do:
R> d <- Normal(0, sqrt(1:4))
R> d
[1] Normal(mu = 0, sigma = 1.000) Normal(mu = 0, sigma = 1.414)
[3] Normal(mu = 0, sigma = 1.732) Normal(mu = 0, sigma = 2.000)
Note that the same precision is used for all mus and separately for all sigmas but not jointly. This would be my preference but one could also arrange that the same precision is used across all parameters.
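The joint alternative could look roughly like the sketch below. `format_jointly()` is a hypothetical helper written for this illustration and is not part of distributions3; it assumes the parameters are passed as named numeric vectors. The idea is to format all values in a single `format()` call so that one common number of decimals is chosen, and then split the strings back per parameter.

```r
# Hypothetical helper: format several parameter vectors with one shared precision
format_jointly <- function(..., digits = max(1L, getOption("digits") - 3L)) {
  params <- list(...)
  # One format() call over all values, so every string gets the same decimals
  formatted <- format(unlist(params, use.names = FALSE), digits = digits)
  # Split the formatted strings back into one character vector per parameter
  idx <- rep(seq_along(params), lengths(params))
  out <- unname(split(formatted, idx))
  names(out) <- names(params)
  out
}

p <- format_jointly(mu = rep(0, 4), sigma = sqrt(1:4), digits = 4)
sprintf("Normal(mu = %s, sigma = %s)", p$mu, p$sigma)
```

Under this approach mu would be shown as 0.000 rather than 0, which is the trade-off against formatting each parameter separately.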
This was addressed in the vectorization of distributions (#71). We now have:
set.seed(0)
fit_mle(Normal(), rnorm(100))
## [1] "Normal distribution (mu = 0.02267, sigma = 0.8827)"
Normal(0, sqrt(1:4))
## [1] "Normal distribution (mu = 0, sigma = 1.000)"
## [2] "Normal distribution (mu = 0, sigma = 1.414)"
## [3] "Normal distribution (mu = 0, sigma = 1.732)"
## [4] "Normal distribution (mu = 0, sigma = 2.000)"