posterior icon indicating copy to clipboard operation
posterior copied to clipboard

In summarise draws, a function that returns a character changes all columns to character

Open n-kall opened this issue 1 year ago • 13 comments

If a summary function that returns a character is used in summarise_draws, it makes all columns <chr> which messes up formatting.

example_draws() |>
   summarise_draws(char_fun = \(x) "word", mean)

Returns

# A tibble: 10 × 3
   variable char_fun      mean
   <chr>    <chr>         <chr>
 1 mu       word          4.1799990610081
 2 tau      word          4.16356885610418
 3 theta[1] word          6.74893947963962
 4 theta[2] word          5.25331634990726
 5 theta[3] word          3.04393475572924
 6 theta[4] word          4.85842854299997
 7 theta[5] word          3.22258991784476
 8 theta[6] word          3.98696993634695
 9 theta[7] word          6.5030995214442
10 theta[8] word          4.56520199862817

This could be fixed by using type.convert(out, as.is = TRUE) on the output tibble in summarise_draws_helper

n-kall avatar Mar 14 '24 09:03 n-kall

This suggests to me that the output is being bound into a matrix (which requires a single type) and then converted to a data frame later, which it should probably not be. Rather than potentially round-tripping through a string (with the possibility for information loss) I would suggest binding results together into a data frame directly, which should be more efficient anyway.

mjskay avatar Mar 14 '24 16:03 mjskay

This suggests to me that the output is being bound into a matrix

Yes, this is indeed the case

n-kall avatar Mar 14 '24 16:03 n-kall

I agree. that would make sense

n-kall @.***> schrieb am Do., 14. März 2024, 18:52:

This suggests to me that the output is being bound into a matrix

Yes, this is indeed the case

— Reply to this email directly, view it on GitHub https://github.com/stan-dev/posterior/issues/355#issuecomment-1997904615, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADCW2ACGQU7UVGXE5IGYAHDYYHIVHAVCNFSM6AAAAABEVXULC2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSOJXHEYDINRRGU . You are receiving this because you are subscribed to this thread.Message ID: @.***>

paul-buerkner avatar Mar 14 '24 16:03 paul-buerkner

It seems this is also caused by the behaviour of unlist, which results in a vector, and when performed on a list with different types inside, will convert them all to the same (in this case character). This unlisting is done to handle summary functions that return multiple elements (e.g. quantile2).

However, I don't know of a base R equivalent to purrr::flatten which would do what we want in this case (returning a list, but without nesting).

n-kall avatar Mar 15 '24 09:03 n-kall

Would unlist(mylist, recursive = FALSE) already to the trick?

paul-buerkner avatar Feb 27 '25 12:02 paul-buerkner

I don't think so, at least initial testing adding recursive = FALSE in the unlist calls doesn't seem to change anything

n-kall avatar Feb 27 '25 13:02 n-kall

What would purrr::flatten do instead? Not sure I get how this would fix things.

paul-buerkner avatar Feb 27 '25 14:02 paul-buerkner

I think purrr::flatten always returns a list, whereas unlist will return a vector even if recursive = F if there are no nested lists

x = list(1, 2,3)

unlist(x, recursive = F)
#> 1 2 3

purrr::flatten(x)
# [[1]] 
# [1] 1
#  [[2]] 
# [1] 2
# [[3]] 
# [1] 3 

n-kall avatar Feb 27 '25 14:02 n-kall

I see. makes sense. and would the use of purrr::flatten fix the issue?

paul-buerkner avatar Feb 27 '25 14:02 paul-buerkner

I think so. I can test tomorrow. Are you thinking to copy the function over to posterior?

n-kall avatar Feb 27 '25 14:02 n-kall

I am not sure yet. If it works, perhaps you can make a PR using purrr::flatten and I will then edit the code to no longer require purrr.

paul-buerkner avatar Feb 27 '25 14:02 paul-buerkner

Sure, I can edit the existing PR

n-kall avatar Feb 27 '25 17:02 n-kall

Or any other approach like direct data.frame conversion, if this is also efficient.

paul-buerkner avatar Feb 27 '25 17:02 paul-buerkner