tern
tern copied to clipboard
[Feature Request]: different factor levels for different split contexts
Feature description
similar to https://github.com/insightsengineering/chevron/pull/343 where different precisions are needed, we have similar cases that, when splitting by "parameters", the levels need to be adjusted.
library(dplyr)
data <- expand.grid(
USUBJID = c("1", "2", "3"),
ARM = c("arm a", "arm b"),
PARAMCD = c("test a", "test b", "test c"),
AVISIT = c("VISIT 1", "VISIT 2", "VISIT 3")
) %>%
mutate(
AVALCAT = case_when(
PARAMCD == "test a" ~ sample(c("high", "low"), size = n(), replace = TRUE),
PARAMCD == "test b" ~ sample(c("normal", "abnormal"), size = n(), replace = TRUE),
PARAMCD == "test c" ~ sample(c("high", "middle", "low"), size = n(), replace = TRUE),
TRUE ~ sample(c("high", "low"), size = n(), replace = TRUE),
)
) %>%
df_explicit_na
basic_table() %>%
split_cols_by("ARM") %>%
split_rows_by("PARAMCD") %>%
split_rows_by("AVISIT") %>%
summarize_vars("AVALCAT") %>%
build_table(data)
even if we there are no actual categories in that parameter, the levels are there; pruning is removing all zeros so maybe not optimal if we want to keep some 0 count rows. Best if it could be achieved in analysis part.
Looping @barnett11 for your comments. This arise up when I review a template of egt05_qtcat_1 where Tim asks if by parameter analysis is possible
Code of Conduct
- [X] I agree to follow this project's Code of Conduct.
Contribution Guidelines
- [X] I agree to follow this project's Contribution Guidelines.
Security Policy
- [X] I agree to follow this project's Security Policy.
Hi @clarkliming - yes this will definitely be needed once we get into further TA specific templates. Some examples include SSRST01, PROT03 and SLAMPT01
then @khatril please consider this in the next increment
I have the feeling that this is very template specific. The solution would be a switcher (more or less complicated, depending on template) inside the analysis function based on the split_context. At this stage, also selected empty levels can be dropped with NULL
. Eventually, they can be selectively trimmed afterward.
will trim_levels_in_group
or trim_levels_to_map
work here?
I would personally focus on the analysis function as @barnett11 mentioned that it is also important to have a specific order of the output which may change for different split contexts. Surely trimming beforehand could simplify the analysis function
#trim_levels_to_map
is specifically designed to do exactly this thing (including leaving desired 0s in while removing undesired level combinations, even when they are not unobserved)
The map also controls the ordering of the levels:
data <- data.frame(LBCAT = c(rep("a", 4), rep("b", 4)),
PARAM = c(rep("param1", 4), rep("param2", 4)),
VISIT = rep(c("V1", "V2"), 4),
ABN = rep(c("H", "L"), 4),
stringsAsFactors = TRUE)
map <- data.frame(LBCAT = c(rep("a", 4), rep("b", 4)),
PARAM = c(rep("param1", 4), rep("param2", 4)),
VISIT = rep(c("V1", "V1", "V2", "V2"), 2),
ABN = rep(c("H", "L"), 4),
stringsAsFactors = FALSE)
tbl <- basic_table() %>%
split_rows_by("LBCAT", split_fun = trim_levels_to_map(map = map)) %>%
split_rows_by("PARAM", split_fun = trim_levels_to_map(map = map)) %>%
split_rows_by("VISIT", split_fun = trim_levels_to_map(map = map)) %>%
analyze("ABN") %>%
build_table(df = data)
> map
LBCAT PARAM VISIT ABN
1 a param1 V1 H
2 a param1 V1 L
3 a param1 V2 H
4 a param1 V2 L
5 b param2 V1 H
6 b param2 V1 L
7 b param2 V2 H
8 b param2 V2 L
> tbl
all obs
——————————————————
a
param1
V1
H 2
L 0
V2
H 0
L 2
b
param2
V1
H 2
L 0
V2
H 0
L 2
Vs
map2 <- data.frame(LBCAT = c(rep("a", 4), rep("b", 4)),
PARAM = c(rep("param1", 4), rep("param2", 4)),
VISIT = rep(c("V2", "V2", "V1", "V1"), 2),
ABN = rep(c("H", "L", "L", "H"), 2),
stringsAsFactors = FALSE)
tbl2 <- basic_table() %>%
split_rows_by("LBCAT", split_fun = trim_levels_to_map(map = map2)) %>%
split_rows_by("PARAM", split_fun = trim_levels_to_map(map = map2)) %>%
split_rows_by("VISIT", split_fun = trim_levels_to_map(map = map2)) %>%
analyze("ABN") %>%
build_table(df = data)
> map2
LBCAT PARAM VISIT ABN
1 a param1 V2 H
2 a param1 V2 L
3 a param1 V1 L
4 a param1 V1 H
5 b param2 V2 H
6 b param2 V2 L
7 b param2 V1 L
8 b param2 V1 H
> tbl2
all obs
——————————————————
a
param1
V2
H 0
L 2
V1
L 0
H 2
b
param2
V2
H 0
L 2
V1
L 0
H 2
Note the different orders of H
, and L
for the V1
and V2
parent facets. I believe this solves the stated issue out of the box. Am I missing something? @barnett11 @Melkiades
I think this is a viable option but it could be a bit cumbersome for longer tables. In my opinion, the best solution is to wrap prune_empty_level
in a way that keeps selected lines. The problem is how I feed the selected lines (aka path) when the pruning function does not see the whole row path of the element that is analyzed. In other words, if I get to the DataRow
element I can know its label but not the whole path that could be the one "to keep". I guess my question for @gmbecker is if there is a way to see always the row path in each pruning iteration of tt
. Otherwise, is there a way to do what I intend here?
This is my tentative solution:
library(dplyr)
library(tern)
library(rtables)
data <- expand.grid(
USUBJID = c("1", "2", "3"),
ARM = c("arm a", "arm b"),
PARAMCD = c("test a", "test b", "test c"),
AVISIT = c("VISIT 1", "VISIT 2", "VISIT 3")
) %>%
mutate(
AVALCAT = case_when(
PARAMCD == "test a" ~ sample(c("high", "low"), size = n(), replace = TRUE),
PARAMCD == "test b" ~ sample(c("normal", "abnormal"), size = n(), replace = TRUE),
PARAMCD == "test c" ~ sample(c("high", "middle", "low"), size = n(), replace = TRUE),
TRUE ~ sample(c("high", "low"), size = n(), replace = TRUE),
)
) %>%
df_explicit_na
data
mysplitfun <- make_split_fun(pre = list(drop_facet_levels),
post = list(add_overall_facet("ALL", "All Arms")))
a_tbl <- basic_table() %>%
split_cols_by("ARM") %>%
split_rows_by("PARAMCD") %>%
split_rows_by("AVISIT") %>%
summarize_vars("AVALCAT") %>%
build_table(data)
a_tbl
my_personal_pruning <- function(to_keep) {
function (tt) {
if(is(tt, "DataRow"))
browser()
print(row_paths(tt)[1])
if(row_paths(tt)[1] %in% to_keep)
return(FALSE)
rtables::prune_empty_level(tt)
}
}
prune_table(a_tbl, prune_func = my_personal_pruning(list(c("PARAMCD", "test a"))))