
Topic QC: final formatted rtable into computer readable format

Open imazubi opened this issue 2 years ago • 11 comments

Hi @gmbecker ,

I was wondering if there is a way to convert a final formatted rtable into a data frame that can be written out in a computer-readable format, so that QC can be done programmatically, e.g.:

  • using arsenal::comparedf() in R against a SAS table
  • or using PROC COMPARE in SAS, by a SAS programmer.
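As a rough illustration of the R-side comparison described above, here is a minimal sketch using arsenal::comparedf() on two small data frames. The frames r_side and sas_side, their columns, and their values are hypothetical stand-ins. In practice the SAS side would be read from a file exported by the SAS programmer.

```r
library(arsenal)

# Hypothetical flattened results: one row per table cell, values already formatted.
r_side <- data.frame(
  row_id = c("setosa.Mean", "versicolor.Mean", "virginica.Mean"),
  value  = c("1.46", "4.26", "5.55"),
  stringsAsFactors = FALSE
)

# Hypothetical SAS-side copy with one deliberately different cell.
sas_side <- r_side
sas_side$value[3] <- "5.56"

# Match rows on the key column and compare the remaining columns.
cmp <- comparedf(r_side, sas_side, by = "row_id")
summary(cmp)

# Number of differing cells found by the comparison.
n_diff <- n.diffs(cmp)
```

Comparing formatted strings rather than raw numbers keeps the check close to what appears in the final output, at the cost of being sensitive to rounding differences between the two systems.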

Some tables (e.g. AE tables) are large enough that visual QC is difficult. With the current csv output, it would not be possible to QC them programmatically.

library(rtables)

my_table <- basic_table() %>%
  split_rows_by("Species") %>%
  analyze("Petal.Length") %>%
  build_table(df = iris)

## export (note: export_as_tsv writes tab-separated values, despite the .csv extension)
export_as_tsv(my_table, "my_table.csv")
[image]

rtables is an incredible package and I believe the addition of this feature would be an amazing step forward to widely use the package in clinical trials.

imazubi avatar Mar 20 '22 17:03 imazubi

Hi @imazubi,

So there are a few different things here. First off, the output of that csv (actually of path_enriched_df; it's not related to writing the file) is bugged. My initial guess is that it has to do with the table only having one column, but I'm not sure yet. Regardless, that will get fixed.

At a higher level, we do have comparison facilities for actual rtables TableTree objects, which (optionally) take structure/pathing into account, so if the QCer is using R, I think that would probably be a better approach. See ?compare_rtables for that functionality as it exists now.
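A minimal sketch of that kind of comparison, assuming a production table and a QC table built from the same layout. The perturbed copy iris_qc is hypothetical and only there to force a difference, and the tol value is an illustrative choice, so check ?compare_rtables for the defaults in your installed version.

```r
library(rtables)

lyt <- basic_table() %>%
  split_rows_by("Species") %>%
  analyze("Petal.Length")

# "Production" table.
tbl_prod <- build_table(lyt, df = iris)

# Hypothetical QC data with one value changed so that the setosa mean differs.
iris_qc <- iris
iris_qc$Petal.Length[1] <- iris_qc$Petal.Length[1] + 10
tbl_qc <- build_table(lyt, df = iris_qc)

# Cell-by-cell comparison; tol is the numeric tolerance for calling two cells equal.
res_same <- compare_rtables(tbl_prod, tbl_prod, tol = 1e-6)
res_qc   <- compare_rtables(tbl_prod, tbl_qc, tol = 1e-6)

res_same
res_qc
```

The result is a character matrix with one entry per cell, so a difference can be located by row and column rather than by eye.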

Finally, part of the roadmap for this year's work that @waddella and I have put together is pushing those comparison/QC functionalities even further, so if that is a priority, please let us know.

gmbecker avatar Mar 21 '22 05:03 gmbecker

The acute problem with csv export should be fixed (with a regression test) in the above commit.

gmbecker avatar Mar 21 '22 05:03 gmbecker

QCing tables has a number of aspects. For comparing rtables tables with tables created with SAS we would need to know the structure of the data frames that are associated with the SAS tables. From my experience those are not generic and differ between different table types. @imazubi would it be possible to provide a couple of data frames from tables created in SAS which are currently used for QCing? Please make sure to use synthetic data.

For comparing different tables created with rtables, we have the compare_rtables function which, as @gmbecker mentioned, is on our roadmap to be refactored and enhanced.

For a general QC framework to check consistency among many tables with overlapping results, I would suggest building a path-based framework and then using value_at for the comparison. This, however, requires that the tables are standardized.
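One way such a path-based check might look, as a sketch only: both tables are built here from the same layout and data (ex_adsl ships with rtables), and every data cell is looked up by its row and column path. The names tbl_a, tbl_b and mismatches are hypothetical, and the exact path handling may differ between rtables versions.

```r
library(rtables)

lyt <- basic_table() %>%
  split_cols_by("SEX") %>%
  split_rows_by("RACE") %>%
  analyze("AGE")

# Two tables that are expected to agree (here trivially, same data and layout).
tbl_a <- build_table(lyt, df = ex_adsl)
tbl_b <- build_table(lyt, df = ex_adsl)

# Collect the paths of all data rows and of all columns.
rdf <- make_row_df(tbl_a)
data_paths <- rdf$path[rdf$node_class == "DataRow"]
cpaths <- col_paths(tbl_a)

# Compare the raw (unformatted) value of every cell, addressed by path.
mismatches <- list()
for (rp in data_paths) {
  for (cp in cpaths) {
    va <- value_at(tbl_a, rowpath = rp, colpath = cp)
    vb <- value_at(tbl_b, rowpath = rp, colpath = cp)
    if (!isTRUE(all.equal(va, vb))) {
      mismatches[[length(mismatches) + 1]] <- list(row = rp, col = cp)
    }
  }
}

length(mismatches)
```

Because cells are addressed by path rather than by position, the same loop could compare overlapping results across differently shaped tables, provided the paths are standardized.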

Maybe a quick win would be if we also output formatted cells in export_as_tsv. @imazubi would that help? Note though R & SAS have different rounding rules.
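To illustrate the rounding caveat: R's round() resolves exact ties to the even digit, whereas SAS's ROUND function rounds ties away from zero. The helper round_half_away below is a hypothetical, naive emulation for illustration only; it does not reproduce SAS's handling of floating-point representation error.

```r
# Base R: exact ties go to the even digit.
round(0.5)
round(2.5)

# Hypothetical helper: round ties away from zero.
round_half_away <- function(x, digits = 0) {
  scale <- 10^digits
  sign(x) * floor(abs(x) * scale + 0.5) / scale
}

round_half_away(0.5)
round_half_away(2.5)
round_half_away(-2.5)
```

A formatted cell can therefore differ in its last digit between the two systems even when the underlying raw values agree.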

waddella avatar Mar 21 '22 09:03 waddella

Hi @waddella and @gmbecker, thanks for these comments. Happy that I unintentionally helped uncover the bug @gmbecker mentions :)

We will get back to you with some SAS tables by the end of April. @waddella, as you say, I think it would be a huge quick win to be able to output formatted cells in export_as_tsv, and also to add the column counts to the export_as_tsv output (I see there is already an issue created for this). That way, SAS programmers would be able to read this CSV and have access to the final formatted values to compare.

Let's keep in touch.

imazubi avatar Mar 28 '22 19:03 imazubi

There are issues with putting the counts in in a way that would be machine accessible. They would essentially have to be the first row of observations, which would muddy the waters a great deal in terms of the format/meaning of the remaining rows.

The bottom line is that an rtables table is a much more information-rich object than the tsv/csv format supports. I'm happy to see what we can do in this regard, but it is important that we understand the limitations of the target format here...

gmbecker avatar Mar 28 '22 19:03 gmbecker

Hi @gmbecker, I see, and I totally agree that an rtables table is a much more information-rich object than the tsv/csv format supports. I mentioned this because, when QC-ing, the first thing folks check is the column counts. However, as I said, I think being able to output formatted cells in export_as_tsv would be a huge win.

imazubi avatar Apr 05 '22 12:04 imazubi

There are 3 things that I can think of that can happen here:

  1. output values remain as is (the raw values)
  2. output values are now the formatted values
  3. each column is duplicated, essentially, and both are included in the output.

I see downsides to each of these approaches, to be honest, so figuring out what is the right thing to do here is going to take careful thinking and closer collaboration on this feature.
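To make the three options concrete, here is a small base-R sketch that deliberately does not use rtables. The column names mean_raw and mean_fmt and the two-decimal format are assumptions chosen for illustration, not what export_as_tsv produces.

```r
# Raw per-species means, standing in for the raw cell values of a table.
raw <- tapply(iris$Petal.Length, iris$Species, mean)

out <- data.frame(
  Species  = names(raw),
  mean_raw = as.numeric(raw),                   # option 1: raw values
  mean_fmt = sprintf("%.2f", as.numeric(raw)),  # option 2: formatted values
  stringsAsFactors = FALSE
)

# Option 3 corresponds to keeping both columns side by side, as in `out`.
out
```

Even in this toy case the trade-off is visible: the raw column supports numeric tolerance checks, the formatted column matches what is displayed, and carrying both doubles the width of the export.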

gmbecker avatar Jun 15 '22 21:06 gmbecker

@imazubi Revisiting this. Is this feature still something you need? If so, as I said in my last comment, we need to have some more discussion about exactly what this feature would do.

If not, I'd like to close the issue for now. Please let me know.

gmbecker avatar Aug 31 '22 17:08 gmbecker

@gmbecker I will get back to you next week.

imazubi avatar Sep 01 '22 20:09 imazubi

Hi @gmbecker .

  • Here is a formatted iris table created in SAS and exported to a CSV: [image]
  • And here is the rtable exported to a csv file: [image]

I feel that at least being able to obtain a formatted rtable in a computer-readable format would be a good first step towards one day being able to do programmatic QC using rtables. What do you guys think?

imazubi avatar Sep 13 '22 14:09 imazubi

Hi @gmbecker,

I realized that by using matrix_form(tt, indent_rownames = TRUE)$strings, we can write the final formatted table in a computer-readable format, so I feel we are good on this side as well.

library(rtables)

adsl <- ex_adsl

my_table <- basic_table() %>%
  split_cols_by("SEX") %>%
  split_rows_by("RACE") %>%
  analyze("AGE")  %>%
  build_table(df = adsl)


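# character matrix of the formatted cells, including header and row labels
my_tab2 <- matrix_form(my_table, indent_rownames = TRUE)$strings

# write.table() returns NULL, so its result is not assigned
utils::write.table(
  x = my_tab2,
  file = "my_tab.csv",
  sep = ",",
  col.names = FALSE,
  row.names = TRUE,
  append = FALSE
)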

[image]

imazubi avatar Nov 07 '22 07:11 imazubi

@imazubi this is possible with rtables::as_result_df: with the as_is = TRUE option you get a data.frame, which can then easily be exported to Excel.

Melkiades avatar Apr 11 '24 13:04 Melkiades