pointblank
clinically-oriented codebook/data dictionary
Proposal
Hi Rich - just one particular viewpoint: it can be really helpful to be able to generate a codebook for clinical data in a semi-automated fashion, a lot like a REDCap codebook. This is a bit of a wish list, but here goes:
- make a tibble that can be exported to things like Excel
  a. option to make pretty PDF, HTML, Word, Rmd if desired
- include columns for
  a. variable_name
  b. variable_pretty (still short, but with spaces, title case for tables) - may require user to enter if no labels
  c. variable description (long & detailed) - likely to require user to enter
  d. for numeric - mean, median, range
  e. for factors - each level
  f. for all vars - % missing
- somehow handle labels coming in from SAS, SPSS, Stata
  a. one suggestion - link values to value labels, i.e. values 0-3, with labels "none", "mild", "moderate", "severe", as 0_none, 1_mild, 2_moderate, 3_severe
  b. pull in variable labels - possibly as variable_pretty
It could have more detail - see the data dictionary from REDCap. An example of a standard REDCap data dictionary can be found here: https://www.utsouthwestern.edu/edumedia/edufiles/about_us/admin_offices/academic_information_services/redcap-database-creating-dictionary.pdf
I hope that this is helpful. This could be a function like make_codebook() or make_data_dictionary().
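To make the wish list concrete, here is a minimal base R sketch of what such a function might return. Everything in it is an assumption for illustration: make_codebook() does not exist in pointblank, the `pretty` and `description` arguments are hypothetical ways for the user to enter text by hand, and the column names simply follow the list above. It returns a plain data frame, which could be converted to a tibble afterwards.

```r
# Hypothetical sketch; make_codebook() is not an existing pointblank function.
make_codebook <- function(data, pretty = list(), description = list()) {
  # Use a user-supplied entry when one exists, otherwise fall back to a default
  pick <- function(overrides, nm, default) {
    if (nm %in% names(overrides)) overrides[[nm]] else default
  }
  # Summarise a numeric column; NA for non-numeric or all-missing columns
  num_stat <- function(x, f) {
    if (is.numeric(x) && any(!is.na(x))) as.numeric(f(x, na.rm = TRUE)) else NA_real_
  }
  rows <- lapply(names(data), function(nm) {
    x <- data[[nm]]
    # A "label" attribute is how imported variable labels are commonly stored
    lbl <- attr(x, "label", exact = TRUE)
    # Default pretty name: the label if present, else the name with spaces
    default_pretty <- if (is.null(lbl)) gsub("[_.]+", " ", nm) else as.character(lbl)[1]
    data.frame(
      variable_name = nm,
      variable_pretty = pick(pretty, nm, default_pretty),
      variable_description = pick(description, nm, NA_character_),
      mean = num_stat(x, mean),
      median = num_stat(x, median),
      min = num_stat(x, min),
      max = num_stat(x, max),
      levels = if (is.factor(x) || is.character(x)) {
        paste(levels(factor(x)), collapse = ", ")
      } else {
        NA_character_
      },
      pct_missing = 100 * mean(is.na(x)),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Made-up example data, one numeric and one factor column
df <- data.frame(
  age = c(30, 40, NA, 50),
  severity = factor(c("0_none", "1_mild", "1_mild", NA))
)
cb <- make_codebook(
  df,
  pretty = list(age = "Age"),
  description = list(age = "Age in years at enrollment")
)
```

Because the result is an ordinary data frame, it can be written out with write.csv() for Excel, or passed to a table package for HTML, PDF, or Word output.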
Peter, this is very helpful, thank you for providing these detailed requirements! I may need your help in the near future with obtaining SAS-, SPSS-, Stata-based inputs.
Sure - anytime via email or Twitter at @ibddoctor.
There are lots of challenges with labels. I tend to link them (0_failed, 1_success) to make sure they don't get confused or lost/flipped, as happened in the JAMA asthma intervention trial. They swapped 0 for 1 for their response variable and got it backward, interpreting the intervention as helpful when it was actually harmful. They had to withdraw the paper a year later, when this was discovered by someone doing a secondary analysis.
Peter
Once they are linked, you can recover the numeric version with parse_number(), and you can recover the text label with str_sub(3, -1) - though it would be nice to wrap this in a nicer function name.
Peter
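As a small illustration of that round trip, assuming parse_number() is the readr function and str_sub() the stringr one (the `linked` vector is made-up example data):

```r
library(readr)    # parse_number()
library(stringr)  # str_sub(), str_remove()

# Made-up linked values in the <code>_<label> form
linked <- c("0_failed", "1_success", "1_success", "0_failed")

# Numeric codes: parse_number() picks up the first number in each string
codes <- parse_number(linked)

# Text labels: drop the first two characters (the digit and the underscore)
labels <- str_sub(linked, 3, -1)

# str_sub(3, -1) assumes a one-digit code; stripping everything up to the
# first underscore also handles codes such as "10_severe"
labels_any_width <- str_remove(c(linked, "10_severe"), "^[^_]*_")
```

The fixed offset in str_sub() is the fragile part, which is one reason a dedicated wrapper function would be nicer.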
In theory, 3 helper functions:
1. link_label() - links the stored number to its label, as in (0_no, 1_yes), and also makes the variable into a factor, with levels ordered by the number.
2. parse_number() - pulls the original integer from the linked label (1_female, 2_male), formatted as an integer to allow doing math.
3. pretty_label() - pulls the original label from the linked label (0_none, 1_mild, 2_moderate, 3_severe) and puts it into title case, as a factor with levels ordered by the original number. For making pretty tables and graph axis labels.
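A rough base R sketch of these three helpers could look like the following. None of these functions exist in any package; the signatures are assumptions. The second helper is called linked_number() here, a hypothetical name chosen only so that it does not mask readr's parse_number().

```r
# Hypothetical helpers; the names and signatures are illustrative only.

# 1. Link stored numbers to labels ("0_none", ...) as a factor ordered by number.
#    `labels` is a named character vector whose names are the numeric codes.
link_label <- function(values, labels) {
  stopifnot(all(is.na(values) | as.character(values) %in% names(labels)))
  codes <- as.numeric(names(labels))
  ord <- order(codes)
  linked_levels <- paste0(codes[ord], "_", labels[ord])
  linked <- paste0(values, "_", labels[as.character(values)])
  factor(linked, levels = linked_levels)  # missing values become NA
}

# 2. Recover the original integer code from a linked label.
linked_number <- function(x) {
  as.integer(sub("_.*$", "", as.character(x)))
}

# 3. Recover the label in title case, as a factor ordered by the original number.
pretty_label <- function(x) {
  x <- as.factor(x)
  lv <- levels(x)
  lv <- lv[order(linked_number(lv))]
  pretty <- gsub("_", " ", sub("^[^_]*_", "", lv))
  pretty <- gsub("\\b([a-z])", "\\U\\1", pretty, perl = TRUE)
  factor(x, levels = lv, labels = pretty)
}

# Made-up example: a 0-3 severity score with one missing value
severity <- link_label(
  c(2, 0, 3, 1, NA),
  c("0" = "none", "1" = "mild", "2" = "moderate", "3" = "severe")
)
severity_codes <- linked_number(severity)
severity_pretty <- pretty_label(severity)
```

Keeping the code inside the level name means the number and its label travel together through joins and exports, which is the point of linking them in the first place.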