REDCapTidieR
[BUG] Misleading Mixed Data Structure Outputs
Expected Behavior
When joining mixed data structures with read_redcap(..., allow_mixed_structure = TRUE), we expect tables to be made with meaningful, unique primary keys.
Current Behavior
During development of #199, we found that tables containing both data that "repeat together" (i.e. as events, RT) and data that "repeat separate" (i.e. as instruments, RS) don't provide primary keys that distinguish RS rows during joins.
How to Reproduce the Bug:
Using the REDCap for a single record as set up below:
Currently read_redcap() gives us the following output:
> sprtbl$redcap_data
[[1]]
# A tibble: 4 × 5
record_id redcap_event redcap_form_instance mixed_structure_1 form_status_complete
<dbl> <chr> <dbl> <chr> <fct>
1 1 repeating_together 1 RT1 Complete
2 1 repeating_together 2 RT2 Complete
3 1 repeating_separate 1 RS1 Complete
4 1 repeating_separate 2 RS2 Complete
[[2]]
# A tibble: 3 × 5
record_id redcap_event redcap_form_instance mixed_structure_2 form_status_complete
<dbl> <chr> <dbl> <chr> <fct>
1 1 repeating_together 1 RT1 Complete
2 1 repeating_together 2 RT2 Complete
3 1 repeating_separate 1 RS1 Complete
Using join_data_tibbles() with a "full join" we get this:
join_data_tibbles(sprtbl, x = "mixed_structure_1", y = "mixed_structure_2", type = "full")
# A tibble: 4 × 8
record_id redcap_event redcap_form_instance mixed_structure_1 redcap_event_instance mixed_structure_2 form_status_complete.x form_status_complete.y
<dbl> <chr> <dbl> <chr> <lgl> <chr> <fct> <fct>
1 1 repeating_together 1 RT1 NA RT1 Complete Complete
2 1 repeating_together 2 RT2 NA RT2 Complete Complete
3 1 repeating_separate 1 RS1 NA RS1 Complete Complete
4 1 repeating_separate 2 RS2 NA NA Complete NA
The issue here is that in row 3, the data for mixed_structure_1 and mixed_structure_2 should exist on separate rows because they are RS instances. As read_redcap() is currently set up, it is impossible to separate these because the primary keys for both are identical (record_id, redcap_event, redcap_form_instance). This is a byproduct of how we decided to mix the meaning of redcap_form_instance between repeating events and repeating instruments.
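The collision can be reproduced outside the package with a minimal sketch. It assumes the join is keyed only on the three shared columns; the hand-built tibbles `ms1` and `ms2` are stand-ins for the two data tibbles above, and plain `dplyr::full_join()` is used in place of join_data_tibbles(), whose internals are not shown here.

```r
library(dplyr)
library(tibble)

# Hand-built stand-ins for the two data tibbles shown above
ms1 <- tribble(
  ~record_id, ~redcap_event,        ~redcap_form_instance, ~mixed_structure_1,
  1,          "repeating_together", 1,                     "RT1",
  1,          "repeating_together", 2,                     "RT2",
  1,          "repeating_separate", 1,                     "RS1",
  1,          "repeating_separate", 2,                     "RS2"
)
ms2 <- tribble(
  ~record_id, ~redcap_event,        ~redcap_form_instance, ~mixed_structure_2,
  1,          "repeating_together", 1,                     "RT1",
  1,          "repeating_together", 2,                     "RT2",
  1,          "repeating_separate", 1,                     "RS1"
)

# Assumption: these three columns are the only join keys
keys <- c("record_id", "redcap_event", "redcap_form_instance")
joined <- full_join(ms1, ms2, by = keys)

# RS instance 1 of each form shares the same key values,
# so the two unrelated instances land on a single row
collapsed <- filter(
  joined,
  redcap_event == "repeating_separate",
  redcap_form_instance == 1
)
print(collapsed)
```

Nothing in the key columns records which form an RS instance came from, so no choice of join type can keep the two instances apart.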
Solution Proposal
To fix this we will need to do the following:
- Add the repeating event type to the redcap_events column of the supertibble. This will need to identify whether an event is RS or RT so that other functions like join_data_tibbles() can reference it.
- Revise how redcap_form_instance is used in these examples for RS rows. This may involve revising how redcap_form_instance/redcap_event_instance is defined, but it boils down to needing an additional primary key to help identify separate repeating instances.

@ezraporter after making an example, I think redcap_form_instance/redcap_event_instance might still not address this fully, since these share the same event information. I think instead we need to identify RS data and then give it an additional column with the name of the form it came from. This might be able to get shifted back to the join function's responsibilities. A mock-up of the resulting join:
> tibble::tribble(
+ ~"record_id", ~"redcap_event", ~"redcap_form_instance", ~"redcap_event_instance", ~"extra_rs_key", ~"mixed_structure_1", ~"mixed_structure_2", ~"form_status_complete.x", ~"form_status_complete.y",
+ 1, "repeat_together", 1, NA, NA, "RT1", "RT1", "Complete", "Complete",
+ 1, "repeat_together", 2, NA, NA, "RT2", "RT2", "Complete", "Complete",
+ 1, "repeat_separate", 1, NA, "mixed_structure_1", "RS1", NA, "Complete", NA,
+ 1, "repeat_separate", 1, NA, "mixed_structure_2", NA, "RS1", NA, "Complete",
+ 1, "repeat_separate", 2, NA, "mixed_structure_1", "RS2", NA, "Complete", NA
+ )
# A tibble: 5 × 9
record_id redcap_event redcap_form_instance redcap_event_instance extra_rs_key mixed_structure_1 mixed_structure_2 form_status_complete.x form_status_complete.y
<dbl> <chr> <dbl> <lgl> <chr> <chr> <chr> <chr> <chr>
1 1 repeat_together 1 NA NA RT1 RT1 Complete Complete
2 1 repeat_together 2 NA NA RT2 RT2 Complete Complete
3 1 repeat_separate 1 NA mixed_structure_1 RS1 NA Complete NA
4 1 repeat_separate 1 NA mixed_structure_2 NA RS1 NA Complete
5 1 repeat_separate 2 NA mixed_structure_1 RS2 NA Complete NA
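One way the extra key could be produced is sketched below. This is an illustration, not the package's implementation: `add_rs_key()` and the `rs_events` vector are hypothetical names, and the sketch assumes the set of RS events is already known (for example, from the event type proposed for the supertibble). It relies on dplyr's default `na_matches = "na"`, under which `NA` key values match each other, so RT rows still join normally.

```r
library(dplyr)
library(tibble)

# Hand-built stand-ins for the two data tibbles from the reproduction
ms1 <- tribble(
  ~record_id, ~redcap_event,        ~redcap_form_instance, ~mixed_structure_1,
  1,          "repeating_together", 1,                     "RT1",
  1,          "repeating_together", 2,                     "RT2",
  1,          "repeating_separate", 1,                     "RS1",
  1,          "repeating_separate", 2,                     "RS2"
)
ms2 <- tribble(
  ~record_id, ~redcap_event,        ~redcap_form_instance, ~mixed_structure_2,
  1,          "repeating_together", 1,                     "RT1",
  1,          "repeating_together", 2,                     "RT2",
  1,          "repeating_separate", 1,                     "RS1"
)

# Hypothetical: events known to repeat separately (as instruments)
rs_events <- "repeating_separate"

# Hypothetical helper: tag RS rows with the form they came from,
# leaving the key as NA for RT rows
add_rs_key <- function(data, form_name, rs_events) {
  mutate(
    data,
    extra_rs_key = if_else(redcap_event %in% rs_events, form_name, NA_character_)
  )
}

keys <- c("record_id", "redcap_event", "redcap_form_instance", "extra_rs_key")

# NA keys match each other by default, so RT rows pair up as before;
# RS rows carry different form names and therefore stay on separate rows
joined <- full_join(
  add_rs_key(ms1, "mixed_structure_1", rs_events),
  add_rs_key(ms2, "mixed_structure_2", rs_events),
  by = keys
)
print(joined)
```

Tagging rows inside the join step, rather than in read_redcap(), would leave the existing data tibbles unchanged, which is in line with shifting this responsibility back to the join function.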
Checklist
Before submitting this issue, please verify that the submission meets the criteria below:
- [x] The issue is atomic
- [x] The issue description is documented
- [x] The issue title describes the problem succinctly
- [x] Developers are assigned to the issue
- [x] Labels are assigned to the issue