Allow updating a previous PKNCA analysis with changed data
The scenario I'm working with is this: I want to run an analysis with a full dataset. Then I want to be able to:
- Alter the input data of one or more subjects
- Give PKNCA the old analysis and the updated dataset
- Have PKNCA rerun just those that have changed
- Give me back a new result with the old results merged with the changed ones.
It would be nice if PKNCA could just detect the changes. Alternatively, I could supply the IDs that I have changed.
I have a few ideas on how this could be done, and the question quickly becomes how deep we want to go down this rabbit hole.
What would make the most sense to me would be something that works like update():
update(existing_PKNCAresults, newdata=new_PKNCAdata)
If things other than data for dosing or concentration changed, it would rerun everything. (I don't want to try to track things like changing units or column names.)
Then, it could redo grouping concentrations, times, and intervals and check those for changes between the old and new data. Where data (conc, dose, or intervals) changed, update the results. Where data are unchanged, keep the existing results.
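The per-group comparison described above can be sketched in base R. This is only an illustration of the idea and not how PKNCA implements update(); the helper name changed_groups() and the toy columns ID, time, and conc are hypothetical.

```r
# Hypothetical helper: return the IDs whose rows differ between old and new data.
changed_groups <- function(old, new, id_col = "ID") {
  # Split each dataset into one data frame per subject
  old_split <- split(old, old[[id_col]])
  new_split <- split(new, new[[id_col]])
  all_ids <- union(names(old_split), names(new_split))
  is_changed <- vapply(all_ids, function(id) {
    old_i <- old_split[[id]]
    new_i <- new_split[[id]]
    # A subject that was added or removed counts as changed
    if (is.null(old_i) || is.null(new_i)) return(TRUE)
    # Drop row names so that only the values are compared
    rownames(old_i) <- NULL
    rownames(new_i) <- NULL
    !identical(old_i, new_i)
  }, logical(1))
  all_ids[is_changed]
}

# Toy data (made up for illustration): one concentration changes for subject 2
old_conc <- data.frame(
  ID   = rep(1:3, each = 3),
  time = rep(0:2, times = 3),
  conc = c(0, 5, 2, 0, 6, 3, 0, 4, 1)
)
new_conc <- old_conc
new_conc$conc[5] <- 7

changed_groups(old_conc, new_conc)
```

Only the subjects returned by such a check would need to be recalculated; results for everyone else could be carried over unchanged.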
Does that sound like what you're thinking of?
Yeah, that sounds correct. In the scenario I'm thinking of, the analysis details (units, intervals, parameters requested, etc.) do not change. The only thing that changes is the observation data. Specifically, I'm changing the data flags (censoring, half-life inclusion, half-life exclusion), and I want to see how those change things like r-squared and half-life. I can pretty easily supply user IDs if that makes it easier, so you would just redo everything for those specific users.
When I create functions to process the PKNCA results I typically just pass the output of pk.nca(). For example I'll do something like the following:
nca_res = PKNCA::pk.nca(nca_data)
I will normally access the original data like this:
ana_ds = nca_res[["data"]][["conc"]][["data"]]
Is that bad? I don't know :). However, when I run update():
nca_res_ud = update(nca_res, data=new_data)
The "conc" field seems to be gone in nca_res_ud.
Is that the intended behavior?
That is not intended behavior. I'll take a look.
And, the preferred way to do your nested finding of the concentration data would be
as.data.frame(as_PKNCAconc(nca_res))
Somewhat related: I'm doing the following to pull out column information. Is there a function for this as well? Sorry.
Extracting the needed column names:
col_id      = nca_res[["data"]][["conc"]][["columns"]][["subject"]]
col_time    = nca_res[["data"]][["conc"]][["columns"]][["time"]]
col_conc    = nca_res[["data"]][["conc"]][["columns"]][["concentration"]]
col_analyte = nca_res[["data"]][["conc"]][["columns"]][["groups"]][["group_analyte"]]
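To avoid repeating the long chain, the lookup could be wrapped in a small function. This is a sketch only: get_conc_col() is a hypothetical helper, not part of PKNCA, and it relies on the internal layout shown above, which may change between versions. The mock object below just mimics that nesting, and its "ANALYTE" entry is made up for illustration.

```r
# Hypothetical helper: walk the column map stored in the results object.
get_conc_col <- function(res, ...) {
  cols <- res[["data"]][["conc"]][["columns"]]
  for (key in c(...)) {
    cols <- cols[[key]]
    # Fail early with a readable message if the mapping is absent
    if (is.null(cols)) stop("No column mapping found for: ", key)
  }
  cols
}

# Mock object with the same nesting as above (not a real PKNCAresults object)
mock_res <- list(data = list(conc = list(columns = list(
  subject       = "ID",
  time          = "TIME_DY",
  concentration = "DV",
  groups        = list(group_analyte = "ANALYTE")
))))

col_id      <- get_conc_col(mock_res, "subject")
col_analyte <- get_conc_col(mock_res, "groups", "group_analyte")
```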
I'm playing around with this. Here is a reproducible example. First you'll have to install the development versions of these packages:
devtools::install_github("john-harrold/onbrand", dependencies=TRUE)
devtools::install_github("john-harrold/formods", dependencies=TRUE)
devtools::install_github("john-harrold/ruminate", dependencies=TRUE)
And this shows how I'm using it:
library("dplyr")
library("janitor")
library("readxl")
library("rio")
library("tidyr")
library("PKNCA")
library("ruminate")
# Metadata about NCA parameters
NCA_nps = NCA_fetch_np_meta()[["summary"]]
xls_file = system.file(package="formods", "test_data", "TEST_DATA.xlsx")
# Loading source data
myDS <- rio::import(file=xls_file, which="DATA")
myDS <- janitor::clean_names(myDS, case="none")
# Data wrangling
# PK 3mg SD IV (NCA)
DW_myDS_3<-myDS
DW_myDS_3 <- dplyr::filter(DW_myDS_3,EVID == 0)
DW_myDS_3 <- dplyr::filter(DW_myDS_3,Cohort %in% c("SD 3 mg IV"))
DW_myDS_3 <- dplyr::filter(DW_myDS_3,CMT %in% c("C_ng_ml"))
# PK Example ==============================================================
# Setting the NCA options
PKNCA::PKNCA.options(
adj.r.squared.factor = 1e-04,
max.missing = 0.5,
auc.method = "lin up/log down",
conc.na = "drop",
first.tmax = TRUE,
allow.tmax.in.half.life = FALSE,
min.hl.points = 3,
min.span.ratio = 2,
min.hl.r.squared = 0.9,
max.aucinf.pext = 20,
conc.blq = list(
first = "keep",
middle = "drop",
last = "keep"
)
)
# Creating a copy of the source dataset to use below
NCA_1_DS = DW_myDS_3
# Creating column mapping
NCA_1_col_map = list(
col_id = c("ID"),
col_dose = c("DOSE"),
col_conc = c("DV"),
col_dur = NULL,
col_analyte = NULL,
col_route = c("ROUTE"),
col_time = c("TIME_DY"),
col_ntime = c("NTIME_DY"),
col_group = NULL,
col_evid = NULL,
col_cycle = c("DOSE_NUM")
)
# This flags specific rows in the source dataset based on the row number. See
# flag_map for possible flags (those that include manual="yes"). To remove a
# flag just delete the row.
NCA_1_ds_flags = tribble(
~key, ~flag, ~note,
# "rec_17", "hlex", "" ,
# "rec_18", "hlex", "" ,
# "rec_49", "hlex", "" ,
"rec_77", "hlin", "" ,
"rec_78", "hlin", "" ,
"rec_79", "hlin", "" ,
"rec_37", "hlin", "" ,
"rec_38", "hlin", "" ,
"rec_39", "hlin", "" ,
"rec_56", "censor", "" ,
"rec_26", "censor", ""
)
# This contains flags used in functions below.
# - manual indicates that it can be assigned with manual point selection.
# This SHOULD NOT BE CHANGED
# - flag internal flag used (SHOULD NOT BE CHANGED).
# - description short description used in figure legends
# - sn short name used in tables
# - notes notes that go at the bottom of tables
NCA_1_flag_map = list(
reset = list(
manual = "yes",
color = "black",
sn = "NF",
description = "Clear Flagging",
notes = "reset point/selection to dataset default"
),
obs = list(
manual = "no",
color = "black",
sn = "OBS",
description = "Observation",
notes = "normal observation"
),
blq = list(
manual = "no",
color = "#56B4E9",
sn = "BLQ",
description = "BLQ",
notes = "below the level of quantification"
),
hlex = list(
manual = "yes",
color = "#F0E441",
sn = "HX",
description = "Exclude Half-life",
notes = "exclude from half-life calculation"
),
hlin = list(
manual = "yes",
color = "#029E73",
sn = "HI",
description = "Specify Half-life",
notes = "specified for use in half-life calculation"
),
censor = list(
manual = "yes",
color = "#D65E00",
sn = "C",
description = "Exclude",
notes = "excluded from NCA analysis"
),
ns = list(
manual = "no",
color = "black",
sn = "NS",
description = "Not Sampled",
notes = "not calculated"
),
nc = list(
manual = "no",
color = "black",
sn = "NC",
description = "Not Calculated",
notes = "not calculated"
)
)
# Adds unique keys and flagging information to the dataset
NCA_1_DS = flag_nca_ds(
DS = NCA_1_DS,
flag_map = NCA_1_flag_map,
col_map = NCA_1_col_map,
ds_flags = NCA_1_ds_flags)
# Defining patterns to look for different ways routes were specified.
NCA_1_route_map = list(
intravascular = c("^(?i)iv$"),
extravascular = c("^(?i)sc$", "^(?i)oral")
)
# This will apply the route patterns specified above.
NCA_1_DS = apply_route_map(
route_map = NCA_1_route_map,
route_col = "ROUTE",
DS = NCA_1_DS)
# Extracting dosing records from the dataset
NCA_1_dose_rec = dose_records_builder(
NCA_DS = NCA_1_DS,
col_map = NCA_1_col_map,
dose_from = "cols")[["dose_rec"]]
# NCA dosing object
NCA_1_dose = PKNCA::PKNCAdose(NCA_1_dose_rec, DOSE~TIME_DY|ID, route = "ROUTE")
# NCA concentration object
NCA_1_conc = PKNCA::PKNCAconc(
data = NCA_1_DS,
formula = DV~TIME_DY|ID,
time.nominal = "NTIME_DY",
exclude_half.life = "rmnt_hlex",
include_half.life = "rmnt_hlin",
exclude = "rmnt_cens",
sparse = FALSE
)
# NCA units table
NCA_1_units = PKNCA::pknca_units_table(
concu = "ng/mL",
doseu = "mg",
amountu = "mg",
timeu = "day"
)
# Dataframe containing the analysis intervals
NCA_1_intervals =
data.frame(
start = c(0, 0),
end = c(Inf, 21),
aucinf.obs = c(TRUE, FALSE),
cmax = c(FALSE, TRUE),
auclast = c(FALSE, TRUE)
)
# Pulling everything together to create the data object.
NCA_1_data = PKNCA::PKNCAdata( data.conc = NCA_1_conc,
data.dose = NCA_1_dose,
intervals = NCA_1_intervals,
units = NCA_1_units)
# Running the NCA
NCA_1_res = PKNCA::pk.nca(NCA_1_data)
# Adding some flags and removing others:
NCA_1_ds_flags = tribble(
~key, ~flag, ~note,
"rec_17", "hlex", "" ,
"rec_18", "hlex", "" ,
"rec_49", "hlex", "" ,
"rec_77", "hlin", "" ,
"rec_78", "hlin", "" ,
"rec_79", "hlin", "" ,
"rec_37", "hlin", "" ,
"rec_38", "hlin", "" ,
# "rec_39", "hlin", "" ,
# "rec_56", "censor", "" ,
"rec_26", "censor", ""
)
# Adds unique keys and flagging information to the dataset
NCA_1_DS = flag_nca_ds(
DS = NCA_1_DS,
flag_map = NCA_1_flag_map,
col_map = NCA_1_col_map,
ds_flags = NCA_1_ds_flags)
# Defining patterns to look for different ways routes were specified.
NCA_1_route_map = list(
intravascular = c("^(?i)iv$"),
extravascular = c("^(?i)sc$", "^(?i)oral")
)
# This will apply the route patterns specified above.
NCA_1_DS = apply_route_map(
route_map = NCA_1_route_map,
route_col = "ROUTE",
DS = NCA_1_DS)
# Extracting dosing records from the dataset
NCA_1_dose_rec = dose_records_builder(
NCA_DS = NCA_1_DS,
col_map = NCA_1_col_map,
dose_from = "cols")[["dose_rec"]]
# NCA dosing object
NCA_1_dose = PKNCA::PKNCAdose(NCA_1_dose_rec, DOSE~TIME_DY|ID, route = "ROUTE")
# NCA concentration object
NCA_1_conc = PKNCA::PKNCAconc(
data = NCA_1_DS,
formula = DV~TIME_DY|ID,
time.nominal = "NTIME_DY",
exclude_half.life = "rmnt_hlex",
include_half.life = "rmnt_hlin",
exclude = "rmnt_cens",
sparse = FALSE
)
# NCA units table
NCA_1_units = PKNCA::pknca_units_table(
concu = "ng/mL",
doseu = "mg",
amountu = "mg",
timeu = "day"
)
# Dataframe containing the analysis intervals
NCA_1_intervals =
data.frame(
start = c(0, 0),
end = c(Inf, 21),
aucinf.obs = c(TRUE, FALSE),
cmax = c(FALSE, TRUE),
auclast = c(FALSE, TRUE)
)
# Pulling everything together to create the data object.
NCA_1_data = PKNCA::PKNCAdata( data.conc = NCA_1_conc,
data.dose = NCA_1_dose,
intervals = NCA_1_intervals,
units = NCA_1_units)
# Updating the previous NCA results with the new data object
NCA_1_res_ud = update(NCA_1_res, data=NCA_1_data)
# Extracting the concentration data from the original and updated results
old_res = as.data.frame(as_PKNCAconc(NCA_1_res))
new_res = as.data.frame(as_PKNCAconc(NCA_1_res_ud))
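One way to see what the update changed is to compare the two result tables parameter by parameter. The following is a minimal base-R sketch under stated assumptions: compare_results() is a hypothetical helper, the toy tables stand in for long-format results such as as.data.frame(NCA_1_res) and as.data.frame(NCA_1_res_ud), and their values are made up. With real results, the by columns would probably also need to include the interval start and end.

```r
# Hypothetical helper: list the rows whose parameter value differs between
# two long-format result tables (one row per subject and parameter).
compare_results <- function(old, new, by = c("ID", "PPTESTCD"), value = "PPORRES") {
  both <- merge(old, new, by = by, suffixes = c("_old", "_new"))
  old_val <- both[[paste0(value, "_old")]]
  new_val <- both[[paste0(value, "_new")]]
  # Two NAs count as unchanged; otherwise compare with a small tolerance
  same <- (is.na(old_val) & is.na(new_val)) |
    (!is.na(old_val) & !is.na(new_val) & abs(old_val - new_val) < 1e-8)
  both[!same, , drop = FALSE]
}

# Toy tables (values made up for illustration)
old_tab <- data.frame(
  ID       = c(1, 1, 2, 2),
  PPTESTCD = rep(c("half.life", "r.squared"), times = 2),
  PPORRES  = c(3.1, 0.98, 2.7, 0.95)
)
new_tab <- old_tab
new_tab$PPORRES[3:4] <- c(2.9, 0.97)

res_diff <- compare_results(old_tab, new_tab)
res_diff
```

Only rows for subjects whose flags changed should show up in the difference table, which makes it a quick check that the update touched just the intended subjects.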
I fixed the issue you found and added documentation of it to a vignette here: https://humanpred.github.io/pknca/articles/v01-introduction-and-usage.html#updating-existing-results