Clarification on Creating plpData for Individual Patient Risk Calculation

Open • iamalonso opened this issue 1 year ago • 1 comment

Is your feature request related to a problem? Please describe
Yes, there is a problem I'm facing. When utilizing the PatientLevelPrediction package, I'm encountering difficulties in creating the plpData for a single patient due to the requirement of cohort creation. This complicates the scenario for individual patient risk calculation.

Describe the solution you'd like
I would like to have a clearer understanding of how to create the plpData for a single patient without the need for cohort creation. It would be helpful to have guidance on the specific feature vector that needs to be provided as input for the predictPlp() method.

Describe alternatives you've considered
I have explored different approaches to creating the plpData, but I have not found a straightforward solution that avoids the complexity of cohort creation. I'm open to alternative suggestions or methods that can simplify the process for calculating risk for a single patient.

Additional context
I have reviewed the plpData generated in my use case, but I'm still uncertain about the required feature vector for the predictPlp() method. Any additional guidance or clarifications regarding the creation of plpData for individual patient risk calculation would be greatly appreciated.

iamalonso • Jun 21 '23 02:06

Hi @iamalonso,

Sorry for the late response. What kind of features do you have?

Here is a short snippet in which I develop a model and then use it to predict for one individual. It's kind of hacky, and the complicated part is getting valid covariateIds for the features you have.

library(PatientLevelPrediction)

data("plpDataSimulationProfile")
plpData <- simulatePlpData(plpDataSimulationProfile, n=3000)

# develop my model
plpResults <- runPlp(plpData = plpData,
                     outcomeId = 2,
                     populationSettings = createStudyPopulationSettings(),
                     modelSettings = setLassoLogisticRegression(seed=42),
                     executeSettings = createDefaultExecuteSettings())

# create a new individual (long format: one row per covariate the patient has)
covariateIds <- c(38003563, 36211136251, 4279444751) # covariateIds of the patient's features
rowIds <- rep(1, length(covariateIds))               # every row belongs to the same patient
covariateValue <- rep(1, length(covariateIds))       # all binary

newPlpData <- list()
newPlpData$covariateData <- Andromeda::andromeda(
  covariates = data.frame(rowId = rowIds,
                          covariateId = covariateIds,
                          covariateValue = covariateValue),
  covariateRef = plpData$covariateData$covariateRef, # copy from development data
  analysisRef = plpData$covariateData$analysisRef    # copy from development data
)

# mark the Andromeda object as covariate data and attach minimal metadata
class(newPlpData$covariateData) <- "CovariateData"
attr(newPlpData$covariateData, "metaData") <- list(populationSize = 1, cohortId = 1)

predictPlp(plpResults$model, plpData = newPlpData, population = data.frame(rowId=1, targetId=1, ageYear=38, gender=8532))
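
On the covariateId point: one way to avoid typing ids by hand is to look them up by name in the covariate reference table. The following is only a minimal base-R sketch of that idea. The `covariateRef` data frame here is a toy stand-in with illustrative rows (in practice it would presumably be read from `plpData$covariateData$covariateRef`), and `buildCovariates()` and `patientFeatures` are hypothetical names, not part of PatientLevelPrediction.

```r
# Toy stand-in for the covariate reference table (illustrative rows only).
# Assumption: the real table has covariateId and covariateName columns.
covariateRef <- data.frame(
  covariateId = c(8532001, 1002),
  covariateName = c("gender = FEMALE", "age in years"),
  stringsAsFactors = FALSE
)

# Features of one patient, keyed by covariate name (hypothetical input format).
patientFeatures <- c("gender = FEMALE" = 1, "age in years" = 38)

# Hypothetical helper: map feature names to covariateIds and return the
# long-format table (rowId, covariateId, covariateValue) used above.
buildCovariates <- function(features, covariateRef, rowId = 1) {
  idx <- match(names(features), covariateRef$covariateName)
  if (anyNA(idx)) {
    # fail loudly instead of silently dropping unknown features
    stop("No covariateId found for: ",
         paste(names(features)[is.na(idx)], collapse = ", "))
  }
  data.frame(
    rowId = rep(rowId, length(features)),
    covariateId = covariateRef$covariateId[idx],
    covariateValue = unname(features)
  )
}

covariates <- buildCovariates(patientFeatures, covariateRef)
print(covariates)
```

The resulting data frame has the same three columns as the `covariates` table passed to `Andromeda::andromeda()` above, so it could be substituted there. Stopping on an unknown name is a deliberate choice: a feature the model was never trained on would otherwise be ignored without warning.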

Does this help?

egillax • Aug 28 '23 12:08