cn_apply() Error in predict.randomForest

Open rebekabato opened this issue 3 years ago • 4 comments

Hi Patrick,

I am trying to apply my trained cnProc to my query data with the following code: cnRes <- cn_apply(expList, stQuery, cnProc)

I get the following error message and cannot figure out what it relates to:

"Error in predict.randomForest(classList[[ctt]], t(expDat[xgenes, ]), type = "prob") : variables in the training data missing in newdata"

I have been using this function successfully on other query data, but now I get this error. I believe this should be about my processor. However, when I trained the processor, everything went well.

Do you have any suggestions on how I could avoid this error and what exactly this is about?

I appreciate your help. Rebeka

rebekabato avatar Dec 15 '20 16:12 rebekabato

I finally figured it out: I got this error because I passed the wrong expList data to cn_apply(), that is, not the one I had used to train the processor (I was testing 2 different training datasets). I hope this helps others who encounter the same error.
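For anyone hitting the same message, here is a minimal sketch of a check that may help narrow it down. predict.randomForest raises this error when the new data lacks some of the variables (here, genes) the forest was trained on, so the sketch lists the training genes that are absent from the query matrix. The helper name missing_training_genes and the toy data are hypothetical, and the commented part assumes that cnProc$classList holds the randomForest classifiers, which I have not verified against the CellNet source.

```r
# Hypothetical helper (not part of CellNet): list the genes a classifier was
# trained on that are absent from the rows of the query expression matrix.
missing_training_genes <- function(train_genes, expQuery) {
  setdiff(train_genes, rownames(expQuery))
}

# Toy data standing in for a real query matrix (genes in rows, samples in columns).
expQuery <- matrix(
  c(5.1, 0.0, 2.3, 4.8, 0.2, 2.9),
  nrow = 3,
  dimnames = list(c("GATA1", "SOX2", "ALB"), c("sample1", "sample2"))
)

# Toy stand-in for the predictor genes of one trained classifier.
train_genes <- c("GATA1", "SOX2", "POU5F1")

# Returns "POU5F1": the one training gene missing from the query matrix.
missing_training_genes(train_genes, expQuery)

# With a real processor, ASSUMING cnProc$classList is a list of randomForest
# objects, the predictor names can be read from the importance matrix:
# lapply(cnProc$classList, function(rf)
#   missing_training_genes(rownames(rf$importance), expQuery))
```

Any classifier that returns a non-empty vector here would be expected to trigger the error, which usually points to a mismatch between the data the processor was trained on and the data passed to cn_apply().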

rebekabato avatar Dec 16 '20 10:12 rebekabato

If you could elaborate on how you sorted this out, that would be very helpful. I'm trying to work from the example human data, and having no luck figuring out how to get past this error.

bluedominion avatar Jul 16 '21 18:07 bluedominion

We have now created a web application that takes as input an expression matrix (counts, TPM, or FPKM) and sample meta-data, and performs CellNet analysis. Additionally, this tool includes analysis of many state-of-the-art differentiation protocols, so that you can benchmark your results against those commonly used methods:

https://cahanlab.org/resources/agnosticCellNet_web/

pcahan1 avatar Nov 18 '21 18:11 pcahan1

We ran into the same problem. When we reconstruct the CellNet object using the expTrain and the stTrain from the downloaded cnProc object (cnProc_HS_RS_Jun_20_2017.rda), we get a new cnProc object that fails when passed to cn_apply, giving the error:

"Error in predict.randomForest(classList[[ctt]], t(expDat[xgenes, ]), type = "prob") : variables in the training data missing in newdata"

When we run cn_apply with the original, downloaded cnProc object, this error does not occur. @rebekabato could you elaborate on what you mean by 'the reason why I got this error was that I used the wrong expList'? Was the expQuery data in the wrong format / datatype? That does not seem to be the problem for me: when I use the same expQuery data with the downloaded cnProc_HS_RS_Jun_20_2017.rda object, the error does not occur; it only occurs when I run it with my own cnProc object.

ImreSchene avatar May 19 '22 13:05 ImreSchene