ComBat returns NaN

Open vkorobeynyk opened this issue 5 years ago • 6 comments

Hello,

I have been trying to use ComBat to correct for a batch effect, but I have run into a problem that I can't understand. I have 635 samples and 15000 genes. Those 635 samples are 2 batches pooled together, and I want to correct samples 328:635 (batch 2) against the reference, samples 1:327 (batch 1). When I don't set ref.batch = 1 I get the adjusted matrix, but when I set this option I always get NaN in columns 328:635.

Does anyone know why this is happening? I also checked whether any cells or genes have var or mean = 0 and removed them (even though it would be good to keep them), but I still have the same problem.

This is my command: combat_edata = ComBat(dat=data, batch=batchid, par.prior=TRUE, prior.plots=F, mean.only = T, ref.batch = 1)

Another question: my data is just log-normalized from the raw matrix. When I run ComBat without ref.batch = 1 I get some negative values. Why is that? Should I just add +1 to the whole matrix and then subtract it afterwards?

Thanks

vkorobeynyk avatar Jun 05 '19 16:06 vkorobeynyk

Hi @Vladyslav3,

Would you try this updated version (https://github.com/zhangyuqing/sva-devel) and see if you still get NaN in the adjusted data? A possible explanation is that some genes in your data have zero variance within a certain batch. Often these are genes that have only zero values across the samples of a batch. Genes with only zeros can be caused by biological factors (the gene is simply not expressed in this batch) or technical factors (e.g. some experimental conditions that are not present in the other batches), and ComBat is not able to distinguish the source. So in the updated version, we leave these genes unchanged.
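To check for such genes before running ComBat, a minimal sketch in base R follows. It assumes dat is a genes x samples numeric matrix and batch is a vector with one entry per column; the helper name zero_var_in_any_batch and the toy values are made up for illustration and are not part of sva.

```r
# Toy data: 4 genes x 6 samples, two batches of 3 samples each (made-up values).
dat <- rbind(
  geneA = c(1.2, 0.8, 1.5, 2.1, 2.4, 1.9),
  geneB = c(0.0, 0.0, 0.0, 1.1, 0.7, 1.3),  # all zeros in batch 1
  geneC = c(3.0, 2.5, 2.8, 3.1, 3.3, 2.9),
  geneD = c(0.5, 0.9, 0.4, 2.0, 2.0, 2.0)   # constant in batch 2
)
batch <- c(1, 1, 1, 2, 2, 2)

# Hypothetical helper (not an sva function): returns TRUE for every gene
# whose variance is exactly zero in at least one batch.
zero_var_in_any_batch <- function(dat, batch) {
  batch <- as.factor(batch)
  per_batch <- lapply(levels(batch), function(b) {
    v <- apply(dat[, batch == b, drop = FALSE], 1, var)
    # A batch with a single sample gives NA variance; it is not flagged here.
    !is.na(v) & v == 0
  })
  # Bind into a genes x batches logical matrix, then flag any TRUE per gene.
  rowSums(do.call(cbind, per_batch)) > 0
}

flagged <- zero_var_in_any_batch(dat, batch)
print(rownames(dat)[flagged])

# One option is to drop the flagged genes before batch correction.
dat_filtered <- dat[!flagged, , drop = FALSE]
```

Checking variance across all samples is not enough here: geneB and geneD both vary across the full matrix but are constant inside one batch.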

For your other question, are you working with log-transformed read counts from sequencing studies, and does the updated version resolve the negative values? It may be helpful to check what the original data look like at the positions of these negative values. If the problem persists, would it be possible for you to send me an example dataset ([email protected]) for diagnostics?

Thanks, Yuqing

zhangyuqing avatar Jun 05 '19 17:06 zhangyuqing

You were right. I removed genes which had var = 0 across all batches but didn't remove the ones which had zero variance within an individual batch, because I thought they would not affect the batch correction. I also checked, and the negative values were all 0 in the original matrix, so I just set all negative values to 0.

Thanks a lot for the help.

vkorobeynyk avatar Jun 06 '19 18:06 vkorobeynyk

Hi @zhangyuqing,

I had the same issue as @Vladyslav3 and tried using your updated version but am getting the following error:

Error in solve.default(crossprod(design), tcrossprod(t(design), as.matrix(dat))) : no right-hand side in 'b'

My command is a bit different though and is as follows:

test <- ComBat(dat=as.matrix(lfq), batch = batch, mod=NULL, par.prior = F)

Thanks for the help!

caitsimop avatar Jul 05 '19 15:07 caitsimop

Hi @caitsimop ,

I have not seen this kind of error before. Would you mind sending me an example dataset (to [email protected]) so that I could run some tests?

If it is not convenient to share the data, would you tell me a bit more information? For example, is it a gene expression dataset? What are the values in the data matrix? What is the dimension of the data (i.e. number of samples and genes)? What are the output messages from ComBat in addition to the error? Is there any gene that has zero variance within any batch (if yes, how many)? Try to share as much about the data as you could, and I'll see what I can do.

Yuqing

zhangyuqing avatar Jul 05 '19 15:07 zhangyuqing

Thanks, @zhangyuqing.

Actually, after reading your comment I figured out the problem was my data... it wasn't being converted into a matrix properly because it was stored as integer64, and that was causing the errors!
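For anyone hitting the same thing, here is a rough sketch of converting a data frame to a plain double matrix column by column before calling ComBat. The helper name to_double_matrix and the toy lfq_df values are hypothetical. The integer64 branch assumes the bit64 package is installed and that loading it lets as.numeric() convert integer64 columns; that branch is not exercised by the toy data below.

```r
# Hypothetical helper (not part of sva): build a double matrix from a data
# frame, converting each column explicitly instead of relying on as.matrix().
to_double_matrix <- function(df) {
  cols <- lapply(df, function(col) {
    if (inherits(col, "integer64")) {
      # Assumption: bit64 provides the conversion method once it is loaded.
      if (!requireNamespace("bit64", quietly = TRUE)) {
        stop("integer64 column found; install bit64 to convert it")
      }
      return(as.numeric(col))
    }
    if (!is.numeric(col)) {
      stop("non-numeric column found; remove or convert it first")
    }
    as.numeric(col)
  })
  m <- do.call(cbind, cols)      # column names come from the list names
  rownames(m) <- rownames(df)
  storage.mode(m) <- "double"
  m
}

# Toy data with ordinary integer columns (made-up values).
lfq_df <- data.frame(
  s1 = c(10L, 20L, 30L),
  s2 = c(11L, 19L, 33L),
  row.names = c("p1", "p2", "p3")
)
lfq <- to_double_matrix(lfq_df)
str(lfq)
```

A quick sanity check after the conversion is to confirm that storage.mode(lfq) is "double" and that the values still look like the original ones.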

Thanks for the quick reply!!

caitsimop avatar Jul 05 '19 18:07 caitsimop

Hi @zhangyuqing,

I had the same issue as @Vladyslav3 and then tried using your updated version, but am getting the following error:

Found 2 batches
Using null model in ComBat-seq.
Error in cbind(batchmod, mod) : number of rows of matrices must match (see arg 2)
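The cause was not confirmed in this thread. One possible explanation, stated here as an assumption, is that the batch vector does not line up with the columns of the count matrix (wrong length, or NA entries). A small base R sketch of that check follows; check_batch_matches and the toy inputs are hypothetical and not part of sva.

```r
# Hypothetical pre-flight check (not an sva function): returns a character
# vector describing mismatches between a count matrix and a batch vector.
check_batch_matches <- function(counts, batch) {
  problems <- character(0)
  if (length(batch) != ncol(counts)) {
    problems <- c(problems, sprintf(
      "batch has %d entries but counts has %d columns (samples)",
      length(batch), ncol(counts)
    ))
  }
  if (anyNA(batch)) {
    problems <- c(problems, "batch contains NA values")
  }
  problems
}

# Toy inputs: 5 genes x 6 samples (made-up values).
counts <- matrix(1:30, nrow = 5)
batch_ok  <- c(1, 1, 1, 2, 2, 2)
batch_bad <- c(1, 1, 1, 2, 2)      # one sample short

print(check_batch_matches(counts, batch_ok))
print(check_batch_matches(counts, batch_bad))
```

If the check reports nothing, it is also worth confirming that samples are in columns and genes in rows, since a transposed matrix would produce a similar mismatch.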

hqlkjx avatar Aug 27 '20 10:08 hqlkjx