scanpy

multiBatchNorm / between-library batch normalization

Open Marwansha opened this issue 1 year ago • 2 comments

What kind of feature would you like to request?

Additional function parameters / changed functionality / changed defaults?

Please describe your wishes

Hi,

Is there an equivalent function to multiBatchNorm in Python, or another method that can perform per-batch normalization?

My goal is to compute pseudobulk profiles per individual. Each individual's sample has replicates that are processed across different libraries.

a- Simply summing the raw counts across replicates would likely introduce bias due to library-specific batch effects.

b- Taking the mean of normalized counts across replicates (scranPY normalized counts) doesn’t account for differences in size factors across the libraries, making normalization inconsistent between batches.

Important note: replicates are distributed across different libraries.

Individual x might have replicate 1 in library 1 and replicate 2 in library 3, while individual y might have replicate 1 in library 1 but replicate 2 in library 4. That is why directly summing raw counts or averaging normalized counts seems inaccurate.

I’d greatly appreciate any advice.

In R, I’ve previously used multiBatchNorm from the scran package, which normalizes and scales the size factors within each batch to handle such batch effects. However, given the size of my current dataset, using R is not feasible.
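To make the request concrete, here is a minimal NumPy sketch of the kind of rescaling meant here. It only loosely follows the procedure described in the batchelor documentation (center size factors within each batch, compare per-gene averages between batches, downscale everything to the lowest-coverage batch) and is not a port of the R code. The function name `rescale_size_factors`, the `min_mean` filter, and the cells-by-genes dense layout are assumptions made for illustration.

```python
import numpy as np


def rescale_size_factors(counts, size_factors, batches, min_mean=1.0):
    """Rescale per-cell size factors so batches become comparable.

    counts:       dense array, cells x genes (raw counts)
    size_factors: one size factor per cell (e.g. from scran)
    batches:      one batch/library label per cell
    min_mean:     hypothetical filter; genes with a lower average are ignored
    """
    counts = np.asarray(counts, dtype=float)
    centered = np.asarray(size_factors, dtype=float).copy()
    batches = np.asarray(batches)
    labels = np.unique(batches)

    # Step 1: center size factors to mean 1 within each batch, then compute
    # the per-gene average of the normalized counts for that batch.
    averages = []
    for label in labels:
        mask = batches == label
        centered[mask] /= centered[mask].mean()
        averages.append((counts[mask] / centered[mask, None]).mean(axis=0))

    # Step 2: median per-gene ratio of each batch against the first batch.
    ref = averages[0]
    scale = np.ones(len(labels))
    for i in range(1, len(labels)):
        cur = averages[i]
        keep = (ref > 0) & (cur > 0) & ((ref + cur) / 2 >= min_mean)
        scale[i] = np.median(cur[keep] / ref[keep])

    # Step 3: express every batch relative to the lowest-coverage batch,
    # so that deeper batches are scaled down rather than shallow ones up.
    scale /= scale.min()

    rescaled = centered.copy()
    for label, s in zip(labels, scale):
        rescaled[batches == label] *= s
    return rescaled


# Toy example: library "B" was sequenced twice as deeply as library "A".
counts = np.array([[10, 20, 30], [10, 20, 30], [20, 40, 60], [20, 40, 60]])
libraries = np.array(["A", "A", "B", "B"])
new_sf = rescale_size_factors(counts, np.ones(4), libraries)
normalized = counts / new_sf[:, None]
```

Dividing the raw counts by the rescaled size factors puts all libraries on the scale of the shallowest one, after which replicates of the same individual could be averaged or summed. A real dataset would need a sparse-aware version of the same steps.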

Marwansha avatar Oct 23 '24 12:10 Marwansha

We talk a little about batch correction here: https://scanpy.readthedocs.io/en/latest/api/preprocessing.html#batch-effect-correction

@AnnaChristina @Zethson what’s the best practice take on this?

flying-sheep avatar Dec 16 '24 12:12 flying-sheep

Thanks, I will take a look.

I am mainly interested in a multiBatchNorm-like function that rescales the scran size factors across batches:

https://rdrr.io/bioc/batchelor/man/multiBatchNorm.html

I am also wondering whether there is an equivalent in Python to the edgeR CPM normalisation adjusted for library sizes?

https://rdrr.io/bioc/edgeR/man/cpm.html
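For plain CPM without normalization factors, `sc.pp.normalize_total(adata, target_sum=1e6)` in scanpy should already do this. What I have in mind beyond that is roughly the following NumPy sketch, which divides by an effective library size (library size times a normalization factor) and, for the log version, adds a prior count scaled by library size, as I understand the edgeR documentation. The function name `cpm_like`, its parameters, and the samples-by-genes orientation (the transpose of what edgeR expects) are assumptions for illustration, not an existing API.

```python
import numpy as np


def cpm_like(counts, norm_factors=None, log=False, prior_count=2.0):
    """Counts per million with optional normalization factors.

    counts:       dense array, samples x genes (e.g. pseudobulk profiles)
    norm_factors: optional per-sample factors (e.g. TMM), default all 1
    log:          if True, return log2-CPM with a library-size-scaled prior
    """
    counts = np.asarray(counts, dtype=float)
    lib_size = counts.sum(axis=1)
    if norm_factors is not None:
        # Effective library size = raw library size * normalization factor.
        lib_size = lib_size * np.asarray(norm_factors, dtype=float)

    if not log:
        return counts / lib_size[:, None] * 1e6

    # Scale the prior count with library size so that shrinkage of
    # low counts is comparable between deep and shallow samples.
    prior = prior_count * lib_size / lib_size.mean()
    adjusted_size = lib_size + 2.0 * prior
    return np.log2((counts + prior[:, None]) / adjusted_size[:, None] * 1e6)


# Two samples with the same composition but 10x different depth.
pseudobulk = np.array([[1, 3], [10, 30]])
plain = cpm_like(pseudobulk)
logged = cpm_like(pseudobulk, log=True)
```

Both samples end up with identical CPM values here because they differ only in depth. Computing the normalization factors themselves (for example TMM) is not covered by this sketch.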

Thanks a lot for your time and help.

Marwan

Marwansha avatar Dec 16 '24 12:12 Marwansha