Add Relative Log Expression (RLE) Plots
Reference Issue or PRs
Issue Reference: https://github.com/owkin/PyDESeq2/issues/320
What does your PR implement? Be specific.
The Relative Log Expression (RLE) plot is a useful diagnostic for visualizing differences in count distributions between samples. The x-axis lists each sample from the count matrix, and the y-axis shows, for each gene, the log2 ratio between its expression in that sample and its median expression across all samples.
Given:
gene_ij is the count of gene j in sample i
median_j is the median count of gene j across all samples
The RLE for gene j in sample i is calculated as:
RLE_ij = log2(gene_ij / median_j)
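As an illustration of the formula, here is a minimal NumPy sketch. It is not the code in this PR: the function name `relative_log_expression` and the `pseudocount` argument are hypothetical, and the pseudocount is an assumption added only so that zero counts and zero medians do not produce infinities.

```python
import numpy as np


def relative_log_expression(counts, pseudocount=1.0):
    """Return RLE values for a samples x genes count matrix.

    Hypothetical helper for illustration. The pseudocount is an assumption
    that keeps log2 finite when a count or a gene median is zero.
    """
    counts = np.asarray(counts, dtype=float)
    shifted = counts + pseudocount
    # median_j: median of gene j across all samples (rows)
    gene_medians = np.median(shifted, axis=0)
    # RLE_ij = log2(gene_ij / median_j), broadcast over samples
    return np.log2(shifted / gene_medians)


# Toy matrix: 3 samples (rows) x 3 genes (columns)
counts = np.array([[10, 0, 7], [20, 3, 7], [40, 1, 7]])
rle = relative_log_expression(counts)
```

For well-behaved data, each sample's RLE values should be centered near zero with a similar spread; a sample whose values are shifted or much wider is worth a closer look.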
This PR takes the raw counts self.X, a normalize boolean, design_matrix.index for the sample IDs, and a save_path, and produces an RLE plot.
The normalize boolean defaults to False; set it to True to normalize the raw counts before plotting.
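To make the inputs concrete, here is a rough standalone sketch of such a plot with matplotlib. The function name `plot_rle_sketch` is hypothetical and this is not the signature added by the PR. Two assumptions are made: the `normalize=True` branch uses simple library-size scaling as a stand-in for whatever normalization the PR applies, and a pseudocount of 1 is added before taking logs.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def plot_rle_sketch(counts, sample_ids, normalize=False, save_path=None):
    """Draw one RLE box per sample (hypothetical illustration only)."""
    counts = np.asarray(counts, dtype=float)  # samples x genes
    if normalize:
        # Assumption: library-size scaling stands in for the real normalization
        lib_sizes = counts.sum(axis=1, keepdims=True)
        counts = counts / lib_sizes * lib_sizes.mean()
    shifted = counts + 1.0  # assumed pseudocount to avoid log2(0)
    rle = np.log2(shifted / np.median(shifted, axis=0))

    fig, ax = plt.subplots(figsize=(max(4.0, 0.5 * len(sample_ids)), 4.0))
    # boxplot draws one box per column, so transpose to genes x samples
    ax.boxplot(rle.T, showfliers=False)
    ax.axhline(0.0, color="grey", linestyle="--", linewidth=1)
    ax.set_xticks(range(1, len(sample_ids) + 1))
    ax.set_xticklabels(sample_ids, rotation=90)
    ax.set_xlabel("Sample")
    ax.set_ylabel("Relative log expression")
    if save_path is not None:
        fig.savefig(save_path, bbox_inches="tight")
    return fig, ax


# Usage with simulated counts: 6 samples x 200 genes
rng = np.random.default_rng(0)
counts = rng.poisson(50, size=(6, 200))
sample_ids = [f"sample{i + 1}" for i in range(6)]
fig, ax = plot_rle_sketch(counts, sample_ids, normalize=True)
```

Passing a `save_path` writes the figure to disk; leaving it as `None` only returns the figure and axes.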
The example plot was produced with the synthetic data in the ./datasets/ directory.
Is there anything specific holding this back?