
Having trouble isolating hyper/hypomethylating probes after EdgeR

Open dsimps1993 opened this issue 1 year ago • 3 comments

Hi Simon,

Hope you are doing well! I have a quick question about SeqMonk. I've been running differential methylation (EdgeR, replicated, for/rev) on 100bp probes between two sets of replicates. In this example, I get 2988 probes. I then want to split this list into the probes that are hypermethylated and those that are hypomethylated. I tried running a differences filter on individual probes between 0 and 100 for both comparisons (HA38vsHA8 and vice versa). I get 1140 probes in one comparison and 383 in the other. I would expect the two to add up to 2988 probes, but I only get 1523 in total. The same method does work for a different set of probes, where the total equals the original number of probes (which is why I initially thought this was the correct method).

Is there an easier or more correct way of getting all hyper/hypomethylated probes after running EdgeR?

Cheers,

Daniel


dsimps1993 Jan 23 '24 19:01

My guess is that you may want to set it to "Maximum" difference in quantitated value instead of average...

FelixKrueger Jan 23 '24 19:01

> My guess is that you may want to set it to "Maximum" difference in quantitated value instead of average...

Hi Felix,

Thanks for getting back to me so promptly. When I try comparing replicates, the drop-down option is greyed out, so it can't be changed. If I compare the samples instead (i.e. not grouped by replicate) and set it to maximum, I still get similar results: 400 and 1159 probes, so I am still losing around 1400 probes.

Cheers,

Dan

dsimps1993 Jan 23 '24 20:01

Hi Felix,

I think I've solved the issue. The problem was how I defined the probes initially. I made 100bp probes, filtered for probes with a value between 0 and 100 in at least half the samples in the data, and then ran EdgeR. When the average filter is then run on individual probes, it excludes probes that aren't present in all of the samples queried.

I've repeated the above steps with only probes present in all samples. The filtering for hyper/hypomethylated probes based on average methylation now works (all values total the initial number of probes from EdgeR). I will use this approach from now on.
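
To illustrate the explanation above, here is a minimal Python sketch (not SeqMonk code) of how splitting probes by direction can lose probes when some samples have no value. Everything in it is hypothetical: the probe names, the methylation values, the `classify_probe` helper, and the assumption that a filter requiring every queried sample to have a value simply skips probes with gaps. How SeqMonk's filter behaves internally is not confirmed in this thread.

```python
from collections import Counter
from statistics import mean


def classify_probe(group_a, group_b, require_all_samples=True):
    """Label one probe by the sign of (mean of group B - mean of group A).

    Values are percent methylation (0-100); None marks a sample in which
    the probe has no value. With require_all_samples=True, a probe with
    any missing value is left unclassified (an assumed filter behaviour).
    """
    values = list(group_a) + list(group_b)
    if require_all_samples and any(v is None for v in values):
        return "unclassified"

    present_a = [v for v in group_a if v is not None]
    present_b = [v for v in group_b if v is not None]
    if not present_a or not present_b:
        return "unclassified"

    diff = mean(present_b) - mean(present_a)
    if diff > 0:
        return "hyper"
    if diff < 0:
        return "hypo"
    return "no_change"


# Hypothetical probes: (replicates of condition A, replicates of condition B)
probes = {
    "probe_1": ([10.0, 12.0], [80.0, 85.0]),  # complete, higher in B
    "probe_2": ([90.0, 88.0], [20.0, 25.0]),  # complete, lower in B
    "probe_3": ([15.0, None], [70.0, 75.0]),  # missing in one A replicate
    "probe_4": ([60.0, 65.0], [None, 30.0]),  # missing in one B replicate
}


def count_directions(probe_table, require_all_samples):
    """Count how many probes fall into each direction label."""
    return Counter(
        classify_probe(a, b, require_all_samples)
        for a, b in probe_table.values()
    )


strict_counts = count_directions(probes, require_all_samples=True)
lenient_counts = count_directions(probes, require_all_samples=False)

print("strict :", dict(strict_counts))
print("lenient:", dict(lenient_counts))
```

In the strict case only two of the four hypothetical probes receive a direction, so hyper + hypo is smaller than the input list. That is the same pattern as in the original question, where 1140 + 383 = 1523 of 2988 probes left 1465 unaccounted for. Restricting the probe set to probes present in all samples before running EdgeR, as described above, removes the gap.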

(I imagine it's better to use probes present in all samples anyway when running EdgeR? How does SeqMonk handle the calculation if certain probes are missing from certain samples?)

Cheers,

Dan

dsimps1993 Jan 23 '24 22:01