HMMRATAC

Q: negative standard deviation

Open DNAcastigator opened this issue 4 years ago • 9 comments

Use case: Hello, I'm analysing ATAC-seq data from different tissues and different samples of 5 instar butterflies. The tool works properly for some of the files, but for some reason it fails for others.

Describe the problem: The error I'm getting is this:

#################################
Exception in thread "main" org.apache.commons.math3.exception.NotStrictlyPositiveException: standard deviation (-3.722)
    at org.apache.commons.math3.distribution.NormalDistribution.<init>(NormalDistribution.java:142)
    at org.apache.commons.math3.distribution.NormalDistribution.<init>(NormalDistribution.java:107)
    at org.apache.commons.math3.distribution.NormalDistribution.<init>(NormalDistribution.java:85)
    at GEMM.HMMR_EM.getWeightedDensity(HMMR_EM.java:170)
    at GEMM.HMMR_EM.iterate(HMMR_EM.java:74)
    at GEMM.HMMR_EM.learn(HMMR_EM.java:113)
    at HMMR_ATAC.Main_HMMR_Driver.main(Main_HMMR_Driver.java:228)
##########################

The reference genome (and chromosome size) is the same for all the data.

Describe the solution you tried: Previously I got the same error when I was using the wrong size file, but in this case the size file is the same one that worked for other samples.

DNAcastigator, Mar 20 '20 14:03

Can you generate a fragment length distribution plot for these datasets? I suspect that some datasets have an incomplete fragment distribution, which is causing a negative standard deviation. If this is the case, you could use the --trim option to remove the higher-order signals from the analysis. Alternatively, it is possible to skip the EM step altogether and instead use the parameters generated in successful runs. But the fragment distribution will help diagnose the problem.

EvanTarbell, Mar 20 '20 14:03
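For readers who want to inspect the fragment length distribution requested above, here is a minimal Python sketch. It assumes the alignments are available as SAM text lines (for example from `samtools view`) and reads the template length from column 9 (TLEN) of the SAM format. The function name `fragment_length_histogram` and the `max_length` cutoff are hypothetical choices for illustration and are not part of HMMRATAC.

```python
from collections import Counter


def fragment_length_histogram(sam_lines, max_length=1000):
    """Count fragments per template length, using SAM column 9 (TLEN).

    sam_lines: iterable of SAM-formatted text lines (headers are skipped).
    max_length: hypothetical cutoff; longer fragments are ignored.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete alignment record
        tlen = int(fields[8])
        # Only the leftmost mate of a pair has a positive TLEN, so keeping
        # positive values counts each fragment once.
        if 0 < tlen <= max_length:
            counts[tlen] += 1
    return counts
```

The resulting counts can be plotted against fragment length. In ATAC-seq data one would typically look for a nucleosome-free peak followed by mono-, di-, and tri-nucleosome peaks; a missing higher-order peak is the kind of incomplete distribution suspected here.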

LB20.pdf LB21.pdf LB27.pdf LB28.pdf LB29.pdf LB30.pdf

Thank you for the answer. Not all the plots look perfect, but unfortunately using --trim 3/2/1 didn't solve the problem.

DNAcastigator, Mar 22 '20 23:03

Which of the samples are failing and which ones succeed? And can you attach the log files that are generated?

EvanTarbell, Mar 23 '20 13:03

E3_FW.pdf E3_HW.pdf E4-FW.pdf E3_FW.log E3_HW.log E3HW.log E3HW2.log E4-FW.log FW-pboy.log LB_20.log LB_21.log LB_27.log

OK, so the files in the previous post are all from failed analyses. Here you can find the PDFs from successfully analysed files, the log files generated by the successful analyses, and the logs from some of the failed ones.

DNAcastigator, Mar 23 '20 15:03

So I still think that --trim would solve this problem, but it turns out that I don't apply the trimming until after the EM step (i.e. after the error you are getting). I will work on releasing a new version that applies the --trim option to the EM step as well, which should solve your problem. In the meantime, you could take the EM parameters from your successful runs and pass them into HMMR using -m and -s, and set -f False. If the samples are all from the same organism, this is a reasonable approximation.

EvanTarbell, Mar 24 '20 15:03
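The workaround described above can be sketched as a small Python helper that assembles the command line. Only -m, -s, and -f False come from this thread. The -b, -i, and -g options for the BAM file, BAM index, and genome size file, the comma-separated value format, and the helper name `build_hmmratac_args` are assumptions to check against the HMMRATAC documentation for your version. The means and standard deviations should be copied from the log of a successful run.

```python
def build_hmmratac_args(jar, bam, bai, genome, means, stddevs):
    """Build an argument list that skips the fragment EM step.

    means / stddevs: EM parameters taken from a successful run's log.
    The -b/-i/-g flags and comma-separated format are assumptions.
    """
    if len(means) != len(stddevs):
        raise ValueError("means and stddevs must have the same length")
    if any(s <= 0 for s in stddevs):
        # A non-positive standard deviation is what triggers the
        # NotStrictlyPositiveException reported in this issue.
        raise ValueError("standard deviations must be strictly positive")
    return [
        "java", "-jar", jar,
        "-b", bam,
        "-i", bai,
        "-g", genome,
        "-m", ",".join(str(m) for m in means),
        "-s", ",".join(str(s) for s in stddevs),
        "-f", "False",  # skip the EM step, as suggested above
    ]
```

The returned list can be passed to `subprocess.run`. Validating the standard deviations before launching gives an early, readable error instead of a Java stack trace.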

Perfect, many thanks for the help. I will try this last option while waiting for the next update.

DNAcastigator, Mar 24 '20 20:03

Hello,

Was this update released? I have the same problem.

JihedC, Feb 16 '21 13:02

Hi,

I'm having the same problem as well.

angelosarmen, May 17 '21 18:05

Hello, is there any update on this issue? I cannot solve it either.

Antonia112, Apr 13 '22 14:04