2 Questions about model 2 and python/c++ preprocess

Open panagi0tis90 opened this issue 11 months ago • 1 comments

Dear authors, thank you for the useful software. I am facing two "problems" and was wondering if you could help me. My two questions are:

  1. I installed a conda env and ran the analysis on the test data you provided to check that everything works. With the Python preprocess script, the output values were identical to the ones provided, but when I run the C++ preprocess option, the probabilities are entirely different (in both model 1 and model 2). Is there an explanation for this? Maybe I am using a different C++ version or something like that?
  2. I ran an experiment with 1.3 million reads. Initially, I ran model 2 with default parameters. When I changed the minimum number of reads to n=10 (I would like to get extra points, because with MinION many genes have around 10 reads), I got a lot of notifications saying "Warning: Stoichiometry cannot compute, try less stringent double cutoff values". Is this something to worry about? With default parameters I hardly get any notification like this. In the end I got more points, but I still don't know if there is something wrong with this.

Thanks in advance!! Pan

panagi0tis90, Mar 06 '24 10:03

Hi Pan,

Thanks for your email.

On Wed, 6 Mar 2024 at 21:35, panagi0tis90 wrote:

  1. I installed a conda env and ran the analysis on the test data you provided to check that everything works. With the Python preprocess script, the output values were identical to the ones provided, but when I run the C++ preprocess option, the probabilities are entirely different (in both model 1 and model 2). Is there an explanation for this? Maybe I am using a different C++ version or something like that?

Are all probabilities different, or just some of them? The C++ preprocessing script rescues additional signals at boundary sites (near the edges of reads or next to unaligned deletions), so I would only expect some of the probabilities to be different.
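
One way to answer that question is to join the two read-level outputs on their identifier and count what changed. The sketch below is only an illustration and is not part of CHEUI: it assumes each output has already been loaded into a dict mapping a read-level site identifier to a probability, and the function name, the tolerance, and the example identifiers are all hypothetical.

```python
import math


def compare_predictions(py_probs, cpp_probs, tol=1e-6):
    """Compare two {identifier: probability} mappings (hypothetical layout).

    Reports how many identifiers are shared, which shared identifiers have
    probabilities differing by more than `tol`, and which identifiers
    appear in only one of the two outputs.
    """
    shared = py_probs.keys() & cpp_probs.keys()
    changed = sorted(
        key for key in shared
        if not math.isclose(py_probs[key], cpp_probs[key], rel_tol=0.0, abs_tol=tol)
    )
    return {
        "shared": len(shared),
        "changed": changed,
        "only_python": sorted(py_probs.keys() - cpp_probs.keys()),
        "only_cpp": sorted(cpp_probs.keys() - py_probs.keys()),
    }


# Made-up example values, for illustration only.
py_out = {"chr1_100_readA": 0.91, "chr1_105_readA": 0.12}
cpp_out = {"chr1_100_readA": 0.91, "chr1_105_readA": 0.40, "chr1_3_readA": 0.55}
report = compare_predictions(py_out, cpp_out)
print(report)
```

If only a minority of shared identifiers end up in `changed`, and the identifiers listed under `only_cpp` sit near read edges or deletions, that would be consistent with the explanation above rather than with a compiler or version problem.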

  2. I ran an experiment with 1.3 million reads. Initially, I ran model 2 with default parameters. When I changed the minimum number of reads to n=10 (I would like to get extra points, because with MinION many genes have around 10 reads), I got a lot of notifications saying "Warning: Stoichiometry cannot compute, try less stringent double cutoff values". Is this something to worry about? With default parameters I hardly get any notification like this. In the end I got more points, but I still don't know if there is something wrong with this.

Stoichiometry is estimated from the individual reads aligned to a site, using only reads that have P>0.7 (modified) or P<0.3 (non-modified). Reducing the minimum number of reads to 10 may therefore leave some sites with no reads passing either cutoff, in which case the stoichiometry cannot be estimated.

Two possibilities are:

  1. Modify the double threshold, e.g. (0.3, 0.7) -> (0.5, 0.5).
  2. Work with the individual reads from those sites and assess the sites using the individual read data directly.
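
To illustrate why the warning appears more often at low coverage, here is a minimal sketch of a double-cutoff estimate. It is not CHEUI's actual code: the function name is hypothetical, and the formula (modified reads divided by all reads passing either cutoff) is an assumption based on the description above. The read probabilities are made up.

```python
def estimate_stoichiometry(read_probs, lower=0.3, upper=0.7):
    """Assumed double-cutoff estimate: modified / (modified + unmodified).

    Reads with P > upper count as modified, reads with P < lower count as
    unmodified, and everything in between is ignored. Returns None when no
    read passes either cutoff, which is the situation the warning describes.
    """
    modified = sum(p > upper for p in read_probs)
    unmodified = sum(p < lower for p in read_probs)
    confident = modified + unmodified
    if confident == 0:
        return None
    return modified / confident


# A site with 10 reads, all with intermediate probabilities (made-up values).
site_reads = [0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.62, 0.65, 0.38, 0.42]

default_estimate = estimate_stoichiometry(site_reads)            # cutoffs (0.3, 0.7)
relaxed_estimate = estimate_stoichiometry(site_reads, 0.5, 0.5)  # cutoffs (0.5, 0.5)
print(default_estimate, relaxed_estimate)
```

With the default cutoffs, none of these 10 reads is confident enough, so no estimate is returned. With (0.5, 0.5), every read except the one at exactly 0.5 contributes, at the cost of including less confident read-level calls.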

I hope this helps

Best

Eduardo


EduEyras, Mar 15 '24 00:03