
An issue with the installation/environment

Open lauraht opened this issue 2 years ago • 5 comments

Hello!

I have an issue with the isONcorrect installation and would appreciate your advice.

Last November, I installed isONcorrect using conda (and created the isoncorrect env) by following your README instructions, and I was able to run isONcorrect successfully. Now, I wanted to run isONcorrect again, so I just used:

conda activate isoncorrect

However, isONcorrect --help now gave me the following error:

Traceback (most recent call last):
  File "~/miniconda3/envs/isoncorrect/bin/isONcorrect", line 20, in <module>
    from modules import create_augmented_reference, help_functions, correct_seqs #,align
ModuleNotFoundError: No module named 'modules'

I found that modules and isONcorrect-0.0.8.dist-info are located under ~/miniconda3/envs/isoncorrect/lib/python3.10/site-packages. But under ~/miniconda3/envs/isoncorrect/lib/python3.7/site-packages, there is no modules and no isONcorrect-0.0.8.dist-info. However, ~/miniconda3/envs/isoncorrect/bin has only python3.7, no python3.10.
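The mismatch described above (package files under one lib/pythonX.Y directory, interpreter from another) can be checked with a short script. This is only a sketch: the helper names site_packages_report and interpreters are hypothetical, and the env path in the usage comment is taken from the paths quoted in this post.

```python
from pathlib import Path


def site_packages_report(env_prefix, package):
    """Map each lib/python3.* directory name to whether `package` is in its site-packages."""
    report = {}
    for pydir in Path(env_prefix).expanduser().glob("lib/python3.*"):
        site = pydir / "site-packages"
        if site.is_dir():
            report[pydir.name] = (site / package).exists()
    return report


def interpreters(env_prefix):
    """List file names matching bin/python3.* inside the environment."""
    return sorted(p.name for p in Path(env_prefix).expanduser().glob("bin/python3.*"))


# Example usage (path assumed from the post above):
#   print(site_packages_report("~/miniconda3/envs/isoncorrect", "modules"))
#   print(interpreters("~/miniconda3/envs/isoncorrect"))
```

If the package is reported only under a lib/pythonX.Y directory that has no matching interpreter in bin, the entry-point script cannot import it, which is consistent with the ModuleNotFoundError above.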

What puzzles me is that isONcorrect used to work fine when I first installed it.

I was wondering if you have any ideas about what may be wrong with my installation or environment? I would really appreciate your advice.

Thank you very much in advance!

lauraht avatar Jun 11 '22 16:06 lauraht

Hi @lauraht,

Not exactly. I would recommend removing the isoncorrect environment and reinstalling it from scratch.

conda deactivate
conda env remove -n isoncorrect
Then reinstall.

If that doesn't work we could take it from there.

Let me know how it goes!

ksahlin avatar Jun 11 '22 16:06 ksahlin

Hi Kristoffer,

Thank you so much for your advice!

I removed the isoncorrect env and then reinstalled isONcorrect. Now isONcorrect works fine!

I have another question about isONcorrect and would appreciate your advice:

(1) For singleton reads (those belonging to size-1 clusters), isONcorrect does not perform any error correction, right?

(2) For reads belonging to size-2 clusters, it seems that isONcorrect performs error correction on some of them (i.e. the read sequences are changed compared to the original reads). I was wondering how spoa generates the consensus from only two reads. If read-1 has a base ‘A’ while read-2 has a base ‘T’ at the same position, how would isONcorrect determine whether it is read-1 or read-2 that has the error (since there are only two reads in the cluster)? Similarly, if read-1 misses a base compared to read-2, how would we know whether it is a deletion in read-1 or actually an insertion in read-2?

Thank you very much for your help!

lauraht avatar Jun 13 '22 16:06 lauraht

As for Q1 the answer is no.

As for Q2, I thought we set the threshold to a minimum of 3 reads, so I am surprised to hear that you have observed correction in 2-read clusters. I don't want to doubt you (and it was a long time ago that I implemented it), but could you check again whether this is true? This line is supposed to stop any cluster with fewer than 3 reads from being corrected.

I can still answer your question about what happens in the spoa consensus. For A/T: I don't know. For an indel: spoa will choose the longer path (insertion).

However, isONcorrect will not simply use the spoa consensus as the corrected version of the read. isONcorrect will remap all read-segments to the spoa consensus segment generated from the read-segments. Then it will infer the allele frequency of the particular variant (SNP/indel) and correct the position in the read-segment only if its frequency is lower than a certain frequency threshold (default is lower than 10% frequency, with a hard lower occurrence of 3; see this line).
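The frequency rule described above can be sketched roughly as follows. This is an illustration, not isONcorrect's actual code: the function and parameter names are made up, and reading "hard lower occurrence of 3" as a minimum allele count is an assumption. Only the 10% and 3 values come from the comment above.

```python
from collections import Counter


def variants_to_correct(column, max_freq=0.10, min_count=3, min_reads=3):
    """Return the alleles at one aligned position that this sketch would correct.

    column: the allele observed in each read-segment at that position.
    Assumption: an allele is corrected if its frequency is below `max_freq`
    or it occurs fewer than `min_count` times; clusters with fewer than
    `min_reads` read-segments are left untouched.
    """
    if len(column) < min_reads:
        return set()
    counts = Counter(column)
    total = len(column)
    return {
        allele
        for allele, count in counts.items()
        if count / total < max_freq or count < min_count
    }
```

Under these assumptions a 2-read cluster is never modified, because the size guard returns before any frequency is computed.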

ksahlin avatar Jun 13 '22 19:06 ksahlin

Hi Kristoffer,

Thank you so much for your explanations!

Just to confirm, for Q1, when you say “the answer is no”, you mean isONcorrect does not perform any error correction on singleton reads, is that right?

About 2-reads clusters, I used the diff command between the original fastq file and the corrected fastq file of a cluster. I found that only in a small fraction of 2-reads clusters, the diff command reported the difference on the sequence line (line 2 and line 6).
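A per-read comparison can make this kind of check easier to report than raw diff output. The sketch below assumes plain 4-line FASTQ records and that read IDs (the first token after "@") are the same in both files; the function names and the file names in the usage comment are hypothetical.

```python
def read_sequences(lines):
    """Map read ID -> sequence from an iterable of 4-line FASTQ records."""
    rows = [line.rstrip("\n") for line in lines]
    sequences = {}
    for i in range(0, len(rows) - 3, 4):
        header, sequence = rows[i], rows[i + 1]
        if not header.startswith("@"):
            raise ValueError(f"expected a FASTQ header at line {i + 1}: {header!r}")
        tokens = header[1:].split()
        read_id = tokens[0] if tokens else ""
        sequences[read_id] = sequence
    return sequences


def changed_reads(original_lines, corrected_lines):
    """Return sorted IDs of reads present in both inputs whose sequence differs."""
    original = read_sequences(original_lines)
    corrected = read_sequences(corrected_lines)
    return sorted(
        read_id
        for read_id, sequence in original.items()
        if read_id in corrected and corrected[read_id] != sequence
    )


# Example usage (hypothetical file names):
#   with open("original.fastq") as a, open("corrected.fastq") as b:
#       print(changed_reads(a, b))
```

Listing the affected read IDs for a 2-read cluster would give a concrete example to attach to the issue.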

Thank you very much again!

lauraht avatar Jun 16 '22 15:06 lauraht

Yes, isONcorrect does not perform any error correction on singleton reads.

I see. I don’t know why, tbh. From my memory and from looking at the code, this should not happen. If you have an example where two sequences are input and come out changed, I could search for the bug.

ksahlin avatar Jun 18 '22 00:06 ksahlin