xpore
error in xpore diffmod
Hi,
I got the following error when I ran xpore diffmod:
Loading python/3.9.13
Loading requirement: gcc/7.2.0 readline/8.1 curl/7.74.0 libxml2/2.9.1
pcre/8.44.utf8 libpng/1.2.59 sqlite/3.35.3 geos/3.4.2 libtiff/4.0.9
proj/7.2.0 tcltk/8.6.11 CpG-tools/1.1.0
Using the signal of unmodified RNA from /hpf/largeprojects/ccmbio/yliang/long_read_RNA/nanopore_brian/python_venv/lib/python3.9/site-packages/xpore/diffmod/model_kmer.csv
Process Consumer-11:
Traceback (most recent call last):
File "/hpf/largeprojects/ccmbio/yliang/long_read_RNA/nanopore_brian/python_venv/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3790, in get_loc
return self._engine.get_loc(casted_key)
File "index.pyx", line 152, in pandas._libs.index.IndexEngine.get_loc
File "index.pyx", line 181, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 7080, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 7088, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'GCTATGCTC'
This is my input YAML config:
data:
    MIA:
        rep1: /hpf/largeprojects/ccmbio/acelik_files/kalish/nanopore/nanopore/lauren_test/debug_nanopolish/new_xpore/8327-M2/dataprep
        rep2: /hpf/largeprojects/ccmbio/acelik_files/kalish/nanopore/nanopore/lauren_test/debug_nanopolish/new_xpore/8327-M3/dataprep
        rep3: /hpf/largeprojects/ccmbio/acelik_files/kalish/nanopore/nanopore/lauren_test/debug_nanopolish/new_xpore/Sample1/dataprep
    PBS:
        rep1: /hpf/largeprojects/ccmbio/acelik_files/kalish/nanopore/nanopore/lauren_test/debug_nanopolish/new_xpore/4147-M1/dataprep
        rep2: /hpf/largeprojects/ccmbio/acelik_files/kalish/nanopore/nanopore/lauren_test/debug_nanopolish/new_xpore/4147-M2/dataprep
        rep3: /hpf/largeprojects/ccmbio/acelik_files/kalish/nanopore/nanopore/lauren_test/debug_nanopolish/new_xpore/Sample2/dataprep
out: /hpf/largeprojects/ccmbio/acelik_files/kalish/nanopore/nanopore/lauren_test/debug_nanopolish/new_xpore/diffmod_output
Sample1 and Sample2 were sequenced on R10 flow cells, and the rest on R9 flow cells. Since nanopolish doesn't support R10 data, I used f5c, which supports both R10 and R9 and does the same thing as nanopolish, to process all the data. I noticed some differences in the eventalign.txt output: the two R10 samples have 9-mers in the k-mer columns, while the R9 samples have 5-mers.
R10 data eventalign output:
contig position reference_kmer read_index strand event_index event_level_mean event_stdv event_length model_kmer model_mean model_stdv standardized_level start_idx end_idx
ENSMUST00000103679.2 4 GATAAGGAT 0 t 995 102.35 3.626 0.00350 GATAAGGAT 97.12 3.70 1.23 49450 49464
ENSMUST00000103679.2 5 ATAAGGATT 0 t 996 116.55 6.388 0.00350 ATAAGGATT 111.40 3.22 1.40 49436 49450
R9 data:
contig position reference_kmer read_index strand event_index event_level_mean event_stdv event_length model_kmer model_mean model_stdv standardized_level start_idx end_idx
ENSMUST00000181768.2 21 AGGTG 0 t 1 108.57 6.003 0.00400 AGGTG 117.25 3.37 -2.28 126162 126174
Could this be the reason xpore raises this error?
Thanks for the help! Laur
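For anyone hitting the same thing, a quick way to confirm the mismatch described above is to tally the k-mer lengths in each eventalign file before running dataprep. This is only a minimal sketch: the helper names `kmer_length_counts` and `as_tsv` are made up for illustration, it assumes the eventalign output is tab-separated with a `reference_kmer` header column, and the sample rows are the ones quoted in this post.

```python
import csv
import io
from collections import Counter

def kmer_length_counts(handle, column="reference_kmer"):
    """Count rows per k-mer length in a tab-separated eventalign table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(len(row[column]) for row in reader)

def as_tsv(*lines):
    """Helper for this example only: turn space-separated text into TSV."""
    return "\n".join("\t".join(line.split()) for line in lines) + "\n"

HEADER = ("contig position reference_kmer read_index strand event_index "
          "event_level_mean event_stdv event_length model_kmer model_mean "
          "model_stdv standardized_level start_idx end_idx")
R10_ROW = ("ENSMUST00000103679.2 4 GATAAGGAT 0 t 995 102.35 3.626 0.00350 "
           "GATAAGGAT 97.12 3.70 1.23 49450 49464")
R9_ROW = ("ENSMUST00000181768.2 21 AGGTG 0 t 1 108.57 6.003 0.00400 "
          "AGGTG 117.25 3.37 -2.28 126162 126174")

r10_counts = kmer_length_counts(io.StringIO(as_tsv(HEADER, R10_ROW)))
r9_counts = kmer_length_counts(io.StringIO(as_tsv(HEADER, R9_ROW)))

print(dict(r10_counts))  # {9: 1}
print(dict(r9_counts))   # {5: 1}
```

On a real file you would pass `open("eventalign.txt", newline="")` instead of the in-memory sample; any length other than 5 in the result would point to the problem reported here.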
Hi Laur (tagging you here @lyj95618),
xpore only supports 5-mer comparison for now, so 9-mers don't work.
Thanks!
Best wishes, Yuk Kei
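To illustrate why a 9-mer ends in a KeyError like the one in the traceback: a table keyed by 5-mers simply has no entry for a 9-mer string. The sketch below uses a plain dict as a hypothetical stand-in for a 5-mer model table (it is not xpore's code, and the numbers are copied from the R9 eventalign row above rather than from model_kmer.csv); `missing_kmers` is an invented helper name.

```python
# Hypothetical stand-in for a model table keyed by 5-mer.
# Values are illustrative only (mean, stdv from the R9 eventalign row above).
model_by_kmer = {"AGGTG": (117.25, 3.37)}

def missing_kmers(kmers, model):
    """Return the k-mers that have no entry in the model table, in input order."""
    return [kmer for kmer in kmers if kmer not in model]

observed = ["AGGTG", "GCTATGCTC"]  # one 5-mer, one 9-mer from the traceback
unsupported = missing_kmers(observed, model_by_kmer)
print(unsupported)

lookup_error = None
try:
    model_by_kmer["GCTATGCTC"]  # 9-mer key is absent from a 5-mer table
except KeyError as err:
    lookup_error = err
    print("KeyError:", err)
```

A membership check like `missing_kmers` before the lookup would surface unsupported k-mers up front instead of failing partway through a run.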
Thank you for your reply, and for the suggestion in another thread about converting the 9-mers to 5-mers!
I have one more question about the xPore comparison analysis. Since I am combining data from R9 and R10 (rna004) flow cells, is there a way for xPore to adjust for the potential batch effect?
My comparison condition:
Sample A (R9), Sample B (R9), Sample1 (rna004) Vs Sample D (R9), Sample E (R9), Sample2 (rna004)
Thanks, Laur