xpore
dataprep result is empty
Hello developer!
I ran xpore dataprep with my direct RNA-seq data generated with the SQK-RNA004 kit,
but the output file is empty and I cannot identify what the problem is.
Can you advise me on this problem?
Hi @Seongmin-Jang-1165,
Can you provide the following, please?
first 10 lines from xpore dataprep eventalign_index
head eventalign.index
first 10 lines from nanopolish eventalign.txt
head eventalign.txt
first 10 lines from the gtf file
head [annotation.gtf]
Thanks!
Best wishes, Yuk Kei
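The three outputs requested above can be collected in one go. This is only a minimal sketch: `show_heads` is a hypothetical helper name, and `annotation.gtf` stands for whichever GTF file was actually used.

```shell
# Print the first 10 lines of each file given as an argument,
# preceded by a header naming the file, so the output can be pasted into the issue.
show_heads() {
    for f in "$@"; do
        printf '==> %s <==\n' "$f"
        if [ -r "$f" ]; then
            head -n 10 "$f"
        else
            # Make a missing file visible instead of failing silently.
            printf '(missing or unreadable)\n'
        fi
    done
}
```

Example call: `show_heads eventalign.index eventalign.txt annotation.gtf`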
Updated on 23rd Dec.
Hi, I have solved the error described below by running xpore as a module: python -m xxx.xxx.xxx(pth of xpore code).xpore dataprep --eventalign eventalign_file.txt --out_dir output_pth
============================================================================
Hi, I have come across the same problem.
I ran the following steps before running xpore-dataprep:
Dataset: mm39, WT and KO; KO is used as the example here.
Pre-processing:
1. multi-fast5 to single-fast5:
multi_to_single_fast5 -i demo/guppy -s demo/guppy_single -t 40 --recursive
2. basecalling:
guppy_basecaller -i /data/fast5_data/mm_WT/single_fast5/ -s ko.guppy/ --config ~/nanopore_methods/ont-guppy/data/rna_r9.4.1_70bps_hac.cfg -r --num_callers 4 --cpu_threads_per_caller 2 --device auto
cat ko.guppy/pass/*.fastq > ko.fastq
3. align with minimap2 to generate the .sam file:
minimap2 -ax map-ont -k 14 GRCm38.transcripts.fa -t 25 --secondary=no /data/fast5_data/mm_KO/ko.fastq -o /data/fast5_data/mm_KO/ko.sam
4. convert to a sorted .bam file with samtools:
samtools view -@ 30 -F 2048 -F 4 -b ko.sam | samtools sort -O BAM -@ 20 -o ko.bam
samtools index -@ 16 ko.bam # generate index
5. nanopolish
first generate index: nanopolish index -d <PATH/TO/FAST5_DIR> <PATH/TO/FASTQ_FILE>
nanopolish index -d single_fast5/ ko.fastq > index.log 2>&1
then eventalign:
nanopolish eventalign --read ko.fastq \
--bam ko.bam \
--genome ~/reference_fa/mm10/GRCm38.transcripts.fa \
--scale-events \
--signal-index \
--summary ko_summary.txt \
--threads 50 \
> ~/nanopore_methods/xpore/nanopolish_files/ko_eventalign.log \
2>&1
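Note that, as written, the eventalign command above redirects both stdout and stderr into ko_eventalign.log, so the event table and any messages would end up in the same file. Whether that matches what was actually run is not clear from the post. The difference between the two redirection patterns can be sketched with a stand-in function (`emit` is hypothetical and only imitates a tool that writes data to stdout and messages to stderr):

```shell
# Stand-in for a tool that writes a table to stdout and progress to stderr.
# (hypothetical; used here instead of `nanopolish eventalign ...`)
emit() {
    printf 'contig\tposition\n'
    printf '[progress] 1 read processed\n' >&2
}

workdir=$(mktemp -d)

# Pattern used above: both streams are written to one file.
emit > "$workdir/mixed.log" 2>&1

# Alternative: table goes to the .txt file, messages go to the .log file.
emit > "$workdir/eventalign.txt" 2> "$workdir/eventalign.log"
```

With the second pattern, the .txt file contains only the table and can be passed on to the next step unchanged.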
xpore processing:
# For mm_WT, it follows the same previous steps,
# here I just run one set for checking whether it could work.
xpore-dataprep --eventalign /data/fast5_data/mm_WT/wt_eventalign.txt \
--summary /data/fast5_data/mm_WT/wt_summary.txt \
--out_dir ~/nanopore_methods/xpore/wt/ \
--n_processes 4 --readcount_max 20000 > ~/nanopore_methods/xpore/wt/xpore_dataprep.log 2>&1
The output of the nanopolish eventalign step looks like this:
(/home/rlwang/m6a) rlwang@dell-tower-server:/data/fast5_data/mm_WT$ head wt_eventalign.txt -n 3
contig position reference_kmer read_index strand event_index event_level_mean event_stdv event_length model_kmer model_mean model_stdv standardized_level start_idx end_idx
ENSMUST00000130201 548 TGTTA 20 t 5 104.38 2.869 0.00697 TGTTA 106.43 7.49 -0.23 16780 16801
ENSMUST00000130201 548 TGTTA 20 t 6 110.41 6.096 0.00963 TGTTA 106.43 7.49 0.44 16751 16780
(/home/rlwang/m6a) rlwang@dell-tower-server:/data/fast5_data/mm_WT$ head wt_summary.txt -n 3
read_index read_name fast5_path model_name strand num_events num_steps num_skips num_stays total_duration shift scale drift var
20 571a5dee-2649-41de-8bb3-c65aae7359f6 /data/fast5_data/mm_WT/single_fast5/all_fast5/571a5dee-2649-41de-8bb3-c65aae7359f6.fast5 template 453 226 4 222 2.61 -2.967 0.903 0.000 1.423
37 0ce4f4e3-19aa-4aed-a8bf-86f3ec729fce /data/fast5_data/mm_WT/single_fast5/all_fast5/0ce4f4e3-19aa-4aed-a8bf-86f3ec729fce.fast5 template 1929 951 20 957 11.67 4.012 0.955 0.000 1.266
Then, when I ran the xpore dataprep command, I got the following output:
(/home/rlwang/m6a) rlwang@dell-tower-server:~/nanopore_methods/xpore/wt$ ls
eventalign.hdf5 eventalign.log xpore_dataprep.log
(/home/rlwang/m6a) rlwang@dell-tower-server:~/nanopore_methods/xpore/wt$ du -sh *
4.0K eventalign.hdf5
0 eventalign.log
36K xpore_dataprep.log
xpore_dataprep.log looks like this:
(base) rlwang@dell-tower-server:~/nanopore_methods/xpore/wt$ ls
eventalign.hdf5 eventalign.log xpore_dataprep.log
(base) rlwang@dell-tower-server:~/nanopore_methods/xpore/wt$ tail -f xpore_dataprep.log
obj = getattr(obj, self.name)._getitem_axis(key, axis=axis)
File "/home/rlwang/m6a/lib/python3.6/site-packages/pandas/core/indexing.py", line 1099, in _getitem_axis
return self._getitem_iterable(key, axis=axis)
File "/home/rlwang/m6a/lib/python3.6/site-packages/pandas/core/indexing.py", line 1037, in _getitem_iterable
keyarr, indexer = self._get_listlike_indexer(key, axis, raise_missing=False)
File "/home/rlwang/m6a/lib/python3.6/site-packages/pandas/core/indexing.py", line 1240, in _get_listlike_indexer
indexer, keyarr = ax._convert_listlike_indexer(key)
File "/home/rlwang/m6a/lib/python3.6/site-packages/pandas/core/indexes/multi.py", line 2400, in _convert_listlike_indexer
raise KeyError(f"{keyarr[mask]} not in index")
KeyError: "['08be73d2-2dcc-4c45-b572-8ce3b807c2a1'] not in index"
but this record can be found in the wt_summary.txt file:
(base) rlwang@dell-tower-server:/data/fast5_data/mm_WT$ grep '08be73d2-2dcc-4c45-b572-8ce3b807c2a1' wt_summary.txt
6 08be73d2-2dcc-4c45-b572-8ce3b807c2a1 /data/fast5_data/mm_WT/single_fast5/all_fast5/08be73d2-2dcc-4c45-b572-8ce3b807c2a1.fast5 template 2467 1262 25 1179 15.35 1.159 0.939 0.000 1.289
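One way to narrow this down is to check that the two nanopolish outputs agree with each other. The sketch below does not reproduce xpore's internal lookup. It only assumes that both files are tab-separated with one header line and the column order shown in the headers above (read_index is column 4 of the eventalign table and column 1 of the summary; read_name is column 2 of the summary). The function names are hypothetical.

```shell
# List read_index values that appear in the eventalign table (column 4)
# but have no row in the summary table (column 1). Each value is printed once.
missing_read_indices() {
    summary=$1
    eventalign=$2
    awk -F'\t' '
        FILENAME == ARGV[1] { if (FNR > 1) seen[$1] = 1; next }
        FNR > 1 && !($4 in seen) && !($4 in reported) { reported[$4] = 1; print $4 }
    ' "$summary" "$eventalign"
}

# List read names (column 2 of the summary) that occur more than once.
duplicate_read_names() {
    awk -F'\t' 'FNR > 1 { n[$2]++ } END { for (r in n) if (n[r] > 1) print r }' "$1"
}
```

Example calls: `missing_read_indices wt_summary.txt wt_eventalign.txt` and `duplicate_read_names wt_summary.txt`. Empty output from both means the files are consistent on these two points, which would suggest looking elsewhere for the cause.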
Hi @Seongmin-Jang-1165,
Sorry for the delayed reply! Just came back from vacation.
Can you update xpore, please? xpore-dataprep is deprecated.
Also, you will need to indicate the RNA004 kmer model when you get to the xpore diffmod step:
https://github.com/GoekeLab/xpore/blob/RNA004_kmer_model/xpore/diffmod/RNA004_5mer_model.txt
Thanks!
Best wishes, Yuk Kei
@yuukiiwa Sorry for the late reply.
I'll attach the information that you requested.
Also, can you tell me how to indicate the RNA004 model when I run xpore diffmod? Is there a specific command or option for this?