Question about dataprep

Open cent0134 opened this issue 1 year ago • 2 comments

Hello, after running xpore dataprep --eventalign eventalign.txt --out_dir dataprep, I got a dataprep output, but it is only 8 MB in size. When I open data.log, it shows: Total 1 transcripts - SUCCESSFULL FINISHED -, and when I open data.readcount, it shows only: idx,n_reads AF003887,1000. However, my eventalign.txt is 27.9 GB in size. Is this normal?

This is the command I used to generate eventalign.txt:

nanopolish eventalign --reads Raw.fastq --bam sorted.bam --genome NL4-3.fa --signal-index --scale-events --summary sequencing_summary_FAT78923_e5e1ae12.txt --threads 32 > eventalign.txt

[post-run summary] total reads: 152095, unparseable: 0, qc fail: 1994, could not calibrate: 1801, no alignment: 404, bad fast5: 0

cent0134, Sep 12 '24 09:09

This is part of my eventalign.txt:

contig position reference_kmer read_index strand event_index event_level_mean event_stdv event_length model_kmer model_mean model_stdv standardized_level start_idx end_idx
AF003887 1 GGAAG 18 t 218 120.95 2.280 0.00398 GGAAG 115.76 5.56 0.80 27139 27151
AF003887 2 GAAGG 18 t 219 101.72 4.898 0.02955 GAAGG 105.26 4.06 -0.75 27050 27139
AF003887 2 GAAGG 18 t 220 100.80 4.249 0.01228 GAAGG 105.26 4.06 -0.94 27013 27050
AF003887 2 GAAGG 18 t 221 103.57 4.451 0.00730 GAAGG 105.26 4.06 -0.36 26991 27013
AF003887 3 AAGGG 18 t 222 118.43 3.342 0.00266 AAGGG 113.12 7.84 0.58 26983 26991
AF003887 3 AAGGG 18 t 223 127.71 7.818 0.01029 AAGGG 113.12 7.84 1.60 26952 26983
AF003887 4 AGGGC 18 t 224 116.41 8.242 0.01096 AGGGC 116.40 4.05 0.00 26919 26952
AF003887 5 GGGCT 18 t 225 113.35 4.052 0.01693 GGGCT 113.28 5.31 0.01 26868 26919

cent0134, Sep 12 '24 09:09

Hi @cent0134,

You get only one transcript because AF003887 is the only value in the contig column of your eventalign.txt file. By default, xpore dataprep caps the number of reads at 1000 per transcript, so if you want to include all the reads for AF003887, you can pass --readcount_max 152095 when you run xpore dataprep.
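To check both points on your own file (how many contigs appear, and how many reads each one has), something like the following sketch might help. It is not part of xpore: the function name reads_per_contig is hypothetical, and it assumes the eventalign file is tab-separated with the contig and read_index columns shown in the excerpt above (that is, produced without --print-read-names). The small in-memory sample reuses two rows from the excerpt, trimmed to four columns for brevity.

```python
import csv
import io
from collections import defaultdict


def reads_per_contig(handle):
    """Count distinct read_index values per contig in an eventalign table.

    `handle` is any open text stream over a tab-separated eventalign file
    whose header includes the `contig` and `read_index` columns.
    """
    reads = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        reads[row["contig"]].add(row["read_index"])
    return {contig: len(indexes) for contig, indexes in reads.items()}


# Two rows from the excerpt in this thread, trimmed to four columns.
sample = (
    "contig\tposition\treference_kmer\tread_index\n"
    "AF003887\t1\tGGAAG\t18\n"
    "AF003887\t2\tGAAGG\t18\n"
)
counts = reads_per_contig(io.StringIO(sample))
print(counts)  # {'AF003887': 1}

# For the real file, stream it from disk instead of loading it into memory:
# with open("eventalign.txt", newline="") as handle:
#     print(reads_per_contig(handle))
```

If the result has a single key, the file contains only one contig, which matches the "Total 1 transcripts" line in data.log. The largest count gives a rough idea of what --readcount_max would need to be to avoid the cap; passing the total read count, as suggested above, is the simpler option.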

Thanks!

Best wishes, Yuk Kei

yuukiiwa, Oct 01 '24 02:10

Sorry I'm so late in replying, your advice is really useful!

cent0134, Dec 01 '24 15:12