
read 0 ALT contigs AND fail to open file "xxxx.clipped.fq"

Open kyle-lk opened this issue 2 years ago • 9 comments

[image]

Command line: path-to-dir/xTea-master/bin/xtea -i normal_name.csv -b normal_name_bam.csv -x null -p ./normal_wes/5.TE_output/ -o ./normal_wes/5.TE_output/multiple_TE_output.sh -l path-to-dir/xTea-master/rep_lib_annotation/ -r ~/db/human_genome_index/hg38/hg38.fa -g path-to-dir/xTea-master/gencode.v33.hg38.annotation.gff3 --xtea path-to-dir/xTea-master/xtea/ -q short -f 1 -y 7 -n 8 -m 250

I used xTea v0.1 to analyze my blood WES dataset. However, I always get the above error. Why can't I get a correct clipped.fq file?

kyle-lk avatar Aug 30 '22 18:08 kyle-lk

Could you change the relative path "-p ./normal_wes/5.TE_output/ " to the full path and have a try? Similarly for all other paths, use the full path.
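
The advice above amounts to expanding `~` and resolving relative paths before they reach xTea. A minimal Python sketch of that idea follows. `absolutize` and `absolutize_args` are hypothetical helpers (not part of xTea), and the set of path-taking flags is copied from the command in this thread rather than from xTea's documentation, so treat it as an assumption.

```python
import os


def absolutize(path):
    """Expand '~' and resolve a relative path, keeping a trailing separator."""
    full = os.path.abspath(os.path.expanduser(path))
    # os.path.abspath drops a trailing slash; restore it because the
    # commands in this thread pass directories with a trailing "/".
    if path.endswith(os.sep) and not full.endswith(os.sep):
        full += os.sep
    return full


def absolutize_args(argv, path_flags):
    """Return a copy of argv with the value after each listed flag made absolute."""
    out = list(argv)
    for i, tok in enumerate(out[:-1]):
        if tok in path_flags:
            out[i + 1] = absolutize(out[i + 1])
    return out


if __name__ == "__main__":
    # Assumption: these are the flags that take file or directory paths.
    flags = {"-i", "-b", "-p", "-o", "-l", "-r", "-g", "--xtea"}
    cmd = ["xtea", "-p", "./normal_wes/5.TE_output/", "-n", "8"]
    print(" ".join(absolutize_args(cmd, flags)))
```

The printed command depends on the current working directory, so it should be generated from the directory in which the relative paths were meant to be resolved.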

simoncchu avatar Aug 30 '22 18:08 simoncchu

Thank you so much, simoncchu. I used full paths in my command, and that solved the above problem. However, I now get a new error:

[image]

Blacklist file null does not exist!

[image]

[image]

My command: /home/liukai/biosoft/wes_script/xTea-master/bin/xtea2 -i /home/liukai/postd/wes/data/multiple_20_tumor_normal_name.csv -b /home/liukai/postd/wes/data/multiple_20_tumor_normal_name_bam.csv -x null -p /home/liukai/postd/wes/data/multiple_20_tumor_normal/5.TE_output/ -o /home/liukai/postd/wes/data/multiple_20_tumor_normal/5.TE_output/multiple_TE_output.sh -l /home/liukai/biosoft/wes_script/xTea-master/rep_lib_annotation/ -r ~/db/human_genome_index/hg38/hg38.fa -g /home/liukai/biosoft/wes_script/xTea-master/gencode.v33.hg38.annotation.gff3 --xtea /home/liukai/biosoft/wes_script/xTea-master/xtea/ -q short -f 5907 -y 7 -n 8 -m 25

Finally, I get an empty TE.vcf file.

kyle-lk avatar Aug 31 '22 03:08 kyle-lk

$ python /home/liukai/biosoft/wes_script/xTea-master/xtea/x_TEA_main.py -D -i /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C180245-N/Alu/candidate_list_from_clip.txt --nd 5 --ref /home/liukai/db/human_genome_index/hg38/hg38.fa -a /home/liukai/biosoft/wes_script/xTea-master/rep_lib_annotation/Alu/hg38/hg38_Alu.out -b /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C180245-N/Alu/bam_list.txt -p /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C180245-N/Alu/tmp/ -o /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C180245-N/Alu/candidate_list_from_disc.txt -n 8
Working on "disc" step!
Ave coverage is 24.828: automatic parameters (clip, disc, clip-disc) with value (2, 4 ,1)

Discordant cutoff: 4 is used!!!
Position chr19:9803399 has more than 1 annotation!!!! (this message is repeated many times)
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/liukai/biosoft/anaconda3/envs/gatk4/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/liukai/biosoft/anaconda3/envs/gatk4/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/liukai/biosoft/wes_script/xTea-master/xtea/x_TEI_locator.py", line 564, in unwrap_self_filter_by_discordant_non_barcode
    return TELocator.run_filter_by_discordant_pair_by_chrom_non_barcode(*arg, **kwarg)
  File "/home/liukai/biosoft/wes_script/xTea-master/xtea/x_TEI_locator.py", line 1111, in run_filter_by_discordant_pair_by_chrom_non_barcode
    site_pos + iextend, i_is, f_dev, xannotation)
  File "/home/liukai/biosoft/wes_script/xTea-master/xtea/x_alignments.py", line 99, in cnt_discordant_pairs
    iter_alignmts = bamfile.fetch(chrm, start, end)
  File "pysam/libcalignmentfile.pyx", line 1081, in pysam.libcalignmentfile.AlignmentFile.fetch
  File "pysam/libchtslib.pyx", line 689, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid coordinates: start (19177217) > stop (19177216)
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/liukai/biosoft/wes_script/xTea-master/xtea/x_TEA_main.py", line 472, in <module>
    sf_tmp, sf_raw_disc, b_tumor)
  File "/home/liukai/biosoft/wes_script/xTea-master/xtea/x_TEI_locator.py", line 421, in filter_candidate_sites_by_discordant_pairs_multi_alignmts
    m_sites, iext, i_is, f_dev, sf_annotation, tmp_cutoff)
  File "/home/liukai/biosoft/wes_script/xTea-master/xtea/x_TEI_locator.py", line 1175, in filter_candidate_sites_by_discordant_pairs_non_barcode
    pool.map(unwrap_self_filter_by_discordant_non_barcode, list(zip([self] * len(l_chrm_records), l_chrm_records)), 1)
  File "/home/liukai/biosoft/anaconda3/envs/gatk4/lib/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/liukai/biosoft/anaconda3/envs/gatk4/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
ValueError: invalid coordinates: start (19177217) > stop (19177216)

I ran the failing step on its own and got the error report above.
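
The traceback above ends in pysam rejecting a fetch window whose start is larger than its stop. As a rough illustration of how a caller can guard against that, here is a sketch of a window check. `safe_region` is a hypothetical helper, not xTea code, and it does not explain why the window was inverted in the first place.

```python
def safe_region(start, end, chrom_len=None):
    """Clamp a fetch window; return None when it is empty or inverted.

    Hypothetical helper for illustration only. Coordinates are treated as
    0-based and half-open, the convention used by pysam's fetch().
    """
    start = max(0, start)
    if chrom_len is not None:
        end = min(end, chrom_len)
    if start >= end:
        return None
    return start, end


if __name__ == "__main__":
    # The coordinates reported in the ValueError of this thread.
    print(safe_region(19177217, 19177216))
```

A caller could skip the `bamfile.fetch(chrm, start, end)` call whenever the helper returns `None`. That only avoids the crash; the inverted window would still need to be traced back to its cause.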

kyle-lk avatar Aug 31 '22 04:08 kyle-lk

It's unclear to me which step triggered the error from what you posted here. Could you post the output files with size?

simoncchu avatar Aug 31 '22 15:08 simoncchu

.
├── [ 23] Alu
│   ├── [ 111] bam_list1.txt
│   ├── [ 110] bam_list.txt
│   ├── [ 0] candidate_disc_filtered_cns2.txt
│   ├── [ 0] candidate_disc_filtered_cns2.txt.high_confident
│   ├── [ 0] candidate_disc_filtered_cns_post_filtering.txt
│   ├── [ 0] candidate_disc_filtered_cns.txt
│   ├── [ 0] candidate_disc_filtered_cns.txt.before_calling_transduction
│   ├── [ 0] candidate_disc_filtered_cns.txt.before_calling_transduction.sites_cov
│   ├── [ 0] candidate_disc_filtered_cns.txt.before_filtering
│   ├── [ 0] candidate_disc_filtered_cns.txt.gntp.features
│   ├── [ 0] candidate_disc_filtered_cns.txt.gntp.features0.out
│   ├── [ 0] candidate_disc_filtered_cns.txt.high_confident
│   ├── [ 459] candidate_disc_filtered_cns.txt.high_confident.post_filtering_with_gene.txt.arff
│   ├── [ 0] candidate_disc_filtered_cns.txt.igv_sites
│   ├── [ 8.7M] candidate_list_from_clip.txt
│   ├── [ 143M] candidate_list_from_clip.txt_tmp
│   ├── [ 0] candidate_list_from_disc.txt
│   ├── [ 0] candidate_list_from_disc.txt.clip_sites_raw_disc.txt
│   ├── [ 4.1K] run_xTEA_pipeline.sh
│   ├── [ 9] sample_id.txt
│   └── [ 15] tmp
│       ├── [ 122] basic_cov_is_rlth_info.txt
│       ├── [ 428K] C150021-N.sorted.MarkDuplicates.BQSR.sorted.bam.clipped.sam_sq.txt
│       ├── [ 0] C150021-N.sorted.MarkDuplicates.BQSR.sorted.bam.clipped.sam.std_out
│       ├── [ 3] clip
│       │   └── [ 26] 0
│       │       ├── [ 23K] chr10.clip_realign_pos
│       │       ├── [ 32K] chr11.clip_realign_pos
│       │       ├── [ 31K] chr12.clip_realign_pos
│       │       ├── [ 12K] chr13.clip_realign_pos
│       │       ├── [ 18K] chr14.clip_realign_pos
│       │       ├── [ 17K] chr15.clip_realign_pos
│       │       ├── [ 22K] chr16.clip_realign_pos
│       │       ├── [ 29K] chr17.clip_realign_pos
│       │       ├── [ 9.2K] chr18.clip_realign_pos
│       │       ├── [ 40K] chr19.clip_realign_pos
│       │       ├── [ 58K] chr1.clip_realign_pos
│       │       ├── [ 14K] chr20.clip_realign_pos
│       │       ├── [ 5.8K] chr21.clip_realign_pos
│       │       ├── [ 11K] chr22.clip_realign_pos
│       │       ├── [ 42K] chr2.clip_realign_pos
│       │       ├── [ 33K] chr3.clip_realign_pos
│       │       ├── [ 23K] chr4.clip_realign_pos
│       │       ├── [ 27K] chr5.clip_realign_pos
│       │       ├── [ 26K] chr6.clip_realign_pos
│       │       ├── [ 29K] chr7.clip_realign_pos
│       │       ├── [ 19K] chr8.clip_realign_pos
│       │       ├── [ 23K] chr9.clip_realign_pos
│       │       ├── [ 19K] chrX.clip_realign_pos
│       │       └── [ 799] chrY.clip_realign_pos
│       ├── [ 1.2M] clip_peak_candidate.list
│       ├── [ 137M] clip_reads_tmp0
│       ├── [ 8] cns
│       │   ├── [ 0] all_disc_pos.disc_pos
│       │   ├── [ 2] asm
│       │   ├── [ 146] basic_cov_is_rlth_info.txt
│       │   ├── [ 829] filtering_log.txt
│       │   ├── [ 1.3K] temp_clip.sam
│       │   └── [ 1.3K] temp_disc.sam
│       ├── [ 74] disc
│       │   ├── [ 51K] chr10.discord_pos.txt
│       │   ├── [ 21K] chr10.discord_pos.txt.discdt
│       │   ├── [ 51K] chr10.discord_pos.txt.raw.discdt
│       │   ├── [ 70K] chr11.discord_pos.txt
│       │   ├── [ 29K] chr11.discord_pos.txt.discdt
│       │   ├── [ 69K] chr11.discord_pos.txt.raw.discdt
│       │   ├── [ 67K] chr12.discord_pos.txt
│       │   ├── [ 28K] chr12.discord_pos.txt.discdt
│       │   ├── [ 67K] chr12.discord_pos.txt.raw.discdt
│       │   ├── [ 26K] chr13.discord_pos.txt
│       │   ├── [ 11K] chr13.discord_pos.txt.discdt
│       │   ├── [ 26K] chr13.discord_pos.txt.raw.discdt
│       │   ├── [ 39K] chr14.discord_pos.txt
│       │   ├── [ 16K] chr14.discord_pos.txt.discdt
│       │   ├── [ 39K] chr14.discord_pos.txt.raw.discdt
│       │   ├── [ 36K] chr15.discord_pos.txt
│       │   ├── [ 15K] chr15.discord_pos.txt.discdt
│       │   ├── [ 37K] chr15.discord_pos.txt.raw.discdt
│       │   ├── [ 49K] chr16.discord_pos.txt
│       │   ├── [ 20K] chr16.discord_pos.txt.discdt
│       │   ├── [ 49K] chr16.discord_pos.txt.raw.discdt
│       │   ├── [ 64K] chr17.discord_pos.txt
│       │   ├── [ 26K] chr17.discord_pos.txt.discdt
│       │   ├── [ 64K] chr17.discord_pos.txt.raw.discdt
│       │   ├── [ 20K] chr18.discord_pos.txt
│       │   ├── [ 8.3K] chr18.discord_pos.txt.discdt
│       │   ├── [ 20K] chr18.discord_pos.txt.raw.discdt
│       │   ├── [ 85K] chr19.discord_pos.txt
│       │   ├── [ 34K] chr19.discord_pos.txt.discdt
│       │   ├── [ 84K] chr19.discord_pos.txt.raw.discdt
│       │   ├── [ 119K] chr1.discord_pos.txt
│       │   ├── [ 52K] chr1.discord_pos.txt.discdt
│       │   ├── [ 123K] chr1.discord_pos.txt.raw.discdt
│       │   ├── [ 31K] chr20.discord_pos.txt
│       │   ├── [ 13K] chr20.discord_pos.txt.discdt
│       │   ├── [ 31K] chr20.discord_pos.txt.raw.discdt
│       │   ├── [ 12K] chr21.discord_pos.txt
│       │   ├── [ 5.1K] chr21.discord_pos.txt.discdt
│       │   ├── [ 12K] chr21.discord_pos.txt.raw.discdt
│       │   ├── [ 25K] chr22.discord_pos.txt
│       │   ├── [ 10K] chr22.discord_pos.txt.discdt
│       │   ├── [ 25K] chr22.discord_pos.txt.raw.discdt
│       │   ├── [ 89K] chr2.discord_pos.txt
│       │   ├── [ 39K] chr2.discord_pos.txt.discdt
│       │   ├── [ 92K] chr2.discord_pos.txt.raw.discdt
│       │   ├── [ 70K] chr3.discord_pos.txt
│       │   ├── [ 30K] chr3.discord_pos.txt.discdt
│       │   ├── [ 72K] chr3.discord_pos.txt.raw.discdt
│       │   ├── [ 49K] chr4.discord_pos.txt
│       │   ├── [ 21K] chr4.discord_pos.txt.discdt
│       │   ├── [ 50K] chr4.discord_pos.txt.raw.discdt
│       │   ├── [ 54K] chr5.discord_pos.txt
│       │   ├── [ 24K] chr5.discord_pos.txt.discdt
│       │   ├── [ 56K] chr5.discord_pos.txt.raw.discdt
│       │   ├── [ 54K] chr6.discord_pos.txt
│       │   ├── [ 23K] chr6.discord_pos.txt.discdt
│       │   ├── [ 56K] chr6.discord_pos.txt.raw.discdt
│       │   ├── [ 57K] chr7.discord_pos.txt
│       │   ├── [ 25K] chr7.discord_pos.txt.discdt
│       │   ├── [ 59K] chr7.discord_pos.txt.raw.discdt
│       │   ├── [ 40K] chr8.discord_pos.txt
│       │   ├── [ 17K] chr8.discord_pos.txt.discdt
│       │   ├── [ 42K] chr8.discord_pos.txt.raw.discdt
│       │   ├── [ 48K] chr9.discord_pos.txt
│       │   ├── [ 21K] chr9.discord_pos.txt.discdt
│       │   ├── [ 50K] chr9.discord_pos.txt.raw.discdt
│       │   ├── [ 41K] chrX.discord_pos.txt
│       │   ├── [ 18K] chrX.discord_pos.txt.discdt
│       │   ├── [ 42K] chrX.discord_pos.txt.raw.discdt
│       │   ├── [ 1.5K] chrY.discord_pos.txt
│       │   ├── [ 662] chrY.discord_pos.txt.discdt
│       │   └── [ 1.6K] chrY.discord_pos.txt.raw.discdt
│       ├── [ 218] discordant_reads_tmp0
│       ├── [ 0] disc_tmp.list
│       ├── [ 3] igv
│       │   └── [ 0] bamsnap_screenshot.txt
│       ├── [ 1.4M] raw_discordant_reads_tmp0
│       └── [ 3] transduction
│           └── [ 121] basic_cov_is_rlth_info.txt
├── [ 934] Alu.config
├── [ 10] L1
│   ├── [ 111] bam_list1.txt
│   ├── [ 110] bam_list.txt
│   ├── [ 459] candidate_disc_filtered_cns.txt.high_confident.post_filtering_with_gene.txt.arff
│   ├── [ 7.9M] candidate_list_from_clip.txt
│   ├── [ 143M] candidate_list_from_clip.txt_tmp
│   ├── [ 4.2K] run_xTEA_pipeline.sh
│   ├── [ 9] sample_id.txt
│   └── [ 12] tmp
│       ├── [ 114] basic_cov_is_rlth_info.txt
│       ├── [ 14K] C150021-N.sorted.MarkDuplicates.BQSR.sorted.bam.clipped.sam_sq.txt
│       ├── [ 0] C150021-N.sorted.MarkDuplicates.BQSR.sorted.bam.clipped.sam.std_out
│       ├── [ 3] clip
│       │   └── [ 51] 0
│       │       ├── [ 29M] C150021-N.sorted.MarkDuplicates.BQSR.sorted.bam.clipped.fq
│       │       ├── [ 3.4M] chr10.clip_pos
│       │       ├── [ 21K] chr10.clip_realign_pos
│       │       ├── [ 5.1M] chr11.clip_pos
│       │       ├── [ 29K] chr11.clip_realign_pos
│       │       ├── [ 4.7M] chr12.clip_pos
│       │       ├── [ 28K] chr12.clip_realign_pos
│       │       ├── [ 1.5M] chr13.clip_pos
│       │       ├── [ 11K] chr13.clip_realign_pos
│       │       ├── [ 2.7M] chr14.clip_pos
│       │       ├── [ 16K] chr14.clip_realign_pos
│       │       ├── [ 2.7M] chr15.clip_pos
│       │       ├── [ 15K] chr15.clip_realign_pos
│       │       ├── [ 3.4M] chr16.clip_pos
│       │       ├── [ 20K] chr16.clip_realign_pos
│       │       ├── [ 4.7M] chr17.clip_pos
│       │       ├── [ 26K] chr17.clip_realign_pos
│       │       ├── [ 1.3M] chr18.clip_pos
│       │       ├── [ 8.4K] chr18.clip_realign_pos
│       │       ├── [ 5.5M] chr19.clip_pos
│       │       ├── [ 34K] chr19.clip_realign_pos
│       │       ├── [ 8.9M] chr1.clip_pos
│       │       ├── [ 51K] chr1.clip_realign_pos
│       │       ├── [ 2.2M] chr20.clip_pos
│       │       ├── [ 13K] chr20.clip_realign_pos
│       │       ├── [ 819K] chr21.clip_pos
│       │       ├── [ 5.3K] chr21.clip_realign_pos
│       │       ├── [ 1.9M] chr22.clip_pos
│       │       ├── [ 10K] chr22.clip_realign_pos
│       │       ├── [ 6.4M] chr2.clip_pos
│       │       ├── [ 38K] chr2.clip_realign_pos
│       │       ├── [ 5.2M] chr3.clip_pos
│       │       ├── [ 30K] chr3.clip_realign_pos
│       │       ├── [ 3.3M] chr4.clip_pos
│       │       ├── [ 21K] chr4.clip_realign_pos
│       │       ├── [ 3.9M] chr5.clip_pos
│       │       ├── [ 24K] chr5.clip_realign_pos
│       │       ├── [ 3.7M] chr6.clip_pos
│       │       ├── [ 24K] chr6.clip_realign_pos
│       │       ├── [ 4.0M] chr7.clip_pos
│       │       ├── [ 27K] chr7.clip_realign_pos
│       │       ├── [ 2.8M] chr8.clip_pos
│       │       ├── [ 18K] chr8.clip_realign_pos
│       │       ├── [ 3.6M] chr9.clip_pos
│       │       ├── [ 21K] chr9.clip_realign_pos
│       │       ├── [ 2.9M] chrX.clip_pos
│       │       ├── [ 18K] chrX.clip_realign_pos
│       │       ├── [ 130K] chrY.clip_pos
│       │       └── [ 688] chrY.clip_realign_pos
│       ├── [ 1.1M] clip_peak_candidate.list
│       ├── [ 137M] clip_reads_tmp0
│       ├── [ 7] cns
│       │   ├── [ 2] asm
│       │   ├── [ 113] basic_cov_is_rlth_info.txt
│       │   ├── [ 0] candidate_sites_all_clip.fq
│       │   ├── [ 0] candidate_sites_all_disc.fa
│       │   └── [ 64] filtering_log.txt
│       ├── [ 26] disc
│       │   ├── [ 47K] chr10.discord_pos.txt
│       │   ├── [ 64K] chr11.discord_pos.txt
│       │   ├── [ 62K] chr12.discord_pos.txt
│       │   ├── [ 24K] chr13.discord_pos.txt
│       │   ├── [ 36K] chr14.discord_pos.txt
│       │   ├── [ 34K] chr15.discord_pos.txt
│       │   ├── [ 45K] chr16.discord_pos.txt
│       │   ├── [ 59K] chr17.discord_pos.txt
│       │   ├── [ 19K] chr18.discord_pos.txt
│       │   ├── [ 76K] chr19.discord_pos.txt
│       │   ├── [ 108K] chr1.discord_pos.txt
│       │   ├── [ 29K] chr20.discord_pos.txt
│       │   ├── [ 12K] chr21.discord_pos.txt
│       │   ├── [ 23K] chr22.discord_pos.txt
│       │   ├── [ 81K] chr2.discord_pos.txt
│       │   ├── [ 65K] chr3.discord_pos.txt
│       │   ├── [ 45K] chr4.discord_pos.txt
│       │   ├── [ 50K] chr5.discord_pos.txt
│       │   ├── [ 50K] chr6.discord_pos.txt
│       │   ├── [ 53K] chr7.discord_pos.txt
│       │   ├── [ 37K] chr8.discord_pos.txt
│       │   ├── [ 45K] chr9.discord_pos.txt
│       │   ├── [ 39K] chrX.discord_pos.txt
│       │   └── [ 1.3K] chrY.discord_pos.txt
│       ├── [ 2] igv
│       └── [ 3] transduction
│           └── [ 120] basic_cov_is_rlth_info.txt
├── [ 1.0K] L1.config
├── [ 3] pub_clip
│   └── [ 27] 0
│       ├── [ 141] C150021-N.sorted.MarkDuplicates.BQSR.sorted.bam.clipped.fq -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/C150021-N.sorted.MarkDuplicates.BQSR.sorted.bam.clipped.fq
│       ├── [ 97] chr10.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr10.clip_pos
│       ├── [ 97] chr11.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr11.clip_pos
│       ├── [ 97] chr12.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr12.clip_pos
│       ├── [ 97] chr13.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr13.clip_pos
│       ├── [ 97] chr14.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr14.clip_pos
│       ├── [ 97] chr15.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr15.clip_pos
│       ├── [ 97] chr16.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr16.clip_pos
│       ├── [ 97] chr17.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr17.clip_pos
│       ├── [ 97] chr18.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr18.clip_pos
│       ├── [ 97] chr19.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr19.clip_pos
│       ├── [ 96] chr1.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr1.clip_pos
│       ├── [ 97] chr20.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr20.clip_pos
│       ├── [ 97] chr21.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr21.clip_pos
│       ├── [ 97] chr22.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr22.clip_pos
│       ├── [ 96] chr2.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr2.clip_pos
│       ├── [ 96] chr3.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr3.clip_pos
│       ├── [ 96] chr4.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr4.clip_pos
│       ├── [ 96] chr5.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr5.clip_pos
│       ├── [ 96] chr6.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr6.clip_pos
│       ├── [ 96] chr7.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr7.clip_pos
│       ├── [ 96] chr8.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr8.clip_pos
│       ├── [ 96] chr9.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chr9.clip_pos
│       ├── [ 96] chrX.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chrX.clip_pos
│       └── [ 96] chrY.clip_pos -> /home/liukai/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/L1/tmp/clip/0/chrY.clip_pos
├── [ 10] SVA
│   ├── [ 111] bam_list1.txt
│   ├── [ 110] bam_list.txt
│   ├── [ 459] candidate_disc_filtered_cns.txt.high_confident.post_filtering_with_gene.txt.arff
│   ├── [ 9.5M] candidate_list_from_clip.txt
│   ├── [ 143M] candidate_list_from_clip.txt_tmp
│   ├── [ 4.2K] run_xTEA_pipeline.sh
│   ├── [ 9] sample_id.txt
│   └── [ 12] tmp
│       ├── [ 113] basic_cov_is_rlth_info.txt
│       ├── [ 121K] C150021-N.sorted.MarkDuplicates.BQSR.sorted.bam.clipped.sam_sq.txt
│       ├── [ 0] C150021-N.sorted.MarkDuplicates.BQSR.sorted.bam.clipped.sam.std_out
│       ├── [ 3] clip
│       │   └── [ 26] 0
│       │       ├── [ 25K] chr10.clip_realign_pos
│       │       ├── [ 36K] chr11.clip_realign_pos
│       │       ├── [ 34K] chr12.clip_realign_pos
│       │       ├── [ 13K] chr13.clip_realign_pos
│       │       ├── [ 20K] chr14.clip_realign_pos
│       │       ├── [ 18K] chr15.clip_realign_pos
│       │       ├── [ 25K] chr16.clip_realign_pos
│       │       ├── [ 33K] chr17.clip_realign_pos
│       │       ├── [ 10K] chr18.clip_realign_pos
│       │       ├── [ 46K] chr19.clip_realign_pos
│       │       ├── [ 64K] chr1.clip_realign_pos
│       │       ├── [ 15K] chr20.clip_realign_pos
│       │       ├── [ 6.3K] chr21.clip_realign_pos
│       │       ├── [ 13K] chr22.clip_realign_pos
│       │       ├── [ 46K] chr2.clip_realign_pos
│       │       ├── [ 36K] chr3.clip_realign_pos
│       │       ├── [ 26K] chr4.clip_realign_pos
│       │       ├── [ 29K] chr5.clip_realign_pos
│       │       ├── [ 28K] chr6.clip_realign_pos
│       │       ├── [ 33K] chr7.clip_realign_pos
│       │       ├── [ 21K] chr8.clip_realign_pos
│       │       ├── [ 26K] chr9.clip_realign_pos
│       │       ├── [ 21K] chrX.clip_realign_pos
│       │       └── [ 883] chrY.clip_realign_pos
│       ├── [ 1.3M] clip_peak_candidate.list
│       ├── [ 137M] clip_reads_tmp0
│       ├── [ 7] cns
│       │   ├── [ 2] asm
│       │   ├── [ 113] basic_cov_is_rlth_info.txt
│       │   ├── [ 0] candidate_sites_all_clip.fq
│       │   ├── [ 0] candidate_sites_all_disc.fa
│       │   └── [ 64] filtering_log.txt
│       ├── [ 26] disc
│       │   ├── [ 55K] chr10.discord_pos.txt
│       │   ├── [ 76K] chr11.discord_pos.txt
│       │   ├── [ 73K] chr12.discord_pos.txt
│       │   ├── [ 28K] chr13.discord_pos.txt
│       │   ├── [ 42K] chr14.discord_pos.txt
│       │   ├── [ 40K] chr15.discord_pos.txt
│       │   ├── [ 54K] chr16.discord_pos.txt
│       │   ├── [ 70K] chr17.discord_pos.txt
│       │   ├── [ 22K] chr18.discord_pos.txt
│       │   ├── [ 93K] chr19.discord_pos.txt
│       │   ├── [ 130K] chr1.discord_pos.txt
│       │   ├── [ 34K] chr20.discord_pos.txt
│       │   ├── [ 13K] chr21.discord_pos.txt
│       │   ├── [ 27K] chr22.discord_pos.txt
│       │   ├── [ 96K] chr2.discord_pos.txt
│       │   ├── [ 75K] chr3.discord_pos.txt
│       │   ├── [ 53K] chr4.discord_pos.txt
│       │   ├── [ 58K] chr5.discord_pos.txt
│       │   ├── [ 58K] chr6.discord_pos.txt
│       │   ├── [ 62K] chr7.discord_pos.txt
│       │   ├── [ 44K] chr8.discord_pos.txt
│       │   ├── [ 53K] chr9.discord_pos.txt
│       │   ├── [ 45K] chrX.discord_pos.txt
│       │   └── [ 1.6K] chrY.discord_pos.txt
│       ├── [ 2] igv
│       └── [ 3] transduction
│           └── [ 113] basic_cov_is_rlth_info.txt
└── [ 1022] SVA.config

These are the output files for one sample.
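
In a listing like the one above, the zero-byte files are the quickest hint at where a pipeline stopped producing output. The following sketch walks an output directory and reports them. `empty_files` is a hypothetical helper written for this thread and is not shipped with xTea.

```python
import os
import sys


def empty_files(root):
    """Return sorted paths (relative to root) of regular files that are 0 bytes."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # isfile() follows symlinks and is False for broken links.
            if os.path.isfile(full) and os.path.getsize(full) == 0:
                hits.append(os.path.relpath(full, root))
    return sorted(hits)


if __name__ == "__main__":
    # Usage: python find_empty.py /path/to/5.TE_output/SAMPLE
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for rel in empty_files(root):
        print(rel)
```

An empty file is not always an error (some logs are legitimately empty), so the result is a starting point for inspection rather than a verdict.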

kyle-lk avatar Aug 31 '22 16:08 kyle-lk

The error was triggered at the second step. Could you post the content of this file: run_xTEA_pipeline.sh?

simoncchu avatar Aug 31 '22 22:08 simoncchu

I just noticed you have already run that step. I'm not sure what triggers this error: ValueError: invalid coordinates: start (19177217) > stop (19177216). Is this WES? Could you test on HG002 (WGS) and see whether it runs through?

simoncchu avatar Aug 31 '22 22:08 simoncchu

Yes, it's a WES dataset.

run_xTea_pipeline.sh in directory: Alu.

PREFIX=/home/liukxx/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/Alu/
############
############
ANNOTATION=/home/liukxx/biosoft/wes_script/xTea-master/rep_lib_annotation/Alu/hg38/hg38_Alu.out
ANNOTATION1=/home/liukxx/biosoft/wes_script/xTea-master/rep_lib_annotation/Alu/hg38/hg38_Alu.out
REF=/home/liukxx/db/human_genome_index/hg38/hg38.fa
GENE=/home/liukxx/biosoft/wes_script/xTea-master/gencode.v33.hg38.annotation.gff3
BLACK_LIST=/home/liukxx/db/human_genome_index/hg38/S33266340_Regions.bed
L1_COPY_WITH_FLANK=/home/liukxx/biosoft/wes_script/xTea-master/rep_lib_annotation/Alu/hg38/hg38_AluJabc_copies_with_flank.fa
SF_FLANK=null
L1_CNS=/home/liukxx/biosoft/wes_script/xTea-master/rep_lib_annotation/consensus/ALU.fa
XTEA_PATH=/home/liukxx/biosoft/wes_script/xTea-master/xtea/
BAM_LIST=${PREFIX}"bam_list.txt"
BAM1=${PREFIX}"10X_phased_possorted_bam.bam"
BARCODE_BAM=${PREFIX}"10X_barcode_indexed.sorted.bam"
TMP=${PREFIX}"tmp/"
TMP_CLIP=${PREFIX}"tmp/clip/"
TMP_CNS=${PREFIX}"tmp/cns/"
TMP_TNSD=${PREFIX}"tmp/transduction/"
############
############
python ${XTEA_PATH}"x_TEA_main.py" -C -i ${BAM_LIST} --lc 3 --rc 3 --cr 1 -r ${L1_COPY_WITH_FLANK} -a ${ANNOTATION} --cns ${L1_CNS} --ref ${REF} -p ${TMP} -o ${PREFIX}"candidate_list_from_clip.txt" -n 8 --cp /home/liukxx/postd/wes/data/HR_48sample_normal/5.TE_output/C150021-N/pub_clip/
python ${XTEA_PATH}"x_TEA_main.py" -D -i ${PREFIX}"candidate_list_from_clip.txt" --nd 5 --ref ${REF} -a ${ANNOTATION} -b ${BAM_LIST} -p ${TMP} -o ${PREFIX}"candidate_list_from_disc.txt" -n 8
python ${XTEA_PATH}"x_TEA_main.py" -N --cr 3 --nd 5 -b ${BAM_LIST} -p ${TMP_CNS} --fflank ${SF_FLANK} --flklen 3000 -n 8 -i ${PREFIX}"candidate_list_from_disc.txt" -r ${L1_CNS} --ref ${REF} -a ${ANNOTATION} -o ${PREFIX}"candidate_disc_filtered_cns.txt"
python ${XTEA_PATH}"x_TEA_main.py" --transduction --cr 3 --nd 5 -b ${BAM_LIST} -p ${TMP_TNSD} --fflank ${SF_FLANK} --flklen 3000 -n 8 -i ${PREFIX}"candidate_disc_filtered_cns.txt" -r ${L1_CNS} --ref ${REF} --input2 ${PREFIX}"candidate_list_from_disc.txt.clip_sites_raw_disc.txt" --rtype 2 -a ${ANNOTATION1} -o ${PREFIX}"candidate_disc_filtered_cns2.txt"
python ${XTEA_PATH}"x_TEA_main.py" --sibling --cr 3 --nd 5 -b ${BAM_LIST} -p ${TMP_TNSD} --fflank ${SF_FLANK} --flklen 3000 -n 8 -i ${PREFIX}"candidate_disc_filtered_cns2.txt" -r ${L1_CNS} --ref ${REF} --input2 ${PREFIX}"candidate_list_from_disc.txt.clip_sites_raw_disc.txt" --rtype 2 -a ${ANNOTATION1} --blacklist ${BLACK_LIST} -o ${PREFIX}"candidate_sibling_transduction2.txt"
python ${XTEA_PATH}"x_TEA_main.py" --postF --rtype 2 -p ${TMP_CNS} -n 8 -i ${PREFIX}"candidate_disc_filtered_cns2.txt" -a ${ANNOTATION1} -o ${PREFIX}"candidate_disc_filtered_cns_post_filtering.txt"
python ${XTEA_PATH}"x_TEA_main.py" --postF --rtype 2 -p ${TMP_CNS} -n 8 -i ${PREFIX}"candidate_disc_filtered_cns2.txt.high_confident" -a ${ANNOTATION1} --blacklist ${BLACK_LIST} -o ${PREFIX}"candidate_disc_filtered_cns.txt.high_confident.post_filtering.txt"
python ${XTEA_PATH}"x_TEA_main.py" --gene -a ${GENE} -i ${PREFIX}"candidate_disc_filtered_cns.txt.high_confident.post_filtering.txt" -n 8 -o ${PREFIX}"candidate_disc_filtered_cns.txt.high_confident.post_filtering_with_gene.txt"
python ${XTEA_PATH}"x_TEA_main.py" --gntp_classify -i ${PREFIX}"candidate_disc_filtered_cns.txt.high_confident.post_filtering_with_gene.txt" -n 1 --model ${XTEA_PATH}"genotyping/DF21_model_1_2" -o ${PREFIX}"candidate_disc_filtered_cns.txt.high_confident.post_filtering_with_gene_gntp.txt"
python ${XTEA_PATH}"x_TEA_main.py" --gVCF -i ${PREFIX}"candidate_disc_filtered_cns.txt.high_confident.post_filtering_with_gene_gntp.txt" -o ${PREFIX} -b ${BAM_LIST} --ref ${REF} --rtype 2
python ${XTEA_PATH}"x_TEA_main.py" --igv --single_sample -p ${PREFIX}"tmp/igv" -b ${PREFIX}"bam_list1.txt" -i ${PREFIX}"candidate_disc_filtered_cns.txt" --ref ${REF} -e 1000 -n 8 -o ${PREFIX}"tmp/igv/bamsnap_screenshot.txt"
python ${XTEA_PATH}"x_TEA_main.py" --igv --single_sample -p ${PREFIX}"tmp/igv" -b ${PREFIX}"bam_list1.txt" -i ${PREFIX}"candidate_disc_filtered_cns.txt.high_confident.post_filtering.txt" --ref ${REF} -e 1000 -n 8 -o ${PREFIX}"tmp/igv/bamsnap_screenshot_hc.txt"

kyle-lk avatar Aug 31 '22 23:08 kyle-lk

I remember a similar issue was reported on WES before. xTea is mainly developed for WGS, and is not tested that much on WES. I will test more and see how to solve this.

simoncchu avatar Sep 01 '22 12:09 simoncchu

any updates on this issue?

GGBond178 avatar Sep 27 '23 09:09 GGBond178

It was an error in my data preprocessing that caused this issue. I had only used the 5'-end of the paired-end sequencing FASTQ files. After fixing that, it now runs smoothly.
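
One reading of this fix is that only one mate of each read pair had been aligned, which would leave the BAM without paired reads for the discordant-pair step. That interpretation is an assumption. Under it, a quick sanity check is to look at the SAM FLAG field, whose 0x1 bit marks a read as paired. The sketch below works on SAM text lines; `paired_fraction` is a hypothetical helper, not part of xTea.

```python
def paired_fraction(sam_lines):
    """Fraction of alignment records whose FLAG has the 0x1 (paired) bit set."""
    total = 0
    paired = 0
    for line in sam_lines:
        # Skip blank lines and header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        total += 1
        if flag & 0x1:
            paired += 1
    return paired / total if total else 0.0


if __name__ == "__main__":
    # Two made-up records: one paired (FLAG 99) and one unpaired (FLAG 0).
    demo = [
        "@HD\tVN:1.6",
        "r1\t99\tchr1\t100\t60\t10M\t=\t300\t210\tACGTACGTAC\tIIIIIIIIII",
        "r2\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    ]
    print(paired_fraction(demo))
```

For a real BAM, the lines could come from `samtools view`, and `samtools flagstat` reports the same kind of summary directly. A fraction near zero on data that should be paired-end suggests that one FASTQ file was dropped before alignment.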

kyle-lk avatar Oct 05 '23 03:10 kyle-lk