
Error about Transcript * appears in the reference but did not appear in the BAM

Open YIGUIz opened this issue 1 year ago • 0 comments

Hi, I hope you're well. Here is my question:

[Bulk mode] Error: Transcript * appears in the reference but did not appear in the BAM

I want to quantify expression from ONT data in alignment-based mode. The command:

    singularity exec ${code_path}/singularity_images/salmon:1.10.3--h6dccd9a_2 salmon quant \
        --ont -p 16 -t ${ref_trans_fa} -l U -a ${LR_bam} -o ${output_tmp1}

I tried several different transcripts.fa files, but salmon still reports "Transcript * appears in the reference but did not appear in the BAM".

  1. First, I used the transcripts.fa provided by NCBI: GCF_002263795.3_ARS-UCD2.0_genomic.fna

  2. Second, I used gffread to obtain the transcripts.fa, but it failed with "Error: no valid ID found for GFF record". So I filtered the GTF file (version 2.2) with a shell command, as you recommended. The commands:

     singularity exec /public/home/b20223040336/Workspace/long_read_rna/02code/singularity_images/gffread:0.12.7--hdcf5f25_4 gffread -w GCF_002263795.3_ARS-UCD2.0_transcripts.fa -g GCF_002263795.3_ARS-UCD2.0_genomic.fna -w GCF_002263795.3_ARS-UCD2.0_genomic.gtf

     grep -P '\btranscript_id\s+"[^"]+"' GCF_002263795.3_ARS-UCD2.0_genomic.gtf > GCF_002263795.3_ARS-UCD2.0_genomic_fixed.gtf

     singularity exec /public/home/b20223040336/Workspace/long_read_rna/02code/singularity_images/gffread:0.12.7--hdcf5f25_4 gffread GCF_002263795.3_ARS-UCD2.0_genomic_fixed.gtf -g GCF_002263795.3_ARS-UCD2.0_genomic.fna -w GCF_002263795.3_ARS-UCD2.0_transcripts_gtf.fa

  3. Finally, I used the GFF3 file provided by NCBI to obtain the transcripts.fa. The command: GCF_002263795.3_ARS-UCD2.0_genomic.gff -g GCF_002263795.3_ARS-UCD2.0_genomic.fna -w GCF_002263795.3_ARS-UCD2.0_transcripts_gff.fa
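
One check that might help narrow this down is to compare the sequence names in the transcripts.fa with the reference names in the BAM header. The sketch below is not part of salmon. `missing_in_bam` is a hypothetical helper name, and it assumes the BAM header has already been dumped to a text file (for example with `samtools view -H`, assuming samtools is available). It prints the FASTA names that have no matching `@SQ` entry.

```shell
#!/bin/sh
# Hypothetical helper (not a salmon or samtools command):
# print FASTA sequence names that have no @SQ SN: entry in a SAM header.
#   $1 = transcripts FASTA, $2 = SAM header saved as plain text
missing_in_bam() {
    fasta=$1
    header=$2
    tmpdir=$(mktemp -d) || return 1

    # FASTA name = text after '>' up to the first whitespace.
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$fasta" \
        | LC_ALL=C sort -u > "$tmpdir/fasta.txt"

    # BAM reference name = value of the SN: field on each @SQ line.
    awk -F'\t' '$1 == "@SQ" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SN:/) print substr($i, 4)
        }' "$header" | LC_ALL=C sort -u > "$tmpdir/bam.txt"

    # Names present only in the FASTA list.
    LC_ALL=C comm -23 "$tmpdir/fasta.txt" "$tmpdir/bam.txt"
    rm -rf "$tmpdir"
}

# Example usage (file names are placeholders from the command above):
#   samtools view -H "${LR_bam}" > header.sam
#   missing_in_bam "${ref_trans_fa}" header.sam | head
```

If nearly every FASTA name is printed, the BAM and the transcripts.fa probably use different identifiers. If only a few are printed, the names match and the remaining transcripts are simply absent from the BAM header.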

YIGUIz, Jul 18 '24 13:07