lyve-SET
Getting Assembly metrics
Hi Iskatz, I need to compute assembly metrics, including coverage. My input data files are single-end fastq and the genome assembly fasta. In the documentation, paired-end reads must be shuffled, so I did not shuffle my data. In the command below (genome is the size of the genome)
lyveset_1.1.4f.sif run_assembly_readMetrics.pl se_read.fq.gz -e ' + genome.astype(str) + ' > se_read'_readMetrics.txt'
the coverage value isn't computed: it came out as a dot (.) for all my samples, and I got a (yes) for the avg-quality.
File avgReadLength totalBases minReadLength maxReadLength avgQuality numReads PE? coverage readScore medianFragmentLength
se_read_readMetrics.txt 3 3396752 3396752 1 6125697 38 yes 1.00 1 .
I added the flag --singleend as you recommended, but the command failed:
'lyveset_1.1.4f.sif run_assembly_readMetrics.pl --singleend se_read.fq.gz -e ' + genome.astype(str) + ' > se_read'_readMetrics.txt'
Is there anything I should add to troubleshoot this? Did I use the wrong script with --singleend? (The quotes ' ' are there because the code is run in Python through a Singularity container.) Thanks, TJ
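The string concatenation in the commands above is easy to get wrong, so here is a minimal sketch of building the same call as an argument list in Python. It assumes the container is started with singularity exec, which the thread does not show; build_read_metrics_cmd and run_to_file are hypothetical helper names, and the genome size in the usage note is a placeholder. Only the script name and the -e option are taken from the post.

```python
import subprocess


def build_read_metrics_cmd(image, reads, genome_size):
    """Return the argument list for one read-metrics run.

    Assumption: the image is launched with `singularity exec`.
    The script name and the -e option come from the post above.
    """
    return [
        "singularity", "exec", image,
        "run_assembly_readMetrics.pl",
        reads,
        "-e", str(int(genome_size)),  # int() also accepts numpy integers
    ]


def run_to_file(argv, out_path):
    """Run the command and write its stdout to out_path (replaces '>')."""
    with open(out_path, "w") as out:
        subprocess.run(argv, stdout=out, check=True)
```

For example, run_to_file(build_read_metrics_cmd("lyveset_1.1.4f.sif", "se_read.fq.gz", 5000000), "se_read_readMetrics.txt") would avoid shell quoting entirely, because no shell is involved.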
Hi TJ, you are correct that --singleend is not a parameter for this. Are you saying that the columns are mismatched? Do you see it line up better if you run column -t se_read_readMetrics.txt? And then I guess one more issue is that coverage is not computed. You are correct; that value is only determined for bam files since there is no assembly to compare the raw reads to.
Could you show the output of the column -t command?
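If only a rough depth figure is needed from the raw reads, it can be estimated as total bases divided by the expected genome size. The sketch below is a stand-in for that arithmetic, not how Lyve-SET computes its coverage column. It assumes plain four-line FASTQ records, and the function names are hypothetical.

```python
import gzip


def read_stats(lines):
    """Count reads and bases from an iterable of FASTQ lines.

    Assumption: every record is exactly four lines, with the
    sequence on the second line of each record.
    """
    num_reads = 0
    total_bases = 0
    for i, line in enumerate(lines):
        if i % 4 == 1:  # sequence line of the record
            num_reads += 1
            total_bases += len(line.strip())
    return num_reads, total_bases


def estimated_coverage(total_bases, genome_size):
    """Rough depth estimate: total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def fastq_coverage(path, genome_size):
    """Estimate coverage for a gzip-compressed FASTQ file."""
    with gzip.open(path, "rt") as fh:
        _, total_bases = read_stats(fh)
    return estimated_coverage(total_bases, genome_size)
```

Calling fastq_coverage("se_read.fq.gz", genome_size) with the known genome size gives one number per sample that can be compared against the tool's output.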
Hi Iskatz, Actually, the columns match, but some of them have no values computed, such as the coverage column. Here is the output of column -t. First and second rows are column headers.
File avgReadLength totalBases minReadLength maxReadLength avgQuality numReads PE? coverage readScore medianFragmentLength
LRtrimmedfastqs/sample_21547.fq.gz 1.00 1 Inf 0.00 1 yes 0.00 . .
Thanks, TJ
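To check which value belongs to which column without relying on visual alignment, the metrics file can be read by header name. This is a minimal sketch that assumes the output is tab-delimited with a single header line and that a dot marks a value that was not computed; parse_metrics is a hypothetical helper, not part of Lyve-SET.

```python
import csv
import io


def parse_metrics(text, missing="."):
    """Parse metrics text into a list of dicts keyed by column name.

    Assumptions: tab-delimited input with one header line;
    fields equal to `missing` are converted to None.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for row in reader:
        rows.append({k: (None if v == missing else v) for k, v in row.items()})
    return rows
```

Printing each row's key and value pairs makes it obvious when a value such as yes has landed under an unexpected header.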