psmc
psmc copied to clipboard
gen_raw_mask.pl not working correctly
The faidx for my largest chromosome in the original assembly is as follows:
Chr1 196345723 6 60 61
The faidx for rawMask_35.fa and mask_35_50.fa show the following:
Chr1 5609878 6 60 61
I'm not sure why the size of my chromosome is so drastically reduced and I ave confirmed each previous step finished without error.
The following are my commands:
psmc/utils/./splitfa Sept22Assembly.fasta 35 | split -l 20000000 cat xaa xab xac > xxaa bwa index -a bwtsw genome.fasta bwa aln -R 1000000 -O 3 -E 3 genome.fasta xxaa > xxaa.sai bwa samse genome.fasta xxaa.sai xxaa > aln-se.sam samtools sort -n aln-se.sam -o aln-se_sorted.sam gzip aln-se_sorted.sam gzip -dc aln-se_sorted.sam.gz | perl seqbility-20091110/gen_raw_mask.pl > rawMask_35.fa seqbility-20091110/./gen_mask -l 35 -r 0.5 rawMask_35.fa > mask_35_50.fa
We also tried the following with the same result:
psmc/utils/./splitfa genome.fasta 35 | split -l 20000000
#mappability_kmers.list contains the following xaa xab xac
for x in $(cat mappability_kmers.list); do bwa aln -R 1000000 -O 3 -E 3 -t 20 genome.fasta $x > ${x}.sai; done for x in $(cat mappability_kmers.list); do bwa samse genome.fasta ${x}.sai ${x} | samtools sort -o ${x}.bam; done samtools merge x.bam xaa.bam xab.bam xac.bam samtools sort -n x.bam -o x_sorted.bam samtools view -h x_sorted.bam | seqbility-20091110/gen_raw_mask.pl > raw_mask_35.fa seqbility-20091110/gen_mask -l 35 -r 0.5 raw_mask_35.fa > mask_35.fa
#head of mask_35.fa Chr1 5609912 15 60 61 Chr2 4270068 5703441 60 61 Chr3 3265589 10044692 60 61 Chr4 2067153 13364723 60 61 Chr5 1273941 15466344 60 61
Hi Nicolas - I know this is an old post, but wondering if you happened to work out the cause or fix for this issue? We're currently having the same problem. Cheers, Emily
Hi Emily,
I was not able to sort out the issue. But I plan to use the software again soon and will hopefully find a solution.
On Mon, Oct 23, 2023 at 9:35 PM Emily Roycroft @.***> wrote:
Hi Nicolas - I know this is an old post, but wondering if you happened to work out the cause or fix for this issue? We're currently having the same problem. Cheers, Emily
— Reply to this email directly, view it on GitHub https://github.com/lh3/psmc/issues/38#issuecomment-1776525459, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFB6337MC5CP5ATCZLAUSJLYA5AQBAVCNFSM5FMINPTKU5DIOJSWCZC7NNSXTN2JONZXKZKDN5WW2ZLOOQ5TCNZXGY2TENJUGU4Q . You are receiving this because you authored the thread.Message ID: @.***>
-- Best,
Nicolas Alexandre PhD Candidate, Integrative Biology Whiteman Lab University of California - Berkeley @.*** @.***>