mafTools

Error with mafToFastaStitcher

Open jkorstia opened this issue 6 years ago • 2 comments

Hello,

I have been trying to extract aligned fastas from a very large maf file (~64G) that contains 45 aligned whole genome sequences. When I try to run mafToFastaStitcher, I receive the following error message. (I cut out most of the "Reading fasta" lines, but there were 45, one for each sequence, and it did report reading and finishing Anc11.fas.)

Verbose: Creating sequence hash.
Verbose: Reading fasta Anc11.fas
Verbose: Finished reading fasta Anc11.fas
Verbose: Creating alignment hash.
"Error, unable to locate sequnce Anc11 in the sequence hash. Check your input fasta files."

The unaligned sequence for Anc11 is in the same directory with all of the other sequences, and it does appear to be read by the program in the previous step. Is there a limit to the number of species this program can manage? Do you have any suggestions on how I can get past this issue?

Thanks, Jenny

jkorstia, Mar 09 '18 19:03

Hi Jenny, there's no hard limit on the size of the input; the practical limit is the amount of memory in your machine. In your input fasta files, is there a sequence that's named "Anc11"? I see you have a file prefixed with that name, but is there a sequence with that name inside it?
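One quick way to check is to list the header names inside the fasta file and see whether "Anc11" is among them. The following is only a minimal sketch, not part of mafTools: the function name is hypothetical, the example records are made up for illustration, and it assumes the sequence name is the first whitespace-delimited word after `>` (the tool itself may treat headers differently).

```python
def fasta_sequence_names(lines):
    """Return the sequence names found in an iterable of FASTA lines.

    Assumption: the name is the first word after '>' on each header line.
    """
    names = []
    for line in lines:
        if line.startswith(">"):
            header = line[1:].strip()
            # An empty header yields an empty name rather than an error.
            names.append(header.split()[0] if header else "")
    return names


# Made-up records standing in for the contents of Anc11.fas.
example = [
    ">Anc11refChr2211 example description",
    "ACGT",
    ">Anc11refChr211",
    "GGCC",
]

names = fasta_sequence_names(example)
print(names)
print("Anc11" in names)
```

For a real file, the same function can be called on an open file handle, for example `fasta_sequence_names(open("Anc11.fas"))`, since it only iterates over lines.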



dentearl, Mar 12 '18 13:03

Hi,

Thanks for getting back to me so quickly!

No, the Anc11 file does not contain a fasta sequence named just Anc11. I have a file named Anc11 that contains many scaffolds for Anc11, which are named “Anc11refChr2211” etc. Does the program require that there be a single fasta sequence for each species? Is there a format I could apply to the scaffold names that might work? Perhaps “Anc11_refChr2211” or “Anc11 refChr211”?
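To see which names would have to line up, one option is to compare the source names on the `s` lines of the maf with the names in the fasta headers. This is a rough sketch under stated assumptions, not a description of how mafToFastaStitcher works internally: the function names are hypothetical, the example lines are invented, and it assumes names are matched exactly as written (first word of the fasta header against the second field of a maf `s` line).

```python
def maf_source_names(maf_lines):
    """Collect the source names from the 's' lines of a MAF file.

    A MAF sequence line has seven whitespace-separated fields:
    s, src, start, size, strand, srcSize, text.
    """
    names = set()
    for line in maf_lines:
        fields = line.split()
        if len(fields) >= 7 and fields[0] == "s":
            names.add(fields[1])
    return names


def missing_from_fasta(maf_lines, fasta_lines):
    """Return MAF source names with no identically named FASTA record."""
    fasta_names = set()
    for line in fasta_lines:
        if line.startswith(">") and line[1:].split():
            # Assumption: the name is the first word of the header.
            fasta_names.add(line[1:].split()[0])
    return sorted(maf_source_names(maf_lines) - fasta_names)


# Invented example data, for illustration only.
maf_example = [
    "##maf version=1",
    "a score=0",
    "s Anc11 0 4 + 100 ACGT",
    "s Anc11refChr2211 0 4 + 50 ACGT",
]
fasta_example = [">Anc11refChr2211", "ACGT"]

print(missing_from_fasta(maf_example, fasta_example))
```

Any name this reports is one that appears in the maf but has no fasta record with the same name, which is the kind of mismatch the error message points to.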

Thanks for any ideas!

-Jenny


jkorstia, Aug 14 '19 17:08