bioawk
bioawk skipping protein sequences at the start of the file
Hello,
I am trying to convert protein sequences in .faa format into the OrthoMCL-readable header format (organism_ID|protein_ID) using bioawk -c fastx '{ print ">GMI1000|"$name; print $seq }'. However, the output contains only about 900 of the 4000 sequences. It looks like bioawk is not reading the first 3000 proteins in the .faa file. Is there any way to solve this problem?
Thank you very much in advance!!
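One way to narrow this down is to compare the number of header lines in the input with the number of records that come out, and to try the same header rewrite with plain awk instead of bioawk. The sketch below is only an illustration under assumptions: the file names demo.faa and orthomcl.fasta are hypothetical, the tiny demo input stands in for the real .faa file, and stripping a trailing carriage return is a guess at one possible cause (Windows line endings), not a confirmed diagnosis. Like $name in bioawk, it keeps the ID only up to the first whitespace.

```shell
# Hypothetical file name; point this at the real .faa file instead.
faa="demo.faa"

# Tiny demo input so the sketch runs on its own (remove for real data).
printf '>p1 first protein\nMKT\nLLA\n>p2\nMSS\n' > "$faa"

# Number of FASTA records according to the header lines in the input.
n_headers=$(grep -c '^>' "$faa")

# Rewrite headers with plain awk:
#  - drop a trailing carriage return, if any (assumed possible cause)
#  - on header lines, keep only the ID up to the first space or tab
#  - print sequence lines unchanged (they stay wrapped as in the input)
awk '{ sub(/\r$/, "") }
     /^>/ { split(substr($0, 2), f, /[ \t]/); print ">GMI1000|" f[1]; next }
     { print }' "$faa" > orthomcl.fasta

# Number of records in the rewritten file; this should match n_headers.
n_out=$(grep -c '^>' orthomcl.fasta)
echo "$n_headers headers in, $n_out headers out"
```

If the two counts agree here but the bioawk output is still short, the difference is more likely in how the input is parsed than in the print statement itself.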