taxondna
Most of my sequences are not added
Hello, I want to add sequence data for 370 plant species using matK and rbcL genes, but it only adds 5 species. The window says:

"Some sequences in the taxonset CES_matK norm nimed weren't added. These are:
Gymnocarpium dryopteris: It is too short (945 bp, while the column is supposed to be 1206 bp)
Matteuccia struthiopteris: It is too short (762 bp, while the column is supposed to be 1206 bp)
Dryopteris filix-mas: It is too short (615 bp, while the column is supposed to be 1206 bp)
Athyrium filix-femina: It is too short (999 bp, while the column is supposed to be 1206 bp)
Dryopteris carthusiana: It is too short (963 bp, while the column is supposed to be 1206 bp)
Asplenium trichomanes: It is too short (978 bp, while the column is supposed to be 1206 bp)
Cystopteris sudetica: It is too short (930 bp, while the column is supposed to be 1206 bp)
Thelypteris palustris: It is too short (834 bp, while the column is supposed to be 1206 bp)
Botrychium lunaria: It is too long (1503 bp, while the column is supposed to be 1206 bp)
Ophioglossum vulgatum: It is too short (777 bp, while the column is supposed to be 1206 bp)
Equisetum arvense: It is too long (1431 bp, while the column is supposed to be 1206 bp)
etc"
What should I do? Thank you!
Hey there! Sequence Matrix expects aligned sequences, and so won't let you import unaligned sequences, since it doesn't know how to pad them. Why are you trying to import unaligned sequences? One quick fix would be to align them using a quick-and-dirty aligner, but I'm not sure if that's what you're trying to do.
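If it helps to confirm whether a file really is aligned before importing it, here is a rough Python sketch that reads a FASTA file and lists every sequence whose length differs from the rest. It is not part of SequenceMatrix: the function names `read_fasta` and `length_report` are made up for this example, the toy sequences are invented, and treating the most common length as the "expected" one is an assumption, not necessarily how SequenceMatrix picks the column width.

```python
from collections import Counter


def read_fasta(lines):
    """Parse FASTA-formatted lines into a {name: sequence} dict.

    Sequences wrapped over several lines are joined together.
    If a name occurs twice, the later record replaces the earlier one.
    """
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def length_report(records):
    """Return (expected_length, {name: length}) for sequences that differ.

    Assumption: the most common length is the intended alignment width.
    """
    lengths = {name: len(seq) for name, seq in records.items()}
    if not lengths:
        return None, {}
    expected, _count = Counter(lengths.values()).most_common(1)[0]
    odd = {n: length for n, length in lengths.items() if length != expected}
    return expected, odd


# Toy data (invented sequences) standing in for a real alignment file.
example = """>Gymnocarpium_dryopteris
ATGC--ATGC
>Botrychium_lunaria
ATGCATGCATGC
>Equisetum_arvense
ATGCAT
ATGC
""".splitlines()

expected, odd = length_report(read_fasta(example))
print("expected length:", expected)
for name, length in odd.items():
    print(f"{name}: {length} bp")
```

For a real file, pass an open file handle to `read_fasta` instead of `example`. If `odd` comes back empty, every sequence has the same length (gap characters included), so the file at least looks aligned.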
I'm having the same issues - with aligned sequences. How is sequence matrix deciding the proper length of a sequence? When I check the sequence length in bash, they all look the same. Thanks!
@jpiaskowski Strange! It should be reading the length of each sequence separately. Could you please e-mail me a file that SequenceMatrix isn't opening? My e-mail address is gaurav[at]ggvaidya[dot]com.