terminalonly issue

Open vagkaratzas opened this issue 9 months ago • 6 comments

Hello, I have this MSA file:

>1710581050_405_944
FMRITPCEACHGQRLKPESLAVTVAGK--NIYEMTSMSVKNLKTFVDQME--LTKQQHLIGDQILKEIRARVGFLNEVGLDYLSLSRATGTLSGGEAQRIRLATQIGSGLVGVAYILDEPSIGLHQRDNDKLLGALMNLRDLGNTLIVVEHDEDTMRAADYIVDIGPGAGSHGGQVVACGTAEEIMQN----PDSVTGAYLSGRIQIPVPKERRKPTG---FLTIKGARENNLKNIDVDIPLGVMTCITGVSGSGKSSLTNEILYKHLARDLNR--ARCIPGEH-DDILGLEQLDKVIDIDQSPIGRTPRSNPATYTGVFDMIRDLFAGTPDAKAKGYKKGRFSFNVKGGRCEACSGDGIIKIEMHFLPDVYVPCEVCEGKRYNRETLEVKYKGKNIYDVLDMTVEEALEFFKNVP-----SIERKIQTLYDVGLSYVKLGQPSTELSGGEAQRIKLATELSKR--STGKTIYILDEPTTGLHFADVHKLIEILRRLSDGGNTVVVIEHNLDVIKTADYIIDMGPEGGDGGGTVIAKGTPEEVAKVKGSYTGQYVKKYLKK
>1317524612_213_752
YMSNSHCPTCNGARLRKESLAVKVGDK--NINELTEMSIDKIKNYLNSLK--LNNKDKMISEQILKELNKRLQFLIDVGLEYLTLSRSAGTLSGGEAQRIRLATQIGSGLTGVLYILDEPSIGLHQRDNEKLIATLKKLRDLGNTVIVVEHDEDTMYAADQIIDIGPGPGVHGGKVIAQGTAEQIKQI----PESITGQYLSGKKQIPIPEKRRKSNG--RAIEVKGATQNNLKNINVKFPLGQFICVTGVSGSGKSTLVNDILYKALAKQING--SNEKPGAH-KEIIGIENIDKIINIDQSPIGRTPRSNPATYTGVFDNIRDIFAGTNEAKMRGYEKGRFSFNVEGGRCEACSGDGILRIEMHFLPDIYVPCEVCKGKRYNKETLEVKYKGKTIADVLDMTVEEALEFFKNIP-----RIKQKIQTLHDVGLGYIKLGQPSTTLSGGEAQRVKLATELSKK--PTGKTLYILDEPTTGLHIADVHKLVEILQRLVDTGNSIIVIEHNLDLIKTSDYIIDLGPEGGDKGGKIVAVGTPEQIARNEQSYTGQFLGKYL--
>1356476163/1_541
----KPCGKCNGARLKPEALAVKINGL--HISEVCEKSIGQAVEWFQNVEqfLSTNQKTIAERILKEIRERLGFLANVGLEYLSLDRSSGTLSGGESQRIRLASQIGSGLVGVLYVLDEPSIGLHPRDNQRLLDTLQHLSNLGNTVIVVEHDEEAMLSADMLVDLGPGAGRHGGYLVSFGTPQEVMAS----TKSITGQYLNGKRQIPVPTKRRNTKA-cPKLSINGASANNLKNLSAEIPLGALTCVTGVSGSGKSTFTVETLYQALSKHLNN--SRVVPGAY-KNIKGLEYLDKIIEIDQSPIGRTPRSNPATYTQAFTPIREWYASLPEAKARGYAPGRFSFNVKGGRCEACQGDGVIKIEMHFLPDVYVQCDVCKGKRYNRETLEVRFKNKSIADILEMTVEEGEEFFNAVP-----SIRDKLATLNRVGLGYLQIGQQATTLSGGEAQRVKLAKELSKR--ATGRTLYVLDEPTTGLHFEDIRKLLEVLHALVDTGNTVIIIEHNLDVIKTADHIIDLGPEGGDNGGELVASGTPEEILEVNESYTGRYLRPILK-
>1218837520_273_868/46_594
FFGNDLCESCNGARLRPEALHIFLPNTktNIVDLTKMSVSECIVFFEKLK--LNKKEQEIAKVVLKEITERLNFMNNVGIEYLTLDRRAHTLSGGESQRIRLASQLGSGLVGATYVLDEPTIGLHSHDNDRLIKTLLELRDAGNTIIVVEHDEDTIFSADYLVDIGPGAGVHGGNVVVADDLEKLLTAktnySKSLTLDYLRGEKNIEIPDSRRNENR--GKIQIKGGNIFNIKNMDVDFPLGKFVSVTGVSGSGKSSLVYEILYKNLRNKFDRryRTDVLENC-QSFSGSEYISRAILIDQSPIGRTPRSNPATYTGAFSFIRDLFSETSEARARGWKANRFSFNVKGGRCEACQGNGYIAVEMHFLPTVYVPCDICNGKRFMKETLEVKYKHKNIYEVLNMTVEEALKFFEDIP-----AIYDRLKTLDDVGLGYLELGQSATTLSGGEAQRVKISSELYRA--HLQKTIYILDEPTIGLHYEDVKKLIEILQKLVDKGNTVVVIEHNMDLIKSSDYIIDIGPGGGVNGGKIITKGTPEQVADSEKGFTSKYLKRVLK-
>2632102746_258_857/50_598
-FGSDKCEVCDGARLRPEALHVFLGKTkiNISDFVALSIQDAKDFIDSLK--LTKKEEDISGIVLKEINSRLQFMLNVGIEYLTLNRRANTLSGGEAQRIRLASQLGSGLVGALYVLDEPTIGLHQRDNDRLIQTLKDLRDTGNTIIIVEHDEDTIYASDYIVDIGPGAGVHGGEIVVSGDLEELLTAktnkSNSSTLSYLRGEKVIEIPERRKENDK--GELKIRGGKMFNIKNMNVDVPLRRMTTITGVSGSGKSTLLYELLYKNLRAKFDRryRTNKVYNC-ADFKGAEYLSRAILIDQSPIGRTPRSNPATYTGAFTHIRDLFATSSEARVRGWKPGRFSFNVKGGRCETCQGNGEIAIEMHFLPTVYTTCDVCKGKRFMKETLEVKYKSKNIHEVLKMTVEEALDFFIDIP-----TVYDRLKSLNDVGLGYLELGQSAKTLSGGEAQRVKISSELYRP--HTHKTIYLLDEPTVGLHYEDVKKLIEILKKLVDGGNTVVVIEHNMDLIKSSDYIIDIGPNGGEKGGNIVAKGTPEDVARNEDSYTGKYLKKVLR-
>595958172_397_939/1_541
YISLRDCPTCGGTRLKPESLAVRVGGM--NIDQVCRLSIRDCFDFFRTLP--LSHQDAVISERVLKEIRERLRFLLDVGMDYLNLARSSGTLSGGENQRIRLATQIGSGLMGVLYVLDEPTVGLHQRDNLRLIATLKRLRDMGNTVLVVEHDADMMLSSDQIIDMGPGAGREGGSIVFQGTPGEILRS----EDSLTGRYLSGALSIPIPRKRRPVRG--RWIVLEGAWENNLKNIDVRIPVGVFTAVTGVSGSGKSTMVIETLHKTLSRRLYR--HQGSAVKV-RRIVDLGGIERVILINQQPIGRTPRSNPATYTGVFNPIRDLFTGLPESRVRGYKPGRFSFNVRGGRCEACEGNGLIKIEMHFLPDVYVTCETCRGRRFNADTLDIRYKDQSIADVLDMTVNQAIDFFTNIP-----AIRSKLQLLMDVGLGYIRLGQSATTLSGGEAQRIKLSRELGKR--PESNTLYILDEPTIGLHFADIQKLLDVLMRLVDMGNTVVVIEHNLDVVKSADHIIDLGPEGGPGGGEIIAAGSPEDVARVAHSFTGQFLKDILK-
>1147647848_399_939
FLAYRKCSVCNGSRLKKEALHFRIGGK--NIAEVSAMSIAEFADWMEHIEehLSDKERKIAQEILKEIRERLRFLMDVGLGYLSLDRASRSLSGGESQRIRLATQIGSKLVNVLYILDEPSIGLHQRDNRKLIRSLEELRDAGNSVIVVEHDEDMMRAADYIVDVGPQAGRKGGNIVAAGKFDEILRA-----DSITADYLTGRRRIEIPAELRNSSG--DSIVIRGARGNNLKNVTAEFPLGKFICVTGVSGSGKSTLVNETLRPILSKTLYR--SFEQPLEY-DTIEGIEHIDKLVVVDQSPIGRTPRSNPATYSGVFADIRKLFEMTPDAQIRGFKAGRFSFNVKGGRCETCHGAGVETIEMNFLPDVYVRCRACGGHRYNRETLEVRYKGRNIDEVLNMTINAAVEFFENIP-----NIHQKLRAIQEVGLGYLTLGQPCTTLSGGESQRIKLAAELAKR--DTGRTLYILDEPTTGLHFEDIRLLLEVLNKLADRGNTVIVIEHNLDVIKVADHLIDLGPEGGAGGGEILVTGTPHEVANCPQSYTGQFLKQM---
>1273039597_416_960
FMTAKVCDTCFGARLKPESLAVTIREK--SIVDISDMSIEDCYEFMKNLSsnLAGKDLEIATVILKEINARLKFLFDVGLSYLTLSRSAESLAGGEAQRIRLASQIGSGLSGVLYVLDEPSIGLHPRDNTKLIGTLKHLRDIGNSVIVVEHDKEMMQESDYIFDFGPGAGEHGGHVIAHGTPKQIEVD----PKSITGKYLSGKAKIKIQKETLIEPEvqRYIELKGVREHNLKNVDVKFPLGRFVCITGVSGSGKSTLINDVFYHAIAHYKNL-fHKERPGKF-DTITGYDEIKRIFMIDQSPIGRTPRSNPATYTGAFTYIRDIYANSRDARLRGYGPGRFSFNVKGGRCENCEGEGQIKIEMQFLSDVYVKCEVCNGTRYTQDTLEINFEGKNISEVLNMSVEEAIPFFAFHE-----PLTARLTTLKEVGLSYIRLGQPATTLSGGEAQRIKLASELSKK--GGGGSLYVLDEPTTGLHFADLEKLILVLRRLVDKGNTVVVIEHNLDLIKNADYIIDLGPEGGDQGGKIIGVGTPKEISKIPTSYTGQYLKREI--
>2713204418_207_816/68_607
--TENECRACGGQRLNKQALAITVGGK--NIFQMGQLSVKHLLQFYAQLQ--LDESALAIGQGLIKEIVARLEFLHNVGLSYLTLNRSSRTLSGGEGQRIRLATQIGSALSGVLYVLDEPSIGLHQRDNDRLIATLHALRDQGNTVVVVEHDVDTMRQADYLIDMGPAAGVLGGRVTAVGMPHELATN----PASLTGGYLSGKHSIERTGAIRKPTG---NLILNHAKANNLRDITVSFPLGVVCGISGVSGSGKSTLIMQELVPTLSTLLSR--RNRIDEASdSRLSGAQAIENMVVIDQSPIGRTPRSNPATYLGIFDAIRTLFASLPESKARGYKSGRFSFNVAEGRCFECRGDGVIKVEMHFLPEVTMVCKACKGKRYDAQTLQITFKDKNIADILDMTALEAAQFFSAHA-----PIAKRLALLCEVGLDYLKLGQASTTLSGGEAQRIKLVDELSKR--GK-GTLYILDEPTTGLHNSDIERLLAVLNSLVDKGNSMIIIEHNIDVLKTVDHLIDLGPEGGDEGGMVVAQGTPQEVAHNAKSYTGQYLKRAL--
>1814953751_1_669/1_519
---------------------------------LWQMPVDELLEFFSGMS--APMGDKT-TTLLLDEIGSRLRYLIRVGLAYLDLDRPTRTLSGGEIQRVNLTTCLGASLVNTLFVLDEPSIGLHPRDTGRLIGVMEDLRDKGNTLVVVEHEEAVIRAADHLVDIGPGRGEGGGELIFSGAPAKIASK----KASLTGAYLTGRQAIAIPPKRRKPKR-gQQLKIVRASQNNLRDIDVTIPLGVFCCVTGVSGSGKSTLVHKVLYENLMRDAGE-tLDEEPGRC-KAIRGAEKLGQVVMVDQSPLSRTPRSTPGVYTGAFELIRKLFAATDDAKAAGLTMGYFSFNSGTGRCERCWGNGFEKIEMQFLSDIFVRCPECEGRRYGPDAMNYTLLGESIADVLDFTVGRAIEFFSQIDkklgrQVVDKLQVLAEVGLGYLRLGQPLNTLSGGESHRLKLVGHLLEGqgDASGDLLIFDEPTTGLHFDDIRMLLNVFQRLVDAGNSLLVIEHNLDVIKTADHVIDLGPEGGVGGGELVGTGTPEALARNAQSHTGRYLAALL--

As I understand it, the -terminalonly flag only trims gaps at the ends of the MSA (correct?), and it works fine in most cases. However, in some cases, like the one above, some in-sequence gaps are also trimmed (e.g. columns 28 and 29 here vanish). Is this a bug, or have I misunderstood the use of -terminalonly?

Example run command:

trimal -in temp.sto -out test.sto -gt 0.5 -terminalonly

Output:

>1710581050_405_944
FMRITPCEACHGQRLKPESLAVTVAGKNIYEMTSMSVKNLKTFVDQME--LTKQQHLIGD
QILKEIRARVGFLNEVGLDYLSLSRATGTLSGGEAQRIRLATQIGSGLVGVAYILDEPSI
GLHQRDNDKLLGALMNLRDLGNTLIVVEHDEDTMRAADYIVDIGPGAGSHGGQVVACGTA
EEIMQN----PDSVTGAYLSGRIQIPVPKERRKPTG---FLTIKGARENNLKNIDVDIPL
GVMTCITGVSGSGKSSLTNEILYKHLARDLNR--ARCIPGEH-DDILGLEQLDKVIDIDQ
SPIGRTPRSNPATYTGVFDMIRDLFAGTPDAKAKGYKKGRFSFNVKGGRCEACSGDGIIK
IEMHFLPDVYVPCEVCEGKRYNRETLEVKYKGKNIYDVLDMTVEEALEFFKNVP-----S
IERKIQTLYDVGLSYVKLGQPSTELSGGEAQRIKLATELSKR--STGKTIYILDEPTTGL
HFADVHKLIEILRRLSDGGNTVVVIEHNLDVIKTADYIIDMGPEGGDGGGTVIAKGTPEE
VAKVKGSYTGQYVKKYLK
>1317524612_213_752
YMSNSHCPTCNGARLRKESLAVKVGDKNINELTEMSIDKIKNYLNSLK--LNNKDKMISE
QILKELNKRLQFLIDVGLEYLTLSRSAGTLSGGEAQRIRLATQIGSGLTGVLYILDEPSI
GLHQRDNEKLIATLKKLRDLGNTVIVVEHDEDTMYAADQIIDIGPGPGVHGGKVIAQGTA
EQIKQI----PESITGQYLSGKKQIPIPEKRRKSNG--RAIEVKGATQNNLKNINVKFPL
GQFICVTGVSGSGKSTLVNDILYKALAKQING--SNEKPGAH-KEIIGIENIDKIINIDQ
SPIGRTPRSNPATYTGVFDNIRDIFAGTNEAKMRGYEKGRFSFNVEGGRCEACSGDGILR
IEMHFLPDIYVPCEVCKGKRYNKETLEVKYKGKTIADVLDMTVEEALEFFKNIP-----R
IKQKIQTLHDVGLGYIKLGQPSTTLSGGEAQRVKLATELSKK--PTGKTLYILDEPTTGL
HIADVHKLVEILQRLVDTGNSIIVIEHNLDLIKTSDYIIDLGPEGGDKGGKIVAVGTPEQ
IARNEQSYTGQFLGKYL-
>1356476163/1_541
----KPCGKCNGARLKPEALAVKINGLHISEVCEKSIGQAVEWFQNVEqfLSTNQKTIAE
RILKEIRERLGFLANVGLEYLSLDRSSGTLSGGESQRIRLASQIGSGLVGVLYVLDEPSI
GLHPRDNQRLLDTLQHLSNLGNTVIVVEHDEEAMLSADMLVDLGPGAGRHGGYLVSFGTP
QEVMAS----TKSITGQYLNGKRQIPVPTKRRNTKA-cPKLSINGASANNLKNLSAEIPL
GALTCVTGVSGSGKSTFTVETLYQALSKHLNN--SRVVPGAY-KNIKGLEYLDKIIEIDQ
SPIGRTPRSNPATYTQAFTPIREWYASLPEAKARGYAPGRFSFNVKGGRCEACQGDGVIK
IEMHFLPDVYVQCDVCKGKRYNRETLEVRFKNKSIADILEMTVEEGEEFFNAVP-----S
IRDKLATLNRVGLGYLQIGQQATTLSGGEAQRVKLAKELSKR--ATGRTLYVLDEPTTGL
HFEDIRKLLEVLHALVDTGNTVIIIEHNLDVIKTADHIIDLGPEGGDNGGELVASGTPEE
ILEVNESYTGRYLRPILK
>1218837520_273_868/46_594
FFGNDLCESCNGARLRPEALHIFLPNTNIVDLTKMSVSECIVFFEKLK--LNKKEQEIAK
VVLKEITERLNFMNNVGIEYLTLDRRAHTLSGGESQRIRLASQLGSGLVGATYVLDEPTI
GLHSHDNDRLIKTLLELRDAGNTIIVVEHDEDTIFSADYLVDIGPGAGVHGGNVVVADDL
EKLLTAktnySKSLTLDYLRGEKNIEIPDSRRNENR--GKIQIKGGNIFNIKNMDVDFPL
GKFVSVTGVSGSGKSSLVYEILYKNLRNKFDRryRTDVLENC-QSFSGSEYISRAILIDQ
SPIGRTPRSNPATYTGAFSFIRDLFSETSEARARGWKANRFSFNVKGGRCEACQGNGYIA
VEMHFLPTVYVPCDICNGKRFMKETLEVKYKHKNIYEVLNMTVEEALKFFEDIP-----A
IYDRLKTLDDVGLGYLELGQSATTLSGGEAQRVKISSELYRA--HLQKTIYILDEPTIGL
HYEDVKKLIEILQKLVDKGNTVVVIEHNMDLIKSSDYIIDIGPGGGVNGGKIITKGTPEQ
VADSEKGFTSKYLKRVLK
>2632102746_258_857/50_598
-FGSDKCEVCDGARLRPEALHVFLGKTNISDFVALSIQDAKDFIDSLK--LTKKEEDISG
IVLKEINSRLQFMLNVGIEYLTLNRRANTLSGGEAQRIRLASQLGSGLVGALYVLDEPTI
GLHQRDNDRLIQTLKDLRDTGNTIIIVEHDEDTIYASDYIVDIGPGAGVHGGEIVVSGDL
EELLTAktnkSNSSTLSYLRGEKVIEIPERRKENDK--GELKIRGGKMFNIKNMNVDVPL
RRMTTITGVSGSGKSTLLYELLYKNLRAKFDRryRTNKVYNC-ADFKGAEYLSRAILIDQ
SPIGRTPRSNPATYTGAFTHIRDLFATSSEARVRGWKPGRFSFNVKGGRCETCQGNGEIA
IEMHFLPTVYTTCDVCKGKRFMKETLEVKYKSKNIHEVLKMTVEEALDFFIDIP-----T
VYDRLKSLNDVGLGYLELGQSAKTLSGGEAQRVKISSELYRP--HTHKTIYLLDEPTVGL
HYEDVKKLIEILKKLVDGGNTVVVIEHNMDLIKSSDYIIDIGPNGGEKGGNIVAKGTPED
VARNEDSYTGKYLKKVLR
>595958172_397_939/1_541
YISLRDCPTCGGTRLKPESLAVRVGGMNIDQVCRLSIRDCFDFFRTLP--LSHQDAVISE
RVLKEIRERLRFLLDVGMDYLNLARSSGTLSGGENQRIRLATQIGSGLMGVLYVLDEPTV
GLHQRDNLRLIATLKRLRDMGNTVLVVEHDADMMLSSDQIIDMGPGAGREGGSIVFQGTP
GEILRS----EDSLTGRYLSGALSIPIPRKRRPVRG--RWIVLEGAWENNLKNIDVRIPV
GVFTAVTGVSGSGKSTMVIETLHKTLSRRLYR--HQGSAVKV-RRIVDLGGIERVILINQ
QPIGRTPRSNPATYTGVFNPIRDLFTGLPESRVRGYKPGRFSFNVRGGRCEACEGNGLIK
IEMHFLPDVYVTCETCRGRRFNADTLDIRYKDQSIADVLDMTVNQAIDFFTNIP-----A
IRSKLQLLMDVGLGYIRLGQSATTLSGGEAQRIKLSRELGKR--PESNTLYILDEPTIGL
HFADIQKLLDVLMRLVDMGNTVVVIEHNLDVVKSADHIIDLGPEGGPGGGEIIAAGSPED
VARVAHSFTGQFLKDILK
>1147647848_399_939
FLAYRKCSVCNGSRLKKEALHFRIGGKNIAEVSAMSIAEFADWMEHIEehLSDKERKIAQ
EILKEIRERLRFLMDVGLGYLSLDRASRSLSGGESQRIRLATQIGSKLVNVLYILDEPSI
GLHQRDNRKLIRSLEELRDAGNSVIVVEHDEDMMRAADYIVDVGPQAGRKGGNIVAAGKF
DEILRA-----DSITADYLTGRRRIEIPAELRNSSG--DSIVIRGARGNNLKNVTAEFPL
GKFICVTGVSGSGKSTLVNETLRPILSKTLYR--SFEQPLEY-DTIEGIEHIDKLVVVDQ
SPIGRTPRSNPATYSGVFADIRKLFEMTPDAQIRGFKAGRFSFNVKGGRCETCHGAGVET
IEMNFLPDVYVRCRACGGHRYNRETLEVRYKGRNIDEVLNMTINAAVEFFENIP-----N
IHQKLRAIQEVGLGYLTLGQPCTTLSGGESQRIKLAAELAKR--DTGRTLYILDEPTTGL
HFEDIRLLLEVLNKLADRGNTVIVIEHNLDVIKVADHLIDLGPEGGAGGGEILVTGTPHE
VANCPQSYTGQFLKQM--
>1273039597_416_960
FMTAKVCDTCFGARLKPESLAVTIREKSIVDISDMSIEDCYEFMKNLSsnLAGKDLEIAT
VILKEINARLKFLFDVGLSYLTLSRSAESLAGGEAQRIRLASQIGSGLSGVLYVLDEPSI
GLHPRDNTKLIGTLKHLRDIGNSVIVVEHDKEMMQESDYIFDFGPGAGEHGGHVIAHGTP
KQIEVD----PKSITGKYLSGKAKIKIQKETLIEPEvqRYIELKGVREHNLKNVDVKFPL
GRFVCITGVSGSGKSTLINDVFYHAIAHYKNL-fHKERPGKF-DTITGYDEIKRIFMIDQ
SPIGRTPRSNPATYTGAFTYIRDIYANSRDARLRGYGPGRFSFNVKGGRCENCEGEGQIK
IEMQFLSDVYVKCEVCNGTRYTQDTLEINFEGKNISEVLNMSVEEAIPFFAFHE-----P
LTARLTTLKEVGLSYIRLGQPATTLSGGEAQRIKLASELSKK--GGGGSLYVLDEPTTGL
HFADLEKLILVLRRLVDKGNTVVVIEHNLDLIKNADYIIDLGPEGGDQGGKIIGVGTPKE
ISKIPTSYTGQYLKREI-
>2713204418_207_816/68_607
--TENECRACGGQRLNKQALAITVGGKNIFQMGQLSVKHLLQFYAQLQ--LDESALAIGQ
GLIKEIVARLEFLHNVGLSYLTLNRSSRTLSGGEGQRIRLATQIGSALSGVLYVLDEPSI
GLHQRDNDRLIATLHALRDQGNTVVVVEHDVDTMRQADYLIDMGPAAGVLGGRVTAVGMP
HELATN----PASLTGGYLSGKHSIERTGAIRKPTG---NLILNHAKANNLRDITVSFPL
GVVCGISGVSGSGKSTLIMQELVPTLSTLLSR--RNRIDEASdSRLSGAQAIENMVVIDQ
SPIGRTPRSNPATYLGIFDAIRTLFASLPESKARGYKSGRFSFNVAEGRCFECRGDGVIK
VEMHFLPEVTMVCKACKGKRYDAQTLQITFKDKNIADILDMTALEAAQFFSAHA-----P
IAKRLALLCEVGLDYLKLGQASTTLSGGEAQRIKLVDELSKR--GK-GTLYILDEPTTGL
HNSDIERLLAVLNSLVDKGNSMIIIEHNIDVLKTVDHLIDLGPEGGDEGGMVVAQGTPQE
VAHNAKSYTGQYLKRAL-
>1814953751_1_669/1_519
-------------------------------LWQMPVDELLEFFSGMS--APMGDKT-TT
LLLDEIGSRLRYLIRVGLAYLDLDRPTRTLSGGEIQRVNLTTCLGASLVNTLFVLDEPSI
GLHPRDTGRLIGVMEDLRDKGNTLVVVEHEEAVIRAADHLVDIGPGRGEGGGELIFSGAP
AKIASK----KASLTGAYLTGRQAIAIPPKRRKPKR-gQQLKIVRASQNNLRDIDVTIPL
GVFCCVTGVSGSGKSTLVHKVLYENLMRDAGE-tLDEEPGRC-KAIRGAEKLGQVVMVDQ
SPLSRTPRSTPGVYTGAFELIRKLFAATDDAKAAGLTMGYFSFNSGTGRCERCWGNGFEK
IEMQFLSDIFVRCPECEGRRYGPDAMNYTLLGESIADVLDFTVGRAIEFFSQIDkklgrQ
VVDKLQVLAEVGLGYLRLGQPLNTLSGGESHRLKLVGHLLEGqgDASGDLLIFDEPTTGL
HFDDIRMLLNVFQRLVDAGNSLLVIEHNLDVIKTADHVIDLGPEGGVGGGELVGTGTPEA
LARNAQSHTGRYLAALL-

vagkaratzas avatar Apr 01 '25 11:04 vagkaratzas

Thanks for raising this aspect - I just managed to reproduce your example and added an additional parameter -htmlout issue_110.html so I can visually inspect the output file.

Looking at the image, you will see that column 34 is the first one without any gap. That column marks the "frontier" between the inner and outer part of the alignment on the left side. Therefore, any column selected for removal before that column will actually be removed. In contrast, any column after that one and before the last gap-free column (column 568) will be kept, even if it was selected for removal.
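To make the boundary rule above concrete, here is a minimal Python sketch. It is not trimAl's code: the function names, the 1-based column numbering, and the handling of an alignment with no gap-free column are assumptions made for illustration only.

```python
def gap_free_boundaries(rows, gap_chars="-."):
    """Return the 1-based indices of the first and last gap-free columns, or None."""
    length = len(rows[0])
    gap_free = [
        col + 1
        for col in range(length)
        if all(row[col] not in gap_chars for row in rows)
    ]
    if not gap_free:
        return None
    return gap_free[0], gap_free[-1]


def terminal_only(rows, flagged):
    """Of the columns flagged for removal, keep only those outside the boundaries."""
    bounds = gap_free_boundaries(rows)
    if bounds is None:
        # Assumption: with no gap-free column nothing is protected.
        return sorted(flagged)
    first, last = bounds
    return sorted(c for c in flagged if c < first or c > last)


# Toy alignment: columns 3 and 7 are the first and last gap-free columns.
rows = ["--AC-GT-", "A-ACGGTA", "-TAC-GT-"]
print(terminal_only(rows, [1, 2, 5, 8]))  # [1, 2, 8]
```

Column 5 is flagged but sits between the two boundaries, so it survives. Columns 1, 2 and 8 lie outside them and are removed, which is the same reason columns 28 and 29 disappear in the alignment from the original post.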

What do you think? Is the documentation not clear enough? Would you like to propose a way to make it better?

Image

scapella avatar Apr 01 '25 15:04 scapella

Damn, the internal boundaries (first and last columns without gaps) never registered with me. The explanation is clear enough; I just thought it worked a bit differently, like ClipKIT's --ends_only flag: https://jlsteenwyk.com/ClipKIT/advanced/index.html#ends-only It would be nice if there were a similar function in trimAl as well :)

vagkaratzas avatar Apr 01 '25 16:04 vagkaratzas

Thanks for validating the answer. I'd argue that this functionality existed before ClipKIT was first published.

Still, I'd like to give it a go and implement something similar to "ends-only." However, I don't understand the inherent rules for deciding whether a column should stay or not in the resulting alignment. Please provide more information.

Cheers,

S

scapella avatar Apr 01 '25 17:04 scapella

Thank you!

Given a gap threshold (either using a default value or -gapthreshold), the `ends-only` function should trim columns from the start of the MSA only until the first column that is below that gap threshold. Similarly for the end of the MSA, but in reverse. As an example, for an MSA with length 20, if columns [1,2,3,5,7,9,15,20] are above the gap threshold and are flagged to be trimmed, with the `ends-only` flag on, only columns 1, 2, 3 and 20 should be trimmed (because columns 4 and 19 are below the threshold). Another example: columns above the gap threshold [2,3,4,19] -> nothing gets trimmed, because columns 1 and 20 are below the gap threshold.
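The proposed rule can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not an existing trimAl or ClipKIT function: the name `ends_only` and its arguments (alignment length plus the 1-based columns already flagged by the gap threshold) are hypothetical.

```python
def ends_only(length, flagged):
    """Return the flagged columns that form an unbroken run from either end."""
    flagged = set(flagged)

    # Walk in from the left until the first column that is not flagged.
    head = []
    left = 1
    while left <= length and left in flagged:
        head.append(left)
        left += 1

    # Walk in from the right, stopping before the left run is reached.
    tail = []
    right = length
    while right >= left and right in flagged:
        tail.append(right)
        right -= 1

    return head + tail[::-1]


print(ends_only(20, [1, 2, 3, 5, 7, 9, 15, 20]))  # [1, 2, 3, 20]
print(ends_only(20, [2, 3, 4, 19]))               # []
```

Both calls reproduce the two examples given above. The scan stops at the first unflagged column on each side, so flagged columns in the interior are never trimmed.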

Does this make sense?

vagkaratzas avatar Apr 02 '25 05:04 vagkaratzas

I achieve this with colnumbering and using the first and the last columns with selectcols afterwards:

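# Collect the indices of the columns kept by -automated1 ({input} and {output} are file-path placeholders).
readarray -t COLS < <(trimal -in {input} -automated1 -out /dev/null -colnumbering | grep -o '[[:digit:]]*')
# Keep everything between the first and last kept column, so only the ends are trimmed.
trimal -selectcols { ${COLS[0]}-${COLS[-1]} } -complementary -in {input} -out {output}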

alephreish avatar Apr 27 '25 13:04 alephreish

@nicodr97 shall we try to incorporate it natively into the current codebase? It seems doable, considering how terminalonly works

scapella avatar Jun 07 '25 18:06 scapella

-terminalonly has been implemented, so this can be closed. Thank you!

vagkaratzas avatar Oct 13 '25 08:10 vagkaratzas