ColabFold
Questions about using the MSA part alone
Thank you for sharing the ColabFold repo; it helps a lot.
Are there any APIs or methods for generating an MSA from a protein sequence alone, without running structure prediction?
To try the MSA API, I ran run_mmseqs2() on its own and got:
>101
MMTRRKKRSCSNQKKEEEKSERIPFDLVIEILLRLPVKSIARFRYVSKLWQSTLRGQHFTESYLTISSSRPKILFTCLKDCETFFFSSPHPQDLSPIAANLHMSFPISCPSNICRPVRGWLCGLHQRTTKGTTVTEPLICNPSTGESVVLRKVKTRRKGVISFLGFDPIDKNFKVLCMTRSCIGRADSEEHQVHTLETGKKPSRKMIECDILHYPVPVEHTNGFSQYDGVCINGVLYYLAIVHGVSDDRYPDVVCFEFGSDKFKYIKKVAGHDMEILYLGRRLNSILVNYKGKLAKLQPNMPNNVCTGIQLWVLEDAEKHEWSSHIYVLPPPWRNVYEETKLCFVGTTRKGEIVLSPNTISDFFYLLYYNPDRNTITIVKIKGMETFQSHKAYTFLDHLEDVNLVPIWRM
>A0A654FJR0 459 0.992 2.597E-140 0 409 410 0 409 410
MMTRRKKRSCSNQKKEEEKSERIPFDLVIEILLRLPVKSIARFRYVSKLWQSTLRGQHFTESYLTISSSRPKILFTCLKDCETFFFSSPHPQDLSPIAANLHMSFPMSCPSNICRPVRGWVCGLHQRTTKGTTVTEPLICNPSTGESVVLRKVKTRRKGVISFLGFDPIDKNFKVLCMTRSCIGRADSEEHQVHTLETGKKPSRKMIECDILHYPVPVEHTNGFSQYDGVCINGVLYYLAIVHGVSDDRYPDVVCFEFGSDKFKYIKKVAGHDMEILYLGRRLNSILVNYKGKLAKLQPNMPNNVCTGIQLWVLEDAEKHEWSSHIYVLPPPWRNVYEETKLCFVGTTRKGEIVLSPNTISDFFYLLYYNPDRNTITIVKIKGMETFQIHKAYTFLDHLEDVNLVPIWRM
>UPI000A29B610 417 0.881 1.390E-125 0 403 410 0 404 414
MMTRRKKRSCSNPKKEEvVKSEPIPFDLVIEILLRLPAKSIARFRYVSKLWQSTLRGPHFTESFLTLSSSRPKILFTCLKDGETFFFSSPHTQDLSPISANIHMSFPVNCPSNICRPVHGWVCGSHQRTTKGTTVTVPLICNPSTGESLALCKVKTRRKGVISFLGFDPIDKKFKVLCMTRAYVGRADSEEHQVLTLETGKKPSRKMIECDILHYPTVVEHTNGFSQYDGVCINGVLYYLAIVHGVSDHRYPDVVCFEFRSDKFNYIKKVAGPGME-MYLRGQLDSTLVNYKGKLAKLQPNMSNNgVCTGIQLWVLEDAEKHEWSSHIYVLPPPWRNVYEETKLCFVGTTRKGEIVLSPNTISNFFYLLYYNPERNTITIVKIKGLETFKSHKAYTFVDHLEDVKL------
>D7LS85 414 0.881 1.251E-124 1 403 410 0 403 714
-MTRRKKRSCSNPKKEEvVKSEPIPFDLVIEILLRLPAKSIARFRYVSKLWQSTLRGPHFTESFLTLSSSRPKILFTCLKDGETFFFSSPHTQDLSPISANIHMSFPVNCPSNICRPVHGWVCGSHQRTTKGTTVTVPLICNPSTGESLALCKVKTRRKGVISFLGFDPIDKKFKVLCMTRAYVGRADSEEHQVLTLETGKKPSRKMIECDILHYPTVVEHTNGFSQYDGVCINGVLYYLAIVHGVSDHRYPDVVCFEFRSDKFNYIKKVAGPGME-MYLRGQLDSTLVNYKGKLAKLQPNMSNNgVCTGIQLWVLEDAEKHEWSSHIYVLPPPWRNVYEETKLCFVGTTRKGEIVLSPNTISNFFYLLYYNPERNTITIVKIKGLETFKSHKAYTFVDHLEDVKL------
>UPI00053981B0 361 0.708 2.221E-106 0 405 410 0 407 413
MMTRRKTRSCSNPRKEEvVKPEPIPFDLVIEVLLRLPVRSVARCRSVSKLWNSTLEGPHFTELFFTLSSSRPKILFTCLKGDETVFFsSSRNPQDLS-IDANIRMSFPINSSSHICRPVRGWLCGLH-RTTKGATVTVPLICNPSTGESVPLPTVKTRRKVVISFFGYDPMEKTFKALCMTRSSVGGEDTPsgEHQVLTLGTGKTsSSREMIDCDILHHPAVVEETNGFCQYDAICINGVLYYLAVVHDVFDG-HPDIICFEFESKKFSYIKK-ADHSMGMYSGGYGLESTLVNYKGKLTQLQPNYSnDRICNGIQLLVLEDAAKHRWSTYIYVLPPPWRNMYRDTKLCFVGTTSRGEIVLSPNTISRFFYLLYYSPERNTIQIVKIKGLETFKGHKAYIFLDHVEDVKLVP----
>UPI00053A3292 360 0.698 5.686E-106 0 405 410 0 409 415
MMTRRKTRSCSKPRNEEEvvKPEPIPLDLVIEVLLRLPVRSVARCRSVSKLWNSTLEDPHFTESFFTLSSSRPKILFTCLKGDETVFFsSSPNLQDLS-TYANIRMSFPINSSSHICRPVRGWVCGLH-RTTKGATVTVPLICNPSTGESVALPTVKTRRKVVICFFGYDPIEKTFKALCMTRSSLGGEDTPsgEHQVLTLGTGKTsSSREMIDCDILHHPAVVEETNGFCQYDAICINGVLYYLAVVHDVFDG-TPDIVCFEFESQKFSYIKK-ADHGMGMYSGGYGLESTLVNYKGKLAKLQPNYSnvDRIYDGIQLLVLEDAAKHQWSSYIYVLPPPWRNIYKDDTLCFVGTTSKGEIVMSPNTISGFFYLVYYSPERNTIQIVKIKGLETFKGHKAYTFLDHVEDVKLVP----
>UPI000CD4DC39 359 0.691 1.455E-105 0 403 410 0 405 413
MMTRRKTRSCSNPRNEEVKPEPIPFDLVIEILLRSPVKSIARFRKVSKLWESTLRGPQFTESFFTLSWSRPKILFTCLKDGETVFFSLpqPHPQDPSIITANIHMSFPINCSSHICRPVRGLVCGLHRRKTKGATSTVPLICNPSTGESFPLHKVNTRRKAVISFFGYNPIDKSFKVLSMTRSSGGLSHSGEHQVLTFKTGTKgSSRKMIECDILHHPSVVEQTNGFCQYDGICINGALYYLAVVYAVS-NGYPDVVRFDLESEKFSYIK--RADHVVETYSGGHLEPTLVNYKGRLGKLHPSYSnDRACTGIQLLVLEDAGKHQWSSYIYVLPPPWMNIYDvKTKFCFVGTTVEGDIVLSPNTISDFFYLLFYSPERNTINIVGIKGMESFKGHKAYAFLDHVEDVKL------
>UPI000CED1DFB 359 0.656 1.455E-105 0 403 410 0 396 427
MMPRRKTRSF------LAKIEPIPFDLVIEILLRLPVKSIATFRRVSKLWASTLRDPSFTESYLTISSSRKKLLFTCLKDDETCFFsSSPNSQSPSSdISAKVHMSFPINCPTNICRPVRGLVCGLNQrRPSKGRTVTVPLICNPSTGQSLALPDVRTRGKRVISCFGYDPIDKQFKVLCMTLPYVGXSSSQDHQVLTLGTQKKPSWKMIKCEVPHIPVDFEHTNG-----GVCINGVLYYLAILLHVdaYTDGYFDIVSFDIRSEKFSYIKT--AVTGMRIHXGEKLESTLVNYKGKLAKLQRNIDDYgTYTGIQLWVLEDAEKHEWSSYIYVLPPPWKNIFEETTLCFVGTTSKGEIVLSPNTISDSFYLLYHNPETKTITKVGVQGMEAYKGHKAYTFLDHVEDVTL------
>A0A6D2IA77 357 0.629 6.970E-105 1 404 410 0 406 410
-MTRRRKAR-SLPVV---VFEQIPFDLVIEILLRSPVKSIGRFRSVSKLWESTIRSPDFKESFRAISSSRNNLLFTCLKDGETYFFSSPrpqvHPQKLpSPIAANVHMSFPINCPTGVCRPVRGLVCGLRQQTSKEGTVTVPLICNPSTGESLALPKVRTTKKGVMSCFGYDPIDKQFKVLCMTLSSEGGlPNSAEHQLLTLeEAKEKHSWKMIECYVRHYPYFAEHTNGFYLHDGICINGVLYYVAIVFHDEFDGYPDIACFDIRSEKFSYIK--KADEGMNVNVGEKLESTLVNYKGKLAKLQPNIGNNneGYNGIQLWVLEDAEKHEWSRHIYVFPLHRKSIFEKTRLCFVGTTSTGEIVLSPNTISDSFYLIYYNPERNTLKRVEVRGMEAYKSCKAYTFLDHVEDVTLL-----
>UPI00053B3BD0 353 0.694 1.168E-103 0 405 410 0 408 414
MMTRRKTRSCSKPRNEEvVKPEPIPFDLVIEVLLRLPVRSVARCRSVSKLWNSTLEGPHFTESFFTLSSFRPKILFTCLKGDETVFFsSSPNPQDLS-IAANIRMSFPINSSSHICRPVRGWLCGLH-RTTKGATVTVPLVCNPSTGESVPLPTLKTRRKVVITFFGYDPIVKTFKALCMTRSSVGGEDSPsgEHQVLTLGTGKTsSSREMIDCHILHHPAVVEETNGFCQYDAICINGVLYYLAVVHDVFEG-HPDIVCFEFESKKFSYIKK-ADHSMGMYSGGYGLEPTLVNYKGKLAKLQPSYSnvDRICNGIQLLVLEDAVKHQWSSYIYILPPPWMNLYGDTKLCFVGTTSRGEIVLSSNTISRFFYLLYYSPERNTIQIVKVKGLETFKGHKAYIFLDHVEDVKLVP----
>A0A654EDH0 340 0.591 4.906E-99 0 401 410 0 402 412
MKTRRNTRSCSNsSKREEKNSETIPFDLVIEILTRLPVKSIARFRCLSKLCASTLNNPDFTESFFTISSSRPKLLFTCPKDGETFFFSSPKPRDSSPLVVNFHMSFSINHLCGICRPVCGFIYGFNSHtNLKGRTISKPLICNPSTGESWPLPRVKTNRTIITSFFGYDPINKEFKVLCMTKSKFG--VFGEHQVLTFGTGKELSWRKIKCDMAHYPEVVDYeASGYPRplYDGICINGVLYYLGRVHDDLDG-FPDMVCFDIKFEKFSYIKKANGMKRN---SGVNLQPTLVNHKGKIAKLQANIGPGsiRYTGIQLWVLEDAEKHQWSSYIYVVPPPWKNIIEETKLRFVGTSDTGDIVLSPCNISNSFYLLYYNPERNAIARVEIQGMEAFKTHKSYAFLDYAENI--------
>UPI000CD4B51D 334 0.548 3.923E-97 0 406 410 0 409 412
MKTRRNTRSCSNSRNRAEKTSetiHLPFDLVIEIFMRLPAKSVARFHCLSKLCASTLSNPNFTDAFFIRSSSRPKLLFNCPKDGETFFFSSPKPrDDSSPLAVSFHKSFPINRPFDICRPVSGFVYgFNYHKTSTGRTVSVPLICNPSTGQSWTLPSVKTNRTIITSYFGYDPIDKEFKVLCMTQSYLGEF--GEQKVLTLGTGKKLSWRKIKCDMQHFPCPVEGEPnhHYPLYDGICIDGVLYYLGMVRGDADG-FPDIICFDIKSEKFSYVK--KTHGMER-NSGSVLEQTLVNYKGKIAKFQPkfDEHGTILTGIQLYVLEDAEKHQWSSYIYVMPPPWKSIVEETKLRFVGTSDTGEIVLSPYNISDSSYLLSYDPERNTLTKVGIKGMEALKPHKSYAFLDHVENVVKIEP---
>A0A087GJI2 325 0.479 5.230E-94 22 403 410 27 395 403
----------------------LPIDLIIEILSRLPAKSIARCRCVSKLWGSIIRSQVFTELVLTRSATtQPHLLFACEKNGEVFFYSSPQNpyEKSSPITANYHMKFPFDDDDFVLRPVHGLICLKQIRIFKGRNTTALMICNPSTGQSLTLPRVKTRRVDVMSFLGYDPVGKQFKLLSMTSSISGSnRVSAEHQILTLGNGKL-SWRKIECSTPHYPL----------SRGICINGVLYYPAEDKCI--EGKFRIACFDIRSEKFKLIKRVDEV----------VRGKLVNYKGKLATLRTDtSPFSICRrsrSFELCVLEDAEKHEWSTHTYVLPPLSTDLVSSSGMFFQGVTRRGEIVLSPpsYYPSDPFYLLYYNLERNTFVKVEIQGIHMHVRHKVYTFVDHVENVKL------
>UPI000901B5A9 325 0.558 5.230E-94 3 402 410 15 415 416
---RRRKRKGEERERRERERERVPFDLVIEILTRLPAKSVARFRCLSKVCASTLSNPVFIDSFSTISSSRPKLLFTCPKDGKTFFFSSPKPRDSSPLAVDFQTSFPINRPFDICRPVCGFVYGFNIHKTKGRTVSVPLICNPSKGKSWTLPRVKTNRTIITSYFGYDPicXDKEFKLLCMTRSKFGFF--EEHQVLTLATGKKLSWRKIECDMAHSPCPcpveGEAGHNYPLYDGICINGVLYYLGMVF----DGFPDIICFDIKSEKFSYAKKAHGME---LNSGSKLQPTLVNYKGKIAKFQPNFNPDYTliTGIQLWVLEDAEKHQWSSYIYVMPPPWKDIIEETKLRFVGASDTGEIVLSPYNisESDSSYLLYYDPERNTMTRVGIQGMEALKSHKAYAFLDHVKNIS-------
>V4L0Y9 322 0.625 4.670E-93 0 403 410 0 361 364
MMPRRKTRSF------LAKIEPIPFDLVIEILLRLPVKSIATFRRVSKLWASTLRDPSFTESYLTISSSRKKLLFTCLKDDETCFFsSSPNSQSPSSdISAKVHMSFPINCPTNICRPVRGLVCGLNQrRPSKGRTVTVPLICNPSTGQSLALPDVRTRGKRVISCFGYDPIDKQFKVF-----------SQDHQVLTLGTQKKPSWKMIKCEVPHIPVDFEHTNG-----GVCINGVLYYLAILLHVDAYTDGY---FDIG-EK--------------------LESTLVNYKGKLAKLQRNIDDYgTYTGIQLWVLEDAEKHEWSSYIYVLPPPWKNIFEETTLCFVGTTSKGEIVLSPNTISDSFYLLYHNPETKTITKVGVQGMEAYKGHKAYTFLDHVEDVTL------
>A0A5S9X2X0 321 0.449 1.193E-92 15 404 410 17 386 387
...
...
>MGYP000274109773 42 0.261 8.283E+00 22 61 410 15 56 204
----------------------LPFELACQILtsEHLDAMSLVRSSQVCKSWKQMCDNDEIWRK------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>M4FBP7 42 0.350 8.283E+00 8 79 410 34 110 293
--------SKQKNTSDETNSDPFPSDLLMEILKLFPVKTLARLTCVSKLWASTIRRqefNKLWSssNQQRRSSSSNTLIFAFKRD------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
My question: how should I interpret the output above, which run_mmseqs2() returned after calling the API?
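For context, the output looks like A3M text: the first record is the query, and each following record is one hit aligned to it. In A3M, lowercase letters are insertions relative to the query and "-" marks a gap, so removing the lowercase letters should leave every row with the same length as the query. Below is a minimal sketch of how the text could be parsed. The helper names (parse_a3m, to_aligned, parse_hit_header) are hypothetical and not part of ColabFold, and the meaning given to the nine numbers in each hit header (score, sequence identity, E-value, then query and target start/end/length) is my assumption about how MMseqs2 writes these headers, not something confirmed here.

```python
import string
from typing import Dict, List, Tuple

# Translation table that deletes lowercase letters (A3M insertion states).
_DROP_LOWER = str.maketrans("", "", string.ascii_lowercase)

# ASSUMPTION: names for the numeric header fields, in order. These are a
# guess at the MMseqs2 A3M header layout, not taken from ColabFold docs.
_ASSUMED_FIELDS = [
    ("score", int), ("seq_identity", float), ("e_value", float),
    ("q_start", int), ("q_end", int), ("q_len", int),
    ("t_start", int), ("t_end", int), ("t_len", int),
]


def parse_a3m(text: str) -> List[Tuple[str, str]]:
    """Split A3M text into (header, sequence) pairs, query first."""
    records: List[Tuple[str, str]] = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def to_aligned(seq: str) -> str:
    """Drop insertions (lowercase) so the row matches the query length."""
    return seq.translate(_DROP_LOWER)


def parse_hit_header(header: str) -> Dict[str, object]:
    """Turn a hit header into a dict using the ASSUMED field names."""
    fields = header.split()
    info: Dict[str, object] = {"id": fields[0]}
    if len(fields) == 1 + len(_ASSUMED_FIELDS):
        for (name, cast), raw in zip(_ASSUMED_FIELDS, fields[1:]):
            info[name] = cast(raw)
    return info


# Tiny made-up example (not real search output).
example = """>query
MKTAY
>hit1 42 0.800 1.0E-05 0 4 5 0 5 6
MKvTA-
"""

records = parse_a3m(example)
query_len = len(records[0][1])
for header, seq in records[1:]:
    row = to_aligned(seq)
    print(parse_hit_header(header)["id"], row, len(row) == query_len)
```

If that reading of the format is right, the rows with long runs of "-" near the end of the output are hits that cover only a short stretch of the query, and their large values in the fourth header field would be weak E-values.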