foldseek
MGnify database hits not found on MGnify website
On the Foldseek website, many of the hits with the highest bit scores are from the MGnify database, but I cannot find these proteins on the MGnify website when searching by either the identifier (MGYP + digits) or the protein sequence.
Could you please post a link to a search result?
Is the file I uploaded appropriate?
I tried to access the top three hits on the ESM Atlas and could open all of them: (1) https://esmatlas.com/explore/detail/MGYP001476177674 (2) https://esmatlas.com/explore/detail/MGYP001575997564 (3) https://esmatlas.com/explore/detail/MGYP002782136678 Which ID does not work?
@twaksman001 The ESM2 models are not integrated into the MGnify database and are not easily searchable on the EBI site, unlike the AlphaFold2 models, which are integrated with the EBI sequence databases.
If you want to download the model, either access it through the links in the previous comment by @martin-steinegger or use the ESM Atlas API (see: https://esmatlas.com/about#api),
as in:
aria2c https://api.esmatlas.com/fetchPredictedStructure/MGYP001476177674
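For fetching several hits at once, the same request can be scripted. The following is only a minimal Python sketch: the endpoint is the one from the command above, while the helper names `esm_atlas_url` and `fetch_structure`, the ID check (MGYP followed by digits only), and the assumption that the response is a text structure file are mine and not taken from the ESM Atlas documentation.

```python
import re
import urllib.request

# Endpoint taken from the aria2c command above.
ESM_ATLAS_FETCH_URL = "https://api.esmatlas.com/fetchPredictedStructure/"

# Assumption: an MGnify protein ID is "MGYP" followed by digits only.
MGYP_PATTERN = re.compile(r"MGYP\d+")


def esm_atlas_url(mgyp_id: str) -> str:
    """Build the download URL for one MGYP identifier (hypothetical helper)."""
    if not MGYP_PATTERN.fullmatch(mgyp_id):
        raise ValueError(f"not an MGYP identifier: {mgyp_id!r}")
    return ESM_ATLAS_FETCH_URL + mgyp_id


def fetch_structure(mgyp_id: str, timeout: float = 30.0) -> str:
    """Download one predicted structure and return the response body as text.

    Assumption: the API answers with a plain-text structure file.
    """
    with urllib.request.urlopen(esm_atlas_url(mgyp_id), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_structure("MGYP001476177674")` should return the same content that the aria2c command saves to disk; looping over the Foldseek hit IDs and writing each result to a file gives a simple batch download.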
@yeojingi you might be able to help here.
The MGYP ID was introduced in the latest MGnify paper, but it seems that the MGnify website does not support interactive exploration of MGYPs. You can, however, access the FTP storage and download annotations such as Pfam and biomes. For information about the predicted structures (pLDDT, pTM, model versions), the ESMFold documentation provides the metadata.