foldcomp
Subsetting databases
Hi,
Thank you for the great resource!
I am having trouble subsetting databases and decompressing subsets of the databases you provide here: https://foldcomp.steineggerlab.workers.dev
According to the instructions, I should be able to decompress a subset of a database given an "id_list.txt".
This is what I do, e.g. for A. thaliana:
head -n 1 data/a_thaliana.lookup
0 AF-A0A178UFC4-F1-model_v4.pdb 0
As I understand it, the ID here is "AF-A0A178UFC4-F1-model_v4".
I then write this ID into a file called id_list.txt and run the command:
foldcomp decompress --id-list id_list.txt data/a_thaliana
which gives the response:
Decompressing files in data/a_thaliana using 1 threads
Output directory: data/a_thaliana_pdb/
[Warning] AF-A0A178UFC4-F1-model_v4 not found in database.
I have tried many different ways of writing the IDs based on what is in a_thaliana.lookup, but nothing seems to work. The same happens when I use mmseqs to subset the database:
"""
createsubdb --subdb-mode 0 --id-mode 1 id_list.txt a_thaliana test_sel/output_foldcomp_db

MMseqs Version: ad6dfc66d7bbc4fd626fc19adf10ba587bc137c4
Subdb mode 0
Database ID mode 1
Verbosity 3

Could not find name AF-A0A178UFC4-F1-model_v4 in lookup
Time for merging to output_foldcomp_db: 0h 0m 0s 1ms
Time for processing: 0h 0m 0s 34ms
"""
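To help narrow this down, here is a minimal Python sketch that compares the entries of id_list.txt against the name column of the lookup file. It assumes the .lookup file is whitespace-separated with the name in the second column (as in the line shown above); the function names `read_lookup_names` and `diagnose_ids` are hypothetical helpers, not part of foldcomp or mmseqs. It only reports whether each ID matches a lookup name exactly or only after dropping a file extension such as ".pdb". It does not establish which form foldcomp or mmseqs actually expects.

```python
from pathlib import Path


def read_lookup_names(lookup_path):
    """Return the second column (assumed to be the entry name) of a .lookup file."""
    names = []
    for line in Path(lookup_path).read_text().splitlines():
        fields = line.split()  # assumption: whitespace-separated columns
        if len(fields) >= 2:
            names.append(fields[1])
    return names


def diagnose_ids(ids, names):
    """Classify each requested ID as 'exact', 'extension', or 'missing'.

    'extension' means the ID equals a lookup name once the text after the
    last dot (e.g. '.pdb') is removed from that name.
    """
    exact = set(names)
    by_stem = {}
    for name in names:
        stem = name.rsplit(".", 1)[0] if "." in name else name
        by_stem.setdefault(stem, name)

    report = {}
    for wanted in ids:
        if wanted in exact:
            report[wanted] = ("exact", wanted)
        elif wanted in by_stem:
            report[wanted] = ("extension", by_stem[wanted])
        else:
            report[wanted] = ("missing", None)
    return report


# Example usage with the paths from this post:
# names = read_lookup_names("data/a_thaliana.lookup")
# ids = Path("id_list.txt").read_text().split()
# for wanted, (status, match) in diagnose_ids(ids, names).items():
#     print(wanted, status, match)
```

For the example above, the ID "AF-A0A178UFC4-F1-model_v4" would be reported as an "extension" match for the lookup name "AF-A0A178UFC4-F1-model_v4.pdb", which at least shows that the two strings are not identical.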
Can you please explain what I am doing wrong and how to properly specify the IDs?
Best,
Patrick