manyfold
How can I get `tfrecord` data of proteins?
Thank you for your work to help train/fine-tune AF2/OpenFold/pLMFold models.
I tried to run pLMFold's training on my own protein datasets, but couldn't figure out how to obtain the tfrecord data for the proteins.
Reading the paper, the supplementary data, and the README didn't help, because none of them describes in detail how to obtain the tfrecord data.
I tried to make use of AF2 modules to get those data.
It seems to work, but I found that some features described in the paper (`template_all_atom_exists` and `pdb_cluster_size`) are missing from the features generated by the corresponding AF2 code.
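To make the gap concrete, here is a minimal sketch of the kind of check that exposes it. The two required names are the ones mentioned above; the `generated` dict and its keys are only a hypothetical placeholder for whatever the AF2-based pipeline returns, and `find_missing_features` is an illustrative helper, not part of manyfold or AF2.

```python
def find_missing_features(generated, required):
    """Return the required feature names that are absent from `generated`, sorted."""
    return sorted(set(required) - set(generated))


# Feature names listed in the paper that I could not find in the AF2 output.
REQUIRED = ["template_all_atom_exists", "pdb_cluster_size"]

# Hypothetical placeholder for the feature dict produced by the AF2-based
# pipeline; the keys and values here are assumptions for illustration only.
generated = {
    "aatype": [0, 1, 2],
    "template_all_atom_masks": [[1.0]],
}

missing = find_missing_features(generated, REQUIRED)
print(missing)  # ['pdb_cluster_size', 'template_all_atom_exists']
```

Running the same comparison against the real feature dict and the full feature list from the paper would show exactly which entries still need to be produced before writing tfrecord files.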
How can I obtain the necessary features from my own protein data to train/fine-tune the model? Is there a tool for this?
I need your help.
Ref. #7
I have the same problem. Could you help me?
And if you used only AF2 modules to collect training data, could you tell me which AF2 modules (functions/methods) you used?