ArDCA.jl

What's the location of the trained model when generating a sample from a given MSA?

Open psp3dcg opened this issue 11 months ago • 5 comments

Thanks for the really nice package! A question (rather than an issue): is there an official trained model, and if so, where is it located in the package? Thank you :)

psp3dcg, Mar 12 '24 13:03

Hi @psp3dcg. Sorry for the late reply. We do not ship any "officially trained" model, mainly because training is very fast. How large is the dataset you are interested in?

pagnani, Mar 25 '24 12:03

OK, thank you for the reply~

psp3dcg, Mar 25 '24 12:03

Hi @psp3dcg, I saw you reopened this. Anything else I can do here?

pagnani, Mar 29 '24 16:03

Oh, yes. I'd like to ask for the names of the protein families you used for training, other than PF00072 and PF00196, and the number of sequences in each family. Thank you :)

psp3dcg, Mar 31 '24 14:03

We used ArDCA for many protein families. In the README of the package we mention a companion repo, ArDCAData, with data from 5 protein families (PF00014, PF00072, PF00076, PF00595, PF13354). If you need more, MSAs for all protein families can be found on InterPro.

pagnani, Apr 02 '24 08:04
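
Since the thread asks for the number of sequences per family, one quick way to check this for any downloaded alignment is to count its FASTA header lines. The following is a minimal sketch in Python, not part of ArDCA.jl. It assumes the MSA is stored in FASTA format, optionally gzip-compressed with a `.gz` suffix; the function name `count_fasta_sequences` and the file name in the usage comment are hypothetical.

```python
import gzip
from pathlib import Path


def count_fasta_sequences(path):
    """Count records in a FASTA file by counting lines that start with '>'."""
    path = Path(path)
    # Assumption: a ".gz" suffix means the file is gzip-compressed.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt") as handle:
        return sum(1 for line in handle if line.startswith(">"))


# Hypothetical usage (the file name is an illustration only):
# n = count_fasta_sequences("PF00014.fasta.gz")
```

This counts raw records only. It does not filter out sequences with many gaps or apply any reweighting, so the result may differ from an effective sequence count reported by a training tool.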