(new feature) large molecule multi-objective fluorescent protein oracle

Open samuelstanton opened this issue 2 years ago • 4 comments

Describe the problem

It would be great to add the large molecule fluorescent protein task introduced here as a molecular generation oracle. Code is available here.

Describe the solution you'd like

In brief, the oracle would return the folding stability (-dG) and solvent-accessible surface area (SASA), given a primary amino acid residue sequence and a reference PDB. The oracle would first translate the input sequence into a list of substitution mutations relative to the reference, then use FoldX to generate a predicted structure and its predicted folding stability. SASA can then be computed from the predicted structure using BioPython.
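As a rough illustration of the first step only (sequence to substitution list), here is a minimal sketch. The function names, the default chain ID `A`, and the assumption that residue numbering starts at 1 with no gaps are all hypothetical and are not taken from the linked code. The mutation string layout (wild-type residue, chain, position, mutant residue) follows my understanding of the FoldX individual-list convention and should be checked against the FoldX documentation. The FoldX call and the SASA computation are deliberately left out.

```python
from typing import List

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def sequence_to_mutations(
    wild_type: str,
    candidate: str,
    chain: str = "A",        # assumed chain ID of the reference structure
    first_residue: int = 1,  # assumed residue number of the first position
) -> List[str]:
    """Return substitution mutations that turn `wild_type` into `candidate`.

    Each mutation is formatted as <wild-type><chain><position><mutant>,
    e.g. "SA2A" (serine at position 2 of chain A replaced by alanine).
    Only equal-length sequences are supported, since insertions and
    deletions cannot be expressed as substitutions.
    """
    if len(wild_type) != len(candidate):
        raise ValueError("sequences must have the same length")

    mutations = []
    for offset, (wt, new) in enumerate(zip(wild_type, candidate)):
        if new not in AMINO_ACIDS:
            raise ValueError(f"unknown residue {new!r} at index {offset}")
        if wt != new:
            mutations.append(f"{wt}{chain}{first_residue + offset}{new}")
    return mutations


def to_individual_list_line(mutations: List[str]) -> str:
    """Join mutations into one comma-separated, semicolon-terminated line."""
    return ",".join(mutations) + ";" if mutations else ""


if __name__ == "__main__":
    muts = sequence_to_mutations("MSKGE", "MAKGD")
    print(muts)
    print(to_individual_list_line(muts))
```

In a real reference PDB the residue numbering may not start at 1 and may contain gaps, so a full implementation would need to read positions from the structure rather than derive them from the string index.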

Additional context

I would potentially be willing to make a PR if this is something the maintainers would approve.

samuelstanton Apr 08 '22 22:04

Hi Samuel, this sounds like a great oracle to include in TDC! We usually implement individual oracle functions here, while the tdc.Oracle interface is the wrapper around all oracles. Let us know if you need further pointers; happy to discuss more!

kexinhuang12345 Apr 14 '22 03:04

Hi @samuelstanton, any thoughts on this? No rush!

kexinhuang12345 Apr 27 '22 16:04

Hi Kexin, I haven't had time to draft a PR because of some impending deadlines. I'm hoping to defend my dissertation in July, so realistically it'll probably be a while before I can give this proper attention. If you like, we can close the issue for now or leave it open until I have more time. Up to you.

samuelstanton Apr 28 '22 15:04

Good luck with your defense, and no rush!

futianfan Apr 28 '22 16:04