torchchem
Data structure for molecules
I noticed some existing code bases from previous work. Since we are heading towards using torch_geometric, which itself has pretty complete data structures for graphs, should we just use those directly? Otherwise, I suppose we need to write code to port our structures to torch_geometric-compatible ones.
Also, the naming is a bit unintuitive:
neural_fp.py is mostly about mol-graph data structures. I didn't see an ECFP (or other fingerprint) function.
transformer.py defines many torch nn modules, many of which can be found in torch_geometric, I believe.
I suggest organizing things the same way as deepchem, e.g. dividing the code into subfolders for data manipulation, fingerprints, nn models, etc.
@miaecle This seems like a good idea! Would you have any pytorch-geometric example code that does this?
Sorry, I copied in code from a couple of sources as a starting point, so it's all jumbled together. I have an open PR that cleans it up a bit into the deepchem structure. I'll go ahead and merge that in so the repo looks a little cleaner.
The neural_fp.py and transformer.py code is originally from https://github.com/gmum/MAT, so it hasn't been reworked yet.
@rbharath Sounds good, I can work on some basic data functions to port things into torch_geometric.
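As a rough idea of what such a port could look like, here is a minimal sketch that needs only the standard library. It lays a molecule out in the shape torch_geometric expects: a node feature matrix x and a COO edge_index of shape [2, num_edges], with each undirected bond stored as two directed edges. The names MolGraph, ATOM_TYPES, one_hot and to_graph are hypothetical and not part of this repo or of torch_geometric, and the atom/bond input format is an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MolGraph:
    """Hypothetical container whose fields mirror torch_geometric.data.Data."""
    x: List[List[float]]          # node features, [num_atoms, num_features]
    edge_index: List[List[int]]   # COO connectivity, [2, 2 * num_bonds]
    edge_attr: List[List[float]]  # edge features, [2 * num_bonds, 1]


# Illustrative atom vocabulary (an assumption, not the repo's featurization).
ATOM_TYPES = ["C", "N", "O"]


def one_hot(symbol: str) -> List[float]:
    """One-hot encode an element symbol over ATOM_TYPES."""
    if symbol not in ATOM_TYPES:
        raise ValueError(f"unsupported atom type: {symbol}")
    return [1.0 if symbol == t else 0.0 for t in ATOM_TYPES]


def to_graph(atoms: List[str], bonds: List[Tuple[int, int, float]]) -> MolGraph:
    """Build node and edge arrays from atom symbols and (i, j, order) bonds."""
    x = [one_hot(a) for a in atoms]
    src: List[int] = []
    dst: List[int] = []
    edge_attr: List[List[float]] = []
    for i, j, order in bonds:
        # An undirected bond becomes two directed edges: i -> j and j -> i.
        src += [i, j]
        dst += [j, i]
        edge_attr += [[order], [order]]
    return MolGraph(x=x, edge_index=[src, dst], edge_attr=edge_attr)


# Heavy atoms of ethanol: C-C-O with two single bonds.
g = to_graph(["C", "C", "O"], [(0, 1, 1.0), (1, 2, 1.0)])
```

With torch and torch_geometric installed, these lists should be wrappable as torch_geometric.data.Data(x=torch.tensor(g.x), edge_index=torch.tensor(g.edge_index, dtype=torch.long), edge_attr=torch.tensor(g.edge_attr)). In practice the atoms and bonds would probably come from a parsed molecule rather than hand-written lists.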
@miaecle Great! That would be a very valuable contribution :)