graphium
Graphium: Scaling molecular GNNs to infinity.
Implemented in C++ for featurization and preprocessing optimizations, along with a few other optimizations, significantly reducing memory usage, disk usage, and processing time for large datasets. ## Changelogs - _enumerate...
## Changelogs - Relaxing the constraints on the Torchmetrics version - Changing the code to use the `update` and `compute` methods to avoid memory issues with large validation sets --- _Checklist:_ -...
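To illustrate why the `update`/`compute` pattern helps with large validation sets, here is a minimal sketch in plain Python. `RunningMAE` is a hypothetical stand-in, not the Torchmetrics API and not Graphium's code; it only mimics the idea of folding each batch into a small running state instead of keeping every prediction in memory until the end of validation.

```python
class RunningMAE:
    """Hypothetical stand-in mimicking the update/compute pattern.

    Only two numbers are kept as state, so memory use does not grow
    with the size of the validation set.
    """

    def __init__(self) -> None:
        self.abs_error_sum = 0.0
        self.count = 0

    def update(self, preds, targets) -> None:
        # Fold one batch into the running state, then let the batch go.
        for p, t in zip(preds, targets):
            self.abs_error_sum += abs(p - t)
            self.count += 1

    def compute(self) -> float:
        # Produce the final value from the accumulated state.
        if self.count == 0:
            raise ValueError("compute() called before any update()")
        return self.abs_error_sum / self.count


metric = RunningMAE()
# Illustrative batches of (predictions, targets); the values are made up.
batches = [([1.0, 2.0], [1.5, 2.0]), ([3.0], [2.0])]
for preds, targets in batches:
    metric.update(preds, targets)
mae = metric.compute()
```

The alternative, collecting all predictions and targets in a list and computing the metric once at the end, needs memory proportional to the number of validation samples.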
A new `ordered=True` option has been introduced to deal with an RDKit issue where the SMILES were not re-ordered correctly. 1. Document in Datamol that `ordered=True` is also useful when...
My understanding is that assert statements can be removed at compile time (to improve performance, which seems specifically relevant to Graphium), so any error that a downstream user can make...
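As background for the point above: in Python, `assert` statements are skipped when the interpreter runs with the `-O` flag, so a check that guards against user error is usually safer as an explicit `raise`. The sketch below shows that pattern; `check_num_layers` and its parameter are hypothetical names chosen for illustration, not functions from Graphium.

```python
def check_num_layers(num_layers: int) -> int:
    """Validate a user-supplied layer count (hypothetical helper).

    An explicit raise still runs under `python -O`, whereas
    `assert num_layers >= 0` would be stripped and the bad value
    would pass through silently.
    """
    if not isinstance(num_layers, int) or num_layers < 0:
        raise ValueError(
            f"num_layers must be a non-negative integer, got {num_layers!r}"
        )
    return num_layers


validated = check_num_layers(0)  # zero layers is accepted in this sketch
```

Keeping `assert` for internal invariants and reserving explicit exceptions for input validation is one common convention; whether it fits a given code path is a judgment call.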
Replace the `pre-nn` and `pre-nn-edges` by the `MLPEncoder`. Also allow `pre-nn-graph` and `pre-nn-atten` (I think this would come naturally). In `MLPEncoder`, replace the use of the `MLP` class by the...
We should try to remove that after. I think it's because of `torchmetrics` _Originally posted by @DomInvivo in https://github.com/datamol-io/graphium/pull/510#discussion_r1567440176_
## Changelogs - Allow having 0 GNN layers (i.e. only the task heads) on the last IPU in a pipeline split --- _Checklist:_ - [ ] _Was this PR discussed...
## Changelogs - _enumerate the changes of that PR._ --- _Checklist:_ - [ ] _Was this PR discussed in an issue? It is recommended to first discuss a new feature...
## Changelogs - [x] Find the canonical ordering of every molecule - [x] Find a mapping between the order of the featurized molecule, and the molecule associated to the labels...
Appendix D of the Tensor Programs V paper contains a number of practical suggestions for using muP which we would do well to consider, such as: - fixing the dimension...