skipatom
Is there going to be a Matbench submission related to skipatom?
Noticed that you mention the Matbench dataset in the README
Not exactly the right place for this, but I noticed that you starred the CrabNet repo. You might be interested in a more portable, extensible version that I'm working on bringing into the main repository (I was recently granted write access to do this): https://github.com/sparks-baird/CrabNet (docs)
Hi @sgbaird! I do plan to make a submission to Matbench related to SkipAtom. This is on my TODO list, and I hope to get to it shortly. Thanks for the link to the refactored CrabNet repo. I have been using the CrabNet approach in another project I'm working on, and it appears to outperform the ElemNet and IRNet models on the tasks in that project. My project uses Keras, so I ported over the original PyTorch CrabNet implementation to Keras. I use the SkipAtom embeddings as inputs to CrabNet, and they work well. This is all in a private repo at the moment, but I plan to release it eventually. I look forward to seeing CrabNet evolve!
I've put together a number of Matbench submissions so far - happy to help if you run into issues. I'm curious to see if skipatom would be chosen during a recent high-dimensional hyperparameter optimization study I did: https://doi.org/10.48550/arXiv.2203.12597
That's awesome to hear that you ported the PyTorch CrabNet over to a Keras implementation. Excited to see that when it becomes public. Thanks for checking in on these issues!
Another repository that might be of interest to you (if you're not already aware) is RooSt, currently housed in the aviary.
I will definitely reach out to you once I begin looking at the Matbench submission @sgbaird, thanks! That's a very interesting and useful study you put together, thanks for sharing!
Regarding RooSt, I haven't looked into the approach as much as I have for CrabNet, but it's certainly a very interesting approach as well.
Thanks as well for submitting the issues, questions and comments. I'm always eager to collaborate with others, so please feel free to reach out with any more issues or suggestions you may have!
@lantunes fantastic, and thank you! Please do reach out; I'm happy to collaborate as well! I think @sp8rks would be excited to hear about your re-implementation of CrabNet in Keras.
One thing to keep in mind for a Matbench submission (e.g., for the matbench_expt_gap task): as mentioned in the paper linked above, I left out certain elemental featurizers (e.g., Oliynyk) because they didn't have feature vectors for all of the atoms represented in the Matbench dataset. I didn't track how many atoms weren't represented, as that was peripheral to the main message; however, at first glance at the CSV #6, I did wonder whether SkipAtom might have the same issue. Happy to try this out and see, if you'd like.
Hi @sgbaird! Sorry for the delay in replying to your messages. I've been unusually busy lately. Currently, SkipAtom supports only 86 atoms. This was not discussed in the paper, but it is an important aspect. However, for the matbench datasets, I didn't encounter any instances where I wasn't able to featurize a compound because of unsupported atoms. The matbench_expt_gap dataset has 4,604 examples (from what I see reported on the matbench website), and I was able to featurize all the compounds. I believe this is the case for all the other matbench datasets as well.
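For anyone who wants to repeat the coverage check discussed above on their own data, here is a minimal sketch. It does not use the SkipAtom or Matbench APIs: the `SUPPORTED` set, the `elements_in` and `uncovered` helpers, and the example formulas are all hypothetical placeholders, and the supported set would need to be replaced with the element list of whichever embedding table is being tested. The regex only extracts capitalized symbols from simple formula strings and does not validate that they are real elements.

```python
import re

# Hypothetical stand-in for the elements an embedding table covers.
# In practice, read this from the embedding's own element list.
SUPPORTED = {"H", "Li", "O", "Fe", "Si"}

# One uppercase letter optionally followed by one lowercase letter,
# e.g. "Fe", "O", "Li". Assumes conventional formula capitalization.
ELEMENT_RE = re.compile(r"[A-Z][a-z]?")


def elements_in(formula):
    """Return the set of element symbols appearing in a formula string."""
    return set(ELEMENT_RE.findall(formula))


def uncovered(formulas, supported):
    """Map each formula that cannot be fully featurized to its missing elements."""
    missing = {}
    for formula in formulas:
        gap = elements_in(formula) - supported
        if gap:
            missing[formula] = gap
    return missing


if __name__ == "__main__":
    # Hypothetical example compositions, not taken from any Matbench dataset.
    example_formulas = ["Fe2O3", "LiFePO4", "SiO2"]
    for formula, gap in uncovered(example_formulas, SUPPORTED).items():
        print(formula, "is missing", sorted(gap))
```

An empty result from `uncovered` would correspond to the situation described above, where every compound in the dataset can be featurized.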