
Assessing Interpretable Models | Practical Cheminformatics

utterances-bot opened this issue 3 years ago • 2 comments

Understanding and comparing the rationale behind machine learning model predictions

https://patwalters.github.io/practicalcheminformatics/jupyter/ml/interpretability/2021/06/03/interpretable.html

utterances-bot avatar Jun 10 '21 11:06 utterances-bot

Beautiful! I really enjoy this topic of interpretability of models. Could you comment on the problem (if there is one) of using a 1024-bit fingerprint to train a ML model with "not so many" molecules? I remember reading that your samples:features ratio should be at least 5:1, but it is hard to find 5000 molecules for a lot of specific QSAR tasks. By the way, there seems to be a small formatting problem with the formula after "Matveieva and Polishchuk define a topn score as".

rflameiro avatar Aug 14 '21 01:08 rflameiro

A lot of the ideas behind the "5:1 rule" come from linear regression and aren't relevant to modern ML techniques like ensemble methods and neural nets, which have alternative ways of dealing with overfitting (e.g. regularization, bagging, early stopping).

PatWalters avatar Aug 29 '21 00:08 PatWalters
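To make the point above concrete, here is a minimal sketch (not from the post, and using simulated data rather than real fingerprints) showing that a regularized model can still fit usefully when the feature count (1024, as in a 1024-bit fingerprint) far exceeds the sample count. It uses closed-form ridge regression as the simplest stand-in for the regularization mechanisms in ensembles and neural nets; the dataset sizes and the regularization strength `lam` are arbitrary illustrative choices.

```python
import numpy as np

# Simulated setting: 200 "molecules" with 1024 features each --
# far below a 5:1 samples:features ratio.
rng = np.random.default_rng(42)
n_samples, n_features = 200, 1024
X = rng.standard_normal((n_samples, n_features))

# Only 10 features actually carry signal (hypothetical ground truth).
true_w = np.zeros(n_features)
true_w[:10] = rng.standard_normal(10)
y = X @ true_w + 0.1 * rng.standard_normal(n_samples)

# Closed-form ridge solution: w = (X^T X + lam*I)^{-1} X^T y.
# Without the lam*I term (ordinary least squares), X^T X is singular
# here because n_features > n_samples.
lam = 10.0  # regularization strength, an arbitrary choice for this sketch
w = np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Check generalization on held-out simulated data.
X_test = rng.standard_normal((50, n_features))
y_test = X_test @ true_w
pred = X_test @ w
corr = np.corrcoef(y_test, pred)[0, 1]
print(f"test-set correlation: {corr:.2f}")
```

The same logic explains why, in practice, random forests or gradient-boosted trees on a few hundred molecules with 1024-bit fingerprints can still generalize: the effective capacity of the model is controlled by something other than the raw feature count.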