PADME Refactoring the code

Open · simonfqy opened this issue 6 years ago • 1 comment

The current code works, but some of the scripts are too complicated to follow, for example splits/splitters.py, metrics/__init__.py, and ./NCI60_data/preprocess.py, and they often contain large chunks of duplicated code. splits/splitters.py in particular needs improvement: it currently does not allow some parameter combinations and relies on assert statements to fail early. I will try to handle these cases more gracefully.
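As a rough illustration of failing early without assert, here is a minimal sketch of explicit argument validation. The function name, the exception class, and the parameters `frac_train`, `frac_valid`, and `frac_test` are assumptions made for this example and are not taken from splits/splitters.py. One reason to prefer this pattern is that assert statements are removed entirely when Python runs with the `-O` flag, so they are not a reliable way to reject bad input.

```python
class SplitConfigError(ValueError):
    """Raised when splitter arguments are inconsistent (hypothetical name)."""


def validate_split_fractions(frac_train, frac_valid, frac_test, tol=1e-8):
    """Check split fractions and raise a descriptive error instead of asserting.

    The parameter names are illustrative only; the real splitter
    arguments may differ.
    """
    fractions = {
        "frac_train": frac_train,
        "frac_valid": frac_valid,
        "frac_test": frac_test,
    }
    # Each fraction must lie in [0, 1]; NaN also fails this check.
    for name, value in fractions.items():
        if not 0.0 <= value <= 1.0:
            raise SplitConfigError(f"{name} must be in [0, 1], got {value!r}")
    # The three fractions must cover the whole dataset.
    total = sum(fractions.values())
    if abs(total - 1.0) > tol:
        raise SplitConfigError(f"split fractions must sum to 1, got {total:.6f}")
    return frac_train, frac_valid, frac_test
```

A caller can then catch `SplitConfigError` (or plain `ValueError`) and report which argument was wrong, rather than receiving a bare `AssertionError`.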

Cleanups are also needed in some files.

Thresholding the continuous predictions to yield binary outputs is currently hard-coded, which is error-prone. I will refactor it if necessary. Also, range estimation is implemented in DeepChem using Bayesian statistics, so I may need to incorporate that into the code as well.
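One possible shape for the thresholding refactor is to pass the cutoff and its direction as explicit arguments instead of hard-coding them. The sketch below is only an assumption about how that could look; the function name `binarize_predictions` and the `higher_is_positive` flag are hypothetical and do not come from the existing code.

```python
import math


def binarize_predictions(y_pred, threshold, higher_is_positive=True):
    """Convert continuous predictions to 0/1 labels using an explicit cutoff.

    `threshold` and `higher_is_positive` are supplied by the caller, so
    no cutoff value is baked into the function (hypothetical interface).
    """
    if not math.isfinite(threshold):
        raise ValueError(f"threshold must be a finite number, got {threshold!r}")
    if higher_is_positive:
        # Values at or above the cutoff are labeled positive.
        return [int(value >= threshold) for value in y_pred]
    # Otherwise values at or below the cutoff are labeled positive.
    return [int(value <= threshold) for value in y_pred]
```

With this interface, the threshold for each task could live in one configuration entry and be passed in wherever binary outputs are needed.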

Sep 26 '18 21:09 simonfqy

The code is now much more modular, though some scripts still have problems that I was too lazy to fix properly and patched with a "quick and dirty" solution instead. These need to be fixed. Since DeepChem is actively maintained but this repository might not be, I also need to decouple the two repos so that this one is self-contained and functions correctly without any imports from DeepChem.

Mar 01 '19 22:03 simonfqy
