François Bérenger
related to https://github.com/tsudalab/fmqa/issues/4
Example use:
```
# ./bin/fmqa_model.py -i data/test_data.csv --NxCV 5
13:27:39 (lines, cols): 741 120
13:27:40 (R2, RMSE)_train: 0.967 0.020
13:27:40 NxCV: 5
13:27:41 fold 0 (R2, RMSE)_test: 0.564 0.064
13:27:42...
```
If this PR is accepted, I'll add the code to save/load a model for production use.
Done, but it still needs testing to see whether it is useful in practice. E.g.
```
molenc_dense --bloom 5,50 -n 3214 -i data/x_std_01.txt > test_bloom.csv
```
investigate Bloom filter, MinHash, or LSH for current millennial FPs
does not outperform ECFP4 2048b in a regression benchmark
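For readers unfamiliar with the idea, here is a minimal sketch of Bloom-filter-style encoding: each sparse feature index sets a few hashed positions in a fixed-length bit vector. It is only an illustration, not the molenc implementation. The names `bloom_encode` and `maybe_contains`, and the parameters `n_bits` and `n_hashes`, are hypothetical, and the choice of SHA-256 as the hash is an assumption.

```python
import hashlib


def _positions(feature_id, n_bits, n_hashes):
    """Yield n_hashes bit positions for one feature id (hypothetical helper)."""
    for seed in range(n_hashes):
        # Assumption: seeded SHA-256 stands in for a family of hash functions.
        digest = hashlib.sha256(f"{seed}:{feature_id}".encode()).digest()
        yield int.from_bytes(digest[:8], "big") % n_bits


def bloom_encode(feature_ids, n_bits, n_hashes):
    """Fold sparse feature ids into a dense 0/1 vector of length n_bits."""
    bits = [0] * n_bits
    for fid in feature_ids:
        for pos in _positions(fid, n_bits, n_hashes):
            bits[pos] = 1
    return bits


def maybe_contains(bits, feature_id, n_hashes):
    """True if the feature may be present; false positives are possible."""
    return all(bits[pos] for pos in _positions(feature_id, len(bits), n_hashes))


features = [3, 17, 42]  # made-up sparse fingerprint feature indices
encoded = bloom_encode(features, n_bits=64, n_hashes=3)
print(sum(encoded), "bits set out of", len(encoded))
```

An encoded feature is always reported as possibly present, while an absent feature can collide with set bits. That information loss may be one reason such an encoding would not beat a plain folded fingerprint.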
maybe related to this one: https://github.com/rdkit/rdkit/issues/7554
I also have a test script where a.GetTotalNumHs() returns 0 when an SDF file that already contained the hydrogens was read in and removeHs=False was passed to the...
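A minimal sketch that may reproduce this kind of result, assuming RDKit is installed. It uses `Chem.AddHs` on a SMILES string as a stand-in for reading an SDF file with `removeHs=False`. In both cases the hydrogens are real atoms in the molecular graph. By default `GetTotalNumHs()` does not count such neighbor atoms, but `GetTotalNumHs(includeNeighbors=True)` does. Whether this is the cause in the linked issue is an assumption.

```python
from rdkit import Chem

# Methane with its hydrogens as explicit atoms in the graph
# (stand-in for an SDF file read with removeHs=False).
mol = Chem.AddHs(Chem.MolFromSmiles("C"))
carbon = mol.GetAtomWithIdx(0)

# Default call: hydrogens that are graph neighbors are not counted.
print(carbon.GetTotalNumHs())

# Counting neighbor hydrogen atoms as well.
print(carbon.GetTotalNumHs(includeNeighbors=True))
```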
This is really ugly. I spent hours hunting down what was wrong in some code I was sure was 100% correct, until I started to doubt rdkit...
By the way, where is the actual periodic table stored? Is this in Data/rddata.sql? I was looking for a csv file or some hardcoded values in a header file without...