Inconsistent descriptor values for the same SMILES

Open dinabandhu50 opened this issue 3 years ago • 4 comments

Hi, I am using this library for my projects and found that some descriptors give different values on different runs.

In the figures below, the x-axis is the SMILES sample index (128 samples in total) and the y-axis is the value calculated by PaDEL-Descriptor for topoRadius, topoDiameter and WPATH. Unfortunately, because of license issues I cannot post the dataset or any SMILES values here for reproducibility.

[Figure: run 1]

Here we can see the high values occurring at samples 12, 28, 32, 37, etc.

[Figure: run 2]

But in the second run the high values are at samples 13, 25, 28, 33, etc., which means the same set of SMILES is getting different values.

  1. Is this a common problem?
  2. How should this problem be handled?
  3. Why would PaDEL-Descriptor give such extreme values?

Thanks

  • Dinabandhu

dinabandhu50 avatar Jul 14 '21 04:07 dinabandhu50

If these are 3D descriptors, you may expect this kind of issue. Can you reproduce the issue with classical/free SMILES?

thegodone avatar Aug 25 '21 13:08 thegodone

@dinabandhu50 still having issues with this? The underlying descriptor calculations are a bit outside my wheelhouse...

tjkessler avatar Sep 08 '22 23:09 tjkessler

@tjkessler I was using multiple threads to calculate the PaDEL descriptors, via from padelpy import padeldescriptor, which was causing the issue above. Later I switched to a single thread and that solved the problem.

However, when run multithreaded, padelpy also writes the corresponding SMILES name alongside each row of output, so one can take advantage of that to match rows back to their inputs.

Again thanks a lot for the software,

dinabandhu50 avatar Sep 09 '22 05:09 dinabandhu50
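
A minimal sketch of both workarounds mentioned above, not an official padelpy recipe. It assumes that padeldescriptor accepts the threads and retainorder keyword arguments (mirroring PaDEL-Descriptor's -threads and -retainorder options) and that the output CSV carries the molecule name in a column called Name; both should be checked against your installed version. The file paths and the helper names run_single_threaded and align_by_name are hypothetical.

```python
import csv
import io


def run_single_threaded(smi_path, out_csv):
    """Compute 2D descriptors with one worker thread (needs padelpy and Java)."""
    # Imported lazily so align_by_name below works without padelpy installed.
    from padelpy import padeldescriptor

    padeldescriptor(
        mol_dir=smi_path,    # hypothetical input, e.g. "molecules.smi"
        d_file=out_csv,      # hypothetical output, e.g. "descriptors.csv"
        d_2d=True,
        threads=1,           # single thread, the fix reported in this thread
        retainorder=True,    # assumed flag: keep input order in the output
    )


def align_by_name(csv_text, input_names, name_column="Name"):
    """Reorder descriptor rows to follow input_names.

    Rows are matched on name_column (assumed to be "Name"), so the result
    does not depend on the order in which the threads wrote their rows.
    If a name occurs more than once in the CSV, the last row wins.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    by_name = {row[name_column]: row for row in reader}
    missing = [name for name in input_names if name not in by_name]
    if missing:
        raise KeyError(f"no descriptor row for: {missing}")
    return [by_name[name] for name in input_names]


if __name__ == "__main__":
    # Made-up values, only to show rows arriving out of order.
    shuffled = "Name,WPATH\nmol_2,20\nmol_1,10\nmol_3,30\n"
    for row in align_by_name(shuffled, ["mol_1", "mol_2", "mol_3"]):
        print(row["Name"], row["WPATH"])
```

Matching on the name rather than on row position means the multithreaded output can still be used, as long as every input molecule has a unique name.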

Hi, there are other types of fingerprints. How do I get them? Thanks

Zuosq avatar Nov 14 '22 14:11 Zuosq