
substructure search

Open edgeTurbo opened this issue 2 years ago • 3 comments

My database is not supported by Bingo, so I want to use the substructureMatcher method that comes with Indigo. Given one structure and a target molecule, it should return true if the first is a substructure of the second and false otherwise. I don't know how to use this method. Could you provide an example I can refer to? Thank you very much.

edgeTurbo avatar Jul 11 '22 05:07 edgeTurbo

Hi @edgeTurbo,

This is the link to the documentation for the indigo.substructureMatcher method: https://lifescience.opensource.epam.com/indigo/api/index.html?highlight=indigo#molecule-substructure-matching

You can find a full description of substructure search, with many example queries and the molecules they retrieve, here: https://lifescience.opensource.epam.com/bingo/user-manual-oracle.html#substructure-search

AATDev21 avatar Jul 11 '22 12:07 AATDev21

Hi @AATDev21, thanks. I used the indigo.substructureMatcher method, but an error occurred. This is my code:

public static void main(String[] args) {
        String queryMol = "\n" +
                "  Ketcher  7122215582D 1   1.00000     0.00000     0\n" +
                "\n" +
                "  6  6  0  0  0  0            999 V2000\n" +
                "    9.4750   -6.5580    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
                "    8.9750   -7.4240    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
                "    7.9750   -7.4240    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
                "    7.4750   -6.5580    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
                "    7.9750   -5.6920    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
                "    8.9750   -5.6920    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
                "  1  2  2  0     0  0\n" +
                "  2  3  1  0     0  0\n" +
                "  3  4  2  0     0  0\n" +
                "  4  5  1  0     0  0\n" +
                "  5  6  2  0     0  0\n" +
                "  6  1  1  0     0  0\n" +
                "M  END\n";
        String mol = "\n" +
                "  Ketcher  7122215572D 1   1.00000     0.00000     0\n" +
                "\n" +
                "  9  9  0  0  0  0            999 V2000\n" +
                "   10.9750   -7.4240    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
                "   10.4750   -6.5580    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0\n" +
                "    9.4750   -6.5580    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
                "    8.9750   -7.4240    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
                "    7.9750   -7.4240    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
                "    7.4750   -6.5580    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
                "    7.9750   -5.6920    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
                "    7.4750   -4.8260    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
                "    8.9750   -5.6920    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
                "  1  2  1  0     0  0\n" +
                "  2  3  1  0     0  0\n" +
                "  3  4  2  0     0  0\n" +
                "  4  5  1  0     0  0\n" +
                "  5  6  2  0     0  0\n" +
                "  6  7  1  0     0  0\n" +
                "  7  8  1  0     0  0\n" +
                "  7  9  2  0     0  0\n" +
                "  9  3  1  0     0  0\n" +
                "M  END\n";
        Indigo indigo = new Indigo();
        IndigoObject matcher = indigo.substructureMatcher(indigo.loadMolecule(mol));
        IndigoObject queryMolecule = indigo.loadQueryMolecule(queryMol);
        IndigoObject match = matcher.match(queryMolecule);
        System.out.println(match == null);
    }

Normally, match should not be null, because the query is a substructure of the target molecule. Can you help me figure out what the problem is, or is there something wrong with my usage? [image]

edgeTurbo avatar Jul 12 '22 08:07 edgeTurbo

Hi @edgeTurbo, sorry for the late response; I didn't get any notifications about your question. Have a look at the Ketcher snapshot: both your molecule and your query molecule contain benzene rings. You should aromatize both of them first and then run the substructure search. A Python code example is below:

>>> from indigo import Indigo

>>> i = Indigo()

>>> str_mol = "\n" + "  Ketcher  7122215572D 1   1.00000     0.00000     0\n" + "\n" + "  9  9  0  0  0  0            999 V2000\n" + "   10.9750   -7.4240    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" + "   10.4750   -6.5580    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0\n" + "    9.4750   -6.5580    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" + "    8.9750   -7.4240    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" + "    7.9750   -7.4240    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" + "    7.4750   -6.5580    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" + "    7.9750   -5.6920    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" + "    7.4750   -4.8260    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" + "    8.9750   -5.6920    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" + "  1  2  1  0     0  0\n" + "  2  3  1  0     0  0\n" + "  3  4  2  0     0  0\n" + "  4  5  1  0     0  0\n" + "  5  6  2  0     0  0\n" + "  6  7  1  0     0  0\n" + "  7  8  1  0     0  0\n" + "  7  9  2  0     0  0\n" + "  9  3  1  0     0  0\n" + "M  END\n"
>>> str_query = "\n" + "  Ketcher  7122215582D 1   1.00000     0.00000     0\n" + "\n" + "  6  6  0  0  0  0            999 V2000\n" + "    9.4750   -6.5580    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" + "    8.9750   -7.4240    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" + "    7.9750   -7.4240    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" + "    7.4750   -6.5580    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" + "    7.9750   -5.6920    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" + "    8.9750   -5.6920    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n" + "  1  2  2  0     0  0\n" + "  2  3  1  0     0  0\n" + "  3  4  2  0     0  0\n" + "  4  5  1  0     0  0\n" + "  5  6  2  0     0  0\n" + "  6  1  1  0     0  0\n" + "M  END\n"

>>> mol = i.loadMolecule(str_mol)
>>> query = i.loadQueryMolecule(str_query)

>>> matcher = i.substructureMatcher(mol)
>>> matcher.match(query)
None  # no match

>>> mol.aromatize()
1
>>> query.aromatize()
1
>>> matcher.match(query)
<indigo.IndigoObject object at 0x111367340>  # match

AATDev21 avatar Jul 29 '22 11:07 AATDev21