retrosim

Does the input chemical reaction need to satisfy atom conservation?

Open hcji opened this issue 5 years ago • 8 comments

Hi, good work! I'm trying to extract reaction templates for other chemical reactions using your code. The reactions are atom-mapped but do not satisfy atom conservation. Will template extraction still work correctly?

hcji avatar Feb 26 '19 08:02 hcji

I ask because I get an unexpected result:

from rdkit import Chem
from rdkit.Chem import AllChem

reaction_smiles = '[O:1]=[S:2]([O-:3])[O-:4]>>[O:1]=[S:2]([OH:4])(=[O:5])[OH:3]'
template = extract_one_template(reaction_smiles)      # function from your code
c1 = reaction_smiles.split('>>')[0]                   # true reactant
c2 = reaction_smiles.split('>>')[1]                   # product
rxn = AllChem.ReactionFromSmarts(template)
prod = rxn.RunReactants([Chem.MolFromSmiles(c2)])     # apply the template to the product
AllChem.CalcExactMolWt(Chem.MolFromSmiles(c1))        # mass of true reactant: 79.9579
AllChem.CalcExactMolWt(prod[0][0])                    # mass of calculated reactant: 95.95

How can I fix this?

hcji avatar Feb 26 '19 08:02 hcji

Reactions don't necessarily need to satisfy atom conservation, in that leaving groups can be absent from the reaction products, but they aren't meant to create mass. In your example, the product oxygen mapped as 5 has no counterpart in the reactants. If you use the following reaction SMILES, where every atom-mapped product atom that does not appear in the reactants is added to the reactants as a separate wildcard fragment, you can get the behavior you want:

reaction_smiles = '[O:1]=[S:2]([O-:3])[O-:4].[*:5]>>[O:1]=[S:2]([OH:4])(=[O:5])[OH:3]'

You can write a short script to add fragments to your reactants SMILES. It might be worth keeping track of these fragments so you can remove them later. Some code that might be useful:

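from rdkit import Chem
from rdkit.Chem import AllChem

def fix_reaction_smiles(smiles):
    # Add a wildcard fragment to the reactants for every mapped product atom
    # that has no counterpart in the reactants.
    rcts = Chem.MolFromSmiles(smiles.split('>')[0])
    prds = Chem.MolFromSmiles(smiles.split('>')[2])
    rct_maps = set(a.GetAtomMapNum() for a in rcts.GetAtoms() if a.GetAtomMapNum())
    frags = []   # wildcard fragments to add, e.g. '[*:5]'
    symbs = []   # element symbols of those atoms, kept so they can be removed later
    for a in prds.GetAtoms():
        if a.GetAtomMapNum() and a.GetAtomMapNum() not in rct_maps:
            frags.append('[*:{}]'.format(a.GetAtomMapNum()))
            symbs.append(a.GetSymbol())
    if not frags:
        return smiles, []
    # Prepending to the full reaction SMILES puts the fragments on the reactant side.
    return '{}.{}'.format('.'.join(frags), smiles), symbs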

reaction_smiles = '[O:1]=[S:2]([O-:3])[O-:4]>>[O:1]=[S:2]([OH:4])(=[O:5])[OH:3]'
reaction_smiles_fixed, symbs = fix_reaction_smiles(reaction_smiles)

template = extract_one_template(reaction_smiles_fixed)   # function from your code
c1 = reaction_smiles.split('>>')[0]                      # true reactant
c2 = reaction_smiles.split('>>')[1]                      # product
rxn = AllChem.ReactionFromSmarts(template)
outcomes = rxn.RunReactants([Chem.MolFromSmiles(c2)])    # apply the template to the product
for outcome in outcomes:
    reactants = [Chem.MolToSmiles(mol) for mol in outcome]
    # Drop the placeholder atoms that fix_reaction_smiles added.
    for symb in symbs:
        reactants.remove(symb)
    print(reactants)
    print(AllChem.CalcExactMolWt(Chem.MolFromSmiles('.'.join(reactants))))

AllChem.CalcExactMolWt(Chem.MolFromSmiles(c1))           # mass of true reactant: 79.9579
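
To find out in advance which reactions need this treatment, it may help to compare the atom-map numbers on the two sides before extracting a template. The following is a minimal sketch using only the standard library. The helper name `unmatched_product_maps` is hypothetical (it is not part of retrosim), and it assumes the map numbers are written in the usual `[X:n]` bracket form.

```python
import re

# Matches the atom-map number inside a bracket atom such as [OH:4].
_MAP_NUM = re.compile(r':(\d+)\]')

def unmatched_product_maps(reaction_smiles):
    """Return map numbers that appear in the products but not in the reactants.

    Hypothetical helper for illustration; assumes 'reactants>agents>products'.
    """
    parts = reaction_smiles.split('>')
    reactants, products = parts[0], parts[2]
    rct_maps = set(int(n) for n in _MAP_NUM.findall(reactants))
    prd_maps = set(int(n) for n in _MAP_NUM.findall(products))
    return sorted(prd_maps - rct_maps)

rxn_smiles = '[O:1]=[S:2]([O-:3])[O-:4]>>[O:1]=[S:2]([OH:4])(=[O:5])[OH:3]'
print(unmatched_product_maps(rxn_smiles))  # [5]
```

An empty list means every mapped product atom has a source in the reactants; any numbers returned are the atoms that would need a placeholder fragment.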

connorcoley avatar Mar 30 '19 14:03 connorcoley

Hi, good work! I'd like to reproduce your work. Unfortunately, in get_date.py, the last line, which calls write_to_files(data), raises "IndentationError: unexpected indent" in PyCharm. Does your code have to be run in a Jupyter notebook? Also, in your paper "Computer-Assisted Retrosynthesis Based on Molecular Similarity", I read about the top-n accuracy and would like to analyze your predictions (SMILES). Could you share the prediction files from your experiments?

chengyunzhang avatar Aug 28 '19 01:08 chengyunzhang

Hi @chengyunzhang -- no, these don't have to run inside a Jupyter notebook. I don't know why you're getting an indentation error there.

All of the code/data for that paper is inside this repository. The code in the test script is designed to save only the ranks; if you look at this line, you can see the actual SMILES recommended for each example by examining sorted(probs.iteritems(), key=lambda x:x[1], reverse=True)
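
For readers on Python 3, where `dict.iteritems()` no longer exists, the same inspection could be sketched as below. The `probs` dictionary here is made-up illustrative data (candidate precursor SMILES mapped to scores), not output from the repository.

```python
# Made-up scores for illustration only; in the test script `probs` is
# assumed to map each candidate precursor SMILES to its score.
probs = {
    'CC(=O)O.OCC': 0.62,
    'CC(=O)Cl.OCC': 0.27,
    'CC(=O)OC(C)=O.OCC': 0.11,
}

# Python 3 equivalent of sorted(probs.iteritems(), ...): highest score first.
ranked = sorted(probs.items(), key=lambda x: x[1], reverse=True)

for rank, (smiles, score) in enumerate(ranked, start=1):
    print(rank, smiles, score)
```

The first entry of `ranked` is the top-1 suggestion, so writing this list to a file alongside the saved rank would keep the predicted SMILES for later analysis.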

connorcoley avatar Sep 04 '19 15:09 connorcoley

Thank you for your reply. I successfully got the SMILES I wanted by following your guidance. Best wishes!

chengyunzhang avatar Sep 05 '19 01:09 chengyunzhang

Hi, from your data I can only get the atom-mapped SMILES (rxn_smiles). Could you provide the original SMILES without atom mapping? Thanks.

chengyunzhang avatar Feb 18 '20 05:02 chengyunzhang

You can remove the atom mapping using your cheminformatics toolkit of choice (e.g., RDKit).
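
With RDKit, this is typically done by resetting each atom's map number with `SetAtomMapNum(0)` and writing the SMILES back out. As a toolkit-free alternative, the map numbers can also be stripped at the string level. The sketch below uses only the standard library; the helper name `strip_atom_maps` is hypothetical, and it assumes the map numbers appear in the usual `[X:n]` bracket form.

```python
import re

# Atom-map suffix inside a bracket atom, e.g. the ':4' in [OH:4].
_ATOM_MAP = re.compile(r':\d+(?=\])')

def strip_atom_maps(smiles):
    """Remove atom-map numbers from a SMILES or reaction SMILES string.

    Hypothetical helper; the result is not canonicalized and bracket
    atoms stay bracketed.
    """
    return _ATOM_MAP.sub('', smiles)

mapped = '[O:1]=[S:2]([O-:3])[O-:4]>>[O:1]=[S:2]([OH:4])(=[O:5])[OH:3]'
print(strip_atom_maps(mapped))
# [O]=[S]([O-])[O-]>>[O]=[S]([OH])(=[O])[OH]
```

The output describes the same molecules but is not in canonical form; parsing and re-writing it with a toolkit would give canonical SMILES.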

connorcoley avatar Feb 18 '20 12:02 connorcoley

Thanks for your guidance.

chengyunzhang avatar Feb 19 '20 02:02 chengyunzhang