rdkit-tutorials
Need for tutorial on how to save .svgs
In Jupyter, it's now quite easy to display beautiful .svg-based renderings of molecules and molecule grids.
I've been unable to figure out how to save the beautiful .svg files in a way that they can be opened in the browser.
-
Perhaps the right way to do this is described in http://www.mail-archive.com/[email protected]/msg06125.html, but it would be great not to have to wade through mailing list archives and message trees to figure this out...
-
This page looked helpful but didn't adapt to Draw._MolsToGridSVG() for me. http://biyoenformatik.blogspot.com/2015/09/generating-2d-svg-images-of-mol-files.html
-
Maybe long term, the best way to support this is by adding code in rdkit so that something like
Draw.MolToFile(mol, fileName='my_mol.svg')
works instead of giving a NameError?
Q.v. also this recent message from Hongbing Yang to the rdkit-discuss mailing list:
Hi, everyone, I want to draw two molecules in a svg file with rdMolDraw2D. When I executed the following code, the jupyter cracked without any error or warning.
```
drawer = rdMolDraw2D.MolDraw2DSVG(400,400)
i=0
for mol in mols:
    if mol.HasSubstructMatch(smarts):
        rdDepictor.Compute2DCoords(mol)
        #if i == 1:
        #    continue
        drawer.DrawMolecule(mol,highlightAtoms=mol.GetSubstructMatch(smarts))
        i+=1
    if i > 1:
        break
drawer.FinishDrawing()
svg = drawer.GetDrawingText().replace('svg:','')
SVG(svg)
```
It seems that we cannot directly draw two molecules with the same drawer? So how can I draw as I wanted?
By the way, I've no idea why it cracked. From the experiment of the commented code, I can conclude it was caused by the drawer. So is it possible to fix the bug, adding error or warning instead of "KernelRestarter: restarting kernel" in console.
It looks like MolToFile() is fixed on GitHub. So I know I can wait until the next release for easy ways to create .svgs of single molecules. I'm still not sure about grid images. I'd love to be able to export .svgs where molecule legend text comes through as text, for example.
MolToFile() is still using the old drawing code as far as I can tell.
Agreed that a general cleanup of the contents of the rdkit.Chem.Draw module is necessary. And should probably be done before a tutorial is written.
MolsToGridImage by default produces SVG:
img = Draw.MolsToGridImage([Chem.MolFromSmiles(x) for x in ['C', 'CO', 'CN']])
print(type(img))
<class 'IPython.core.display.SVG'>
You can easily get the SVG data and save the text in a file:
print(img.data)
<svg baseProfile="full" height="200px" version="1.1" width="600px" xml:space="preserve" xmlns:rdkit="http://www.rdkit.org/xml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g transform="translate(0,0)"><rect height="200" style="opacity:1.0;fill:#FFFFFF;stroke:none" width="200" x="0" y="0"> </rect>
<text style="font-size:15px;font-style:normal;font-weight:normal;fill-opacity:1;stroke:none;font-family:sans-serif;text-anchor:start;fill:#000000" x="84.4958" y="108.25"><tspan>CH</tspan><tspan style="baseline-shift:sub;font-size:11.25px;">4</tspan><tspan/></text>
</g>
<g transform="translate(200,0)"><rect height="200" style="opacity:1.0;fill:#FFFFFF;stroke:none" width="200" x="0" y="0"> </rect>
<path d="M 9.09091,100 59.1479,100" style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"/>
<path d="M 59.1479,100 109.205,100" style="fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"/>
<text style="font-size:15px;font-style:normal;font-weight:normal;fill-opacity:1;stroke:none;font-family:sans-serif;text-anchor:start;fill:#FF0000" x="109.205" y="107.5"><tspan>OH</tspan></text>
</g>
<g transform="translate(400,0)"><rect height="200" style="opacity:1.0;fill:#FFFFFF;stroke:none" width="200" x="0" y="0"> </rect>
<path d="M 9.09091,100 55.1606,100" style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"/>
<path d="M 55.1606,100 101.23,100" style="fill:none;fill-rule:evenodd;stroke:#0000FF;stroke-width:2px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"/>
<text style="font-size:15px;font-style:normal;font-weight:normal;fill-opacity:1;stroke:none;font-family:sans-serif;text-anchor:start;fill:#0000FF" x="101.23" y="108.25"><tspan>NH</tspan><tspan style="baseline-shift:sub;font-size:11.25px;">2</tspan><tspan/></text>
</g></svg>
Thanks @samoturk. That's very useful.
However, a couple of caveats: when I copy that text into a file and name it foo.svg, my browser [Chrome 56.0.2924.87 (64-bit)] cannot open it. Instead it displays: "This XML file does not appear to have any style information associated with it. The document tree is shown below."
That said, when I rename the file foo.html, my browser can render images from the svg-like information.
However, properly formatted *.svg files can be opened by Chrome and rendered into images even when they have the proper *.svg extension. An example of such a file is at https://upload.wikimedia.org/wikipedia/commons/0/0c/Anecortave_acetate.svg
This may seem like a minor issue but I can assure you it has created very large amounts of confusion, for me for sure, and probably for others, judging by the mailing list archive. It also makes it harder to import rdkit "svg" files into applications.
And in fact, if I replace the rdkit-generated svg open tag (i.e. <svg [STUFF]>) with the svg open tag from the Wikipedia file, Chrome can render the document, even when it is called foo.svg.
That is, this version renders:
<?xml version="1.0" encoding="utf-8"?>
<!-- Generator: Adobe Illustrator 16.0.0, SVG Export Plug-In . SVG Version: 6.00 Build 0) -->
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg version="1.1" id="Слой_1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" x="0px" y="0px"
width="485.926px" height="448.539px" viewBox="0 0 485.926 448.539" enable-background="new 0 0 485.926 448.539"
xml:space="preserve">
<g transform="translate(0,0)"><rect height="200" style="opacity:1.0;fill:#FFFFFF;stroke:none" width="200" x="0" y="0"> </rect>
<text style="font-size:15px;font-style:normal;font-weight:normal;fill-opacity:1;stroke:none;font-family:sans-serif;text-anchor:start;fill:#000000" x="84.4958" y="108.25"><tspan>CH</tspan><tspan style="baseline-shift:sub;font-size:11.25px;">4</tspan><tspan/></text>
</g>
<g transform="translate(200,0)"><rect height="200" style="opacity:1.0;fill:#FFFFFF;stroke:none" width="200" x="0" y="0"> </rect>
<path d="M 9.09091,100 59.1479,100" style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"/>
<path d="M 59.1479,100 109.205,100" style="fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"/>
<text style="font-size:15px;font-style:normal;font-weight:normal;fill-opacity:1;stroke:none;font-family:sans-serif;text-anchor:start;fill:#FF0000" x="109.205" y="107.5"><tspan>OH</tspan></text>
</g>
<g transform="translate(400,0)"><rect height="200" style="opacity:1.0;fill:#FFFFFF;stroke:none" width="200" x="0" y="0"> </rect>
<path d="M 9.09091,100 55.1606,100" style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"/>
<path d="M 55.1606,100 101.23,100" style="fill:none;fill-rule:evenodd;stroke:#0000FF;stroke-width:2px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"/>
<text style="font-size:15px;font-style:normal;font-weight:normal;fill-opacity:1;stroke:none;font-family:sans-serif;text-anchor:start;fill:#0000FF" x="101.23" y="108.25"><tspan>NH</tspan><tspan style="baseline-shift:sub;font-size:11.25px;">2</tspan><tspan/></text>
</g></svg>
but this one does not:
<svg baseProfile="full" height="200px" version="1.1" width="600px" xml:space="preserve" xmlns:rdkit="http://www.rdkit.org/xml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g transform="translate(0,0)"><rect height="200" style="opacity:1.0;fill:#FFFFFF;stroke:none" width="200" x="0" y="0"> </rect>
<text style="font-size:15px;font-style:normal;font-weight:normal;fill-opacity:1;stroke:none;font-family:sans-serif;text-anchor:start;fill:#000000" x="84.4958" y="108.25"><tspan>CH</tspan><tspan style="baseline-shift:sub;font-size:11.25px;">4</tspan><tspan/></text>
</g>
<g transform="translate(200,0)"><rect height="200" style="opacity:1.0;fill:#FFFFFF;stroke:none" width="200" x="0" y="0"> </rect>
<path d="M 9.09091,100 59.1479,100" style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"/>
<path d="M 59.1479,100 109.205,100" style="fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"/>
<text style="font-size:15px;font-style:normal;font-weight:normal;fill-opacity:1;stroke:none;font-family:sans-serif;text-anchor:start;fill:#FF0000" x="109.205" y="107.5"><tspan>OH</tspan></text>
</g>
<g transform="translate(400,0)"><rect height="200" style="opacity:1.0;fill:#FFFFFF;stroke:none" width="200" x="0" y="0"> </rect>
<path d="M 9.09091,100 55.1606,100" style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"/>
<path d="M 55.1606,100 101.23,100" style="fill:none;fill-rule:evenodd;stroke:#0000FF;stroke-width:2px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"/>
<text style="font-size:15px;font-style:normal;font-weight:normal;fill-opacity:1;stroke:none;font-family:sans-serif;text-anchor:start;fill:#0000FF" x="101.23" y="108.25"><tspan>NH</tspan><tspan style="baseline-shift:sub;font-size:11.25px;">2</tspan><tspan/></text>
</g></svg>
The key attribute of the proper <svg> open tag seems to be the namespace declaration.
https://developer.mozilla.org/en-US/docs/Web/SVG/Namespaces_Crash_Course
If I edit down the <svg> open tag from the "successful" foo.svg I described above to this, @samoturk's images of methane, methanol, and methylamine still render:
<svg version="1.1" xmlns="http://www.w3.org/2000/svg" >
rdkit's native svg-like output does have several xmlns declarations, but none of them seem to do the job:
xmlns:rdkit="http://www.rdkit.org/xml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"
Further experimentation reveals that just changing the xmlns:svg="http://www.w3.org/2000/svg" to xmlns="http://www.w3.org/2000/svg" in the native rdkit svg code allows Chrome to render it successfully. (I'll leave it to those wiser than me to determine if this is a bug in rdkit, in Chrome, in the W3C's definition of the SVG namespace, or something else entirely.)
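For anyone who wants to apply that substitution programmatically, here is a minimal sketch using only the Python standard library. The helper names `declare_default_svg_namespace` and `root_is_svg` are hypothetical (they are not part of rdkit), and the sketch assumes the SVG text has unprefixed elements and an `xmlns:svg` declaration, as in the output above.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def declare_default_svg_namespace(svg_text):
    """Swap the prefixed SVG namespace declaration for a default one.

    Hypothetical helper: assumes the elements themselves are unprefixed.
    """
    old = 'xmlns:svg="%s"' % SVG_NS
    new = 'xmlns="%s"' % SVG_NS
    if new in svg_text:
        # A default namespace is already declared; leave the text alone.
        return svg_text
    # Only the first occurrence (the root <svg> tag) needs changing.
    return svg_text.replace(old, new, 1)


def root_is_svg(svg_text):
    """Return True if the root element is <svg> in the SVG namespace."""
    root = ET.fromstring(svg_text)
    return root.tag == "{%s}svg" % SVG_NS


# Made-up sample that mimics the shape of the open tag discussed above.
sample = (
    '<svg version="1.1" xmlns:svg="http://www.w3.org/2000/svg">'
    '<rect width="10" height="10"/></svg>'
)
fixed = declare_default_svg_namespace(sample)
```

The fixed text can then be written to a .svg file as usual. The `root_is_svg` check is only a rough stand-in for what a browser does when deciding whether a document is SVG.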
I only ever used svg data to embed it in html as a part of web services, where it works. I never used it to store actual svg files so I never noticed this problem.
You are right, it seems that RDKit is not declaring the namespace correctly. Replacing xmlns:svg="http://www.w3.org/2000/svg" with xmlns="http://www.w3.org/2000/svg" also fixes the rendering in Firefox. Both versions of the file work in the default GNOME image viewer.
Bug @greglandrum ?
For relatively large grids, f.write(img.data) stops writing after about 1099 lines. Any ideas?
That's unexpected. What if you write line by line?
I managed to miss this issue when it originally came in.
On the namespaces thing: the svg generation code actually prefixes all svg elements with svg:. This seems to work when storing the svg files, but fails when used with the jupyter integration. So the jupyter code tends to remove the prefix.
I can try setting the generic namespace (or whatever the xmlns thing is) as part of the next release cycle and we can see if that helps.
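To illustrate the prefix point, the sketch below (standard library only; the three sample strings are made up for illustration and are not actual rdkit output) parses a prefixed document, an unprefixed document with a default namespace, and an unprefixed document that only declares xmlns:svg. An XML parser treats the first two as the same SVG element, while the third ends up in no namespace, which would be consistent with a browser showing it as generic XML.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Elements carry the svg: prefix and the prefix is bound to the SVG namespace.
prefixed = (
    '<svg:svg xmlns:svg="%s"><svg:rect width="1" height="1"/></svg:svg>' % SVG_NS
)
# No prefix, but a default namespace puts every element in the SVG namespace.
default_ns = '<svg xmlns="%s"><rect width="1" height="1"/></svg>' % SVG_NS
# Prefix stripped from the elements while only xmlns:svg is declared.
stripped = '<svg xmlns:svg="%s"><rect width="1" height="1"/></svg>' % SVG_NS


def root_tag(svg_text):
    """Return the namespace-qualified tag of the root element."""
    return ET.fromstring(svg_text).tag


for name, text in [("prefixed", prefixed), ("default", default_ns), ("stripped", stripped)]:
    print(name, root_tag(text))
```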
@samoturk turns out I forgot to close the file :)
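For reference, a `with` block avoids this class of problem because the file is flushed and closed when the block exits. A minimal sketch, with a made-up stand-in string in place of `img.data` and hypothetical helper names `save_text` and `roundtrip_ok`:

```python
import os
import tempfile


def save_text(text, path):
    """Write text to path; the with block closes (and flushes) the file."""
    with open(path, "w") as handle:
        handle.write(text)


def roundtrip_ok(text):
    """Write text to a temporary file and check that it reads back intact."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "grid.svg")
        save_text(text, path)
        with open(path) as handle:
            return handle.read() == text


# Stand-in for a large grid: several thousand lines of SVG-like text.
big_svg = "<svg>\n" + "<g/>\n" * 5000 + "</svg>\n"
print(roundtrip_ok(big_svg))
```

If the file is opened without `with`, calling `close()` explicitly before reading the file elsewhere has the same effect.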
This is an integrated example:
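from rdkit import Chem
from rdkit.Chem import Draw

def draw_multiple_mol(smiles_list, mols_per_row=4, file_path=None):
    # Parse each SMILES string into an RDKit molecule.
    mols = [Chem.MolFromSmiles(smi) for smi in smiles_list]
    # Don't lay out more columns than there are molecules.
    mols_per_row = min(len(smiles_list), mols_per_row)
    img = Draw.MolsToGridImage(mols, molsPerRow=mols_per_row, subImgSize=(300, 160), useSVG=True)
    if file_path:
        # In Jupyter the result is an IPython SVG object (text in .data);
        # elsewhere it may be a plain string.
        svg_text = img.data if hasattr(img, 'data') else img
        with open(file_path, 'w') as f_handle:
            f_handle.write(svg_text)
    return img

smiles = ['CCCS(=O)c1ccc2[nH]c(=NC(=O)OC)[nH]c2c1', 'CCOC(=O)c1cncn1C1CCCc2ccccc21']
draw_multiple_mol(smiles, file_path='two_mols.svg')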
Using rdkit version 2019.09.1 and jupyter version 1.0.0 this one works like a charm:
with open('grid.svg', 'w') as f:
    f.write(grid.data)
Note: You might need to set useSVG=True when calling MolsToGridImage.