Examples of names that fail to parse
(E)-1-Cyclohexyl-4,4,4-trifluorobut-2-en-1-yl-2-cyano-2-diazoacetate
(±)-(1R,5R,3R)-N-Benzyl-1-cyano-2-(hydroxymethyl)-3-(trifluoromethyl)cyclopropane-1-carboxamide
(±)-(1R,5R,3R)-N-Benzyl-1-cyano-2-(hydroxymethyl)-3-(pentafluoro-λ6-sulfanyl)cyclo-propane-1-carboxamide
I collected some chemical names from the literature that fail to parse; maybe they can be used to improve OPSIN.
(E)-1-Cyclohexyl-4,4,4-trifluorobut-2-en-1-yl-2-cyano-2-diazoacetate should presumably be (E)-1-Cyclohexyl-4,4,4-trifluorobut-2-en-1-yl 2-cyano-2-diazoacetate. OPSIN does try and infer when a hyphen was intended to be a space, but isn't perfect at doing so.
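If you want to work around this on your side before calling the parser, here is a minimal sketch of one possible preprocessing step. It is not OPSIN's own inference logic; the function name `space_candidates` and the "hyphen right after `yl`, right before a digit" rule are assumptions made for illustration. It only generates candidate rewrites, and each candidate would still have to be passed to the parser to see which one is accepted.

```python
import re

# Assumed heuristic (not OPSIN's algorithm): in an ester name the alcohol part
# ends in "yl" and is normally separated from the acid part by a space.
# A hyphen sitting directly after "yl" and directly before a digit is therefore
# a candidate for "this hyphen should have been a space".
_YL_HYPHEN = re.compile(r"(?<=yl)-(?=\d)")


def space_candidates(name: str) -> list[str]:
    """Return variants of `name`, each with exactly one suspicious hyphen
    replaced by a space. The original name is not included."""
    candidates = []
    for match in _YL_HYPHEN.finditer(name):
        candidates.append(name[:match.start()] + " " + name[match.end():])
    return candidates


if __name__ == "__main__":
    original = ("(E)-1-Cyclohexyl-4,4,4-trifluorobut-2-en-1-yl"
                "-2-cyano-2-diazoacetate")
    for candidate in space_candidates(original):
        print(candidate)
```

For the name above this yields two candidates, because "Cyclohexyl-4" also matches the pattern; only one of them is the intended ester name, which is why the candidates need to be checked by the parser rather than trusted blindly.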
For (±)-(1R,5R,3R)-N-Benzyl-1-cyano-2-(hydroxymethyl)-3-(trifluoromethyl)cyclopropane-1-carboxamide and (±)-(1R,5R,3R)-N-Benzyl-1-cyano-2-(hydroxymethyl)-3-(pentafluoro-λ6-sulfanyl)cyclo-propane-1-carboxamide, the stereo prefix looks like a mistake: a cyclopropane ring only has positions 1 to 3, so there is no atom for locant 5 to refer to. Are you sure the prefix on these isn't (±)-(1R,2R,3R)?
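A quick sanity check along these lines can flag such names before parsing. This is only a sketch under stated assumptions: the helper names `stereo_locants` and `out_of_range` are made up for this example, the ring size is supplied by the caller rather than derived from the name, and only simple prefixes of the form `(1R,5R,3R)` are recognised.

```python
import re

# Matches a simple R/S stereo prefix such as "(1R,5R,3R)".
# "(±)" is skipped because it contains no digit.
_STEREO_PREFIX = re.compile(r"\((\d+[RS](?:,\d+[RS])*)\)")


def stereo_locants(name: str) -> list[int]:
    """Return the numeric locants of the first R/S stereo prefix in `name`."""
    match = _STEREO_PREFIX.search(name)
    if match is None:
        return []
    return [int(part[:-1]) for part in match.group(1).split(",")]


def out_of_range(name: str, ring_size: int) -> list[int]:
    """Return stereo locants larger than `ring_size` (assumed known by the caller)."""
    return [locant for locant in stereo_locants(name) if locant > ring_size]


if __name__ == "__main__":
    name = ("(±)-(1R,5R,3R)-N-Benzyl-1-cyano-2-(hydroxymethyl)"
            "-3-(trifluoromethyl)cyclopropane-1-carboxamide")
    # A cyclopropane ring has 3 atoms, so any locant above 3 is suspicious.
    print(out_of_range(name, ring_size=3))
```

A non-empty result does not say what the correct locant is; it only marks the name as one that probably contains a typo in the source.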
I think you are right: with the changes you suggest, the names parse correctly. However, I am not a specialist in this area; I am just trying to process chemical names extracted from the supporting information of a paper. The original text is written this way, and ChemDraw can even convert it into a structure, so I am curious whether the parse failure is caused by irregular notation in the source.