unable to convert owl format to obo
Hi,
The latest version of ROBOT failed to convert this OWL file: https://raw.githubusercontent.com/stemcellontologyresource/OSCI/master/src/ontology/osci.owl
Is there a specific parameter I need to add? Thank you.
java -jar ./robot.jar convert --input osci.owl --format obo --output osci.obo
ERROR Input ontology contains 3 triple(s) that could not be parsed:
- http://purl.obolibrary.org/obo/OBI_0000958 http://www.w3.org/1999/02/22-rdf-syntax-ns#type _:genid2147487981.
- http://purl.obolibrary.org/obo/OBI_0000998 http://www.w3.org/1999/02/22-rdf-syntax-ns#type _:genid2147487983.
- http://purl.obolibrary.org/obo/OBI_0000979 http://www.w3.org/1999/02/22-rdf-syntax-ns#type _:genid2147487982.
OBO STRUCTURE ERROR Ontology does not conform to OBO structure rules: multiple def tags not allowed. in frame:Frame(CLO:0000001 id( CLO:0000001)comment( A 'cell line cell' is a part of a cell line established through the passaging/selection of a primary cultured cells or the experimental modification of an existing cell line. New types of cell line cells are established after sufficient passaging of a primary culture to establish a stable and homogenous population that qualifies as a line (typically 1-20 passages), or following some spontaneous or experimental modification that confers novel characteristics to an existing line. A cell line cell typically has mutations of five or more genes compared to the original cell that derives the cell line cell. Some gene mutations may turn on some oncogenes. Cell line cells can be in active culture, stored in a quiescent state for future use (e.g. frozen in liquid nitrogen), or applied in experimental procedures. )name( cell line cell)property_value( IAO:0000111 cell line cell xsd:string)def( A cultured cell that is part of a cell line - a stable and homogeneous population of cells with a common biological origin and propagation history in culture)def( A cultured cell that is part of a cell line - a stable and homogeneous population of cells with a common biological origin and propagation history in culture )property_value( IAO:0000412 http://purl.obolibrary.org/obo/clo.owl)property_value( IAO:0000117 Yongqun He, Matthew Brush, Sirarat Sarntivijai, Alexander Diehl, Jie Zheng, Yu Lin, Bjoern Peters xsd:string)property_value( IAO:0000412 http://purl.obolibrary.org/obo/obi.owl)relationship( RO:0001000 CL:0000001)is_a( OBI:0001866{{is_inferred=true} }))
For details see: http://robot.obolibrary.org/errors#obo-structure-error
Use the -vvv option to show the stack trace.
Use the --help option to see usage information.
Hey @lmanchon,
The "ERROR Input ontology contains 3 triple(s) that could not be parsed" message has nothing to do with OBO. You can just ignore it; see #829.
OBO structure:
- OBO goes losslessly into OWL, but not the other way around. For example, having two labels or two definitions is illegal in OBO.
- That said, if you add --check false, you may be lucky and ROBOT will create a broken OBO file for you (one with multiple definitions, etc.). This may be OK for some use cases, but many strict OBO libraries may not be able to read an OBO file that does not conform to the OBO format.
- If you really have to convert an OWL ontology to legal OBO, you may have to massage the input, for example by deleting duplicate labels (a difficult problem; I think you would have to write a really advanced SPARQL query to do that).
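In the error above, the two def values for CLO:0000001 differ only by a trailing space. For that narrow case, a rough alternative to massaging the OWL input is to clean up the OBO text that --check false produces. The sketch below is only an illustration under assumptions: the function name collapse_duplicate_defs is made up, it expects def lines of the form def: "text" [xrefs], and it only drops repeats within a stanza that are identical after whitespace normalisation. Definitions that genuinely differ are left alone, so the file can still be non-conformant.

```python
import re

# Assumed shape of a definition line: def: "quoted text" [xrefs]
DEF_RE = re.compile(r'^def:\s*"((?:[^"\\]|\\.)*)"(.*)$')


def collapse_duplicate_defs(obo_text: str) -> str:
    """Drop repeated def: lines in a stanza that differ only in whitespace."""
    kept_lines = []
    seen = set()  # normalised definitions seen in the current stanza
    for line in obo_text.splitlines():
        if line.startswith("["):  # stanza header such as [Term] or [Typedef]
            seen = set()
        match = DEF_RE.match(line)
        if match:
            # Collapse whitespace runs in both the text and the xref part.
            key = (
                " ".join(match.group(1).split()),
                " ".join(match.group(2).split()),
            )
            if key in seen:
                continue  # same definition already kept for this stanza
            seen.add(key)
        kept_lines.append(line)
    return "\n".join(kept_lines) + "\n"


if __name__ == "__main__":
    sample = (
        "[Term]\n"
        "id: CLO:0000001\n"
        'def: "A cultured cell that is part of a cell line" []\n'
        'def: "A cultured cell that is part of a cell line " []\n'
    )
    print(collapse_duplicate_defs(sample))
```

This works on the text only and knows nothing about the ontology, so it is worth diffing the result against the original before using it downstream.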
You also asked elsewhere where to get OBO files from. Many OBO Foundry ontologies provide OBO format outputs, but to be honest, most use --check false. There is no guarantee they conform to the standard.
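Since published OBO files may have been generated with --check false, it can help to look for the specific violation discussed in this thread before feeding a file to a strict parser. The following is a minimal sketch, not a validator: find_repeated_tags is a hypothetical name, and it only counts the name and def tags mentioned above, using plain line-based parsing.

```python
from collections import Counter

# Tags this thread says may appear at most once per stanza (an assumed subset).
SINGLE_VALUED = ("name", "def")


def find_repeated_tags(obo_text):
    """Return (stanza id, tag, count) for each single-valued tag that repeats."""
    stanzas = []    # one dict per stanza: {"id": ..., "counts": Counter}
    current = None  # stays None while reading the file header
    for line in obo_text.splitlines():
        if line.startswith("["):  # stanza header such as [Term]
            current = {"id": None, "counts": Counter()}
            stanzas.append(current)
        elif current is not None and ":" in line and not line.startswith("!"):
            tag, value = line.split(":", 1)
            if tag == "id":
                current["id"] = value.strip()
            current["counts"][tag] += 1

    violations = []
    for stanza in stanzas:
        for tag in SINGLE_VALUED:
            if stanza["counts"][tag] > 1:
                violations.append((stanza["id"], tag, stanza["counts"][tag]))
    return violations


if __name__ == "__main__":
    sample = (
        "format-version: 1.2\n"
        "\n"
        "[Term]\n"
        "id: CLO:0000001\n"
        "name: cell line cell\n"
        'def: "A cultured cell that is part of a cell line" []\n'
        'def: "A cultured cell that is part of a cell line " []\n'
    )
    for stanza_id, tag, count in find_repeated_tags(sample):
        print(f"{stanza_id}: {count} '{tag}' tags")
```

An empty result only means this one check passed; it says nothing about other OBO format rules.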
For downstream processing I recommend using ROBOT table generation processes: https://oboacademy.github.io/obook/tutorial/sparql-report-robot/
So there is nothing obvious for the OWL to OBO conversion. That is a problem: the Chado schema and the Tripal module only recognize the OBO format.
To be honest, I think this would be a great ROBOT feature. I struggle with this so much as well. I have been debugging dozens of OBO format violations.
It's a problem with all these file formats. There should be a single standard format. Why not OWL, and drop the OBO format?
The OBO community requires all projects to publish an OWL file in RDF/XML format as their primary release product. All other products are optional. If you use OBO community projects, building your tools to use OWL in RDF/XML format makes good sense, so why doesn't everybody do that?
- A number of projects predate the OWL specification, and/or still rely on OBO-format toolchains.
- OBO-format is pretty simple to read and write, and there are libraries to work with it in a number of different languages, including Java, Rust, and Python. On the other hand, the only thorough implementation of OWL that I'm aware of has been OWLAPI, which is limited to Java and the JVM. That is starting to change with horned-owl (Rust), and early work I've been pushing to encode OWL logic in JSON ("wiring") so it can be inserted into SQL: ldtab.clj, and variations such as semantic-sql.