Allow specifying input type in `robot convert`

Open cthoyt opened this issue 3 years ago • 9 comments

Right now, when I get an error from `robot convert`, I have to add `-vvv` to get logs to figure out what is broken in the ontology, but this includes the logs from every parser robot tried. It would be nice to have an `--input-format` flag that lets me specify exactly what format the input is, so the iteration through every possible parser can be avoided and I can get to the relevant logs more easily.

E.g., I need to Ctrl-F for `org.semanticweb.owlapi.oboformat.OBOFormatOWLAPIParser` every time I am trying to parse an OBO file, and the class names for each of these parsers are pretty hard to remember.
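
To illustrate the request, here is a minimal Python sketch of the difference between trying every parser and honoring an explicit input format. It is not ROBOT or OWLAPI code: `parse_obo`, `parse_rdfxml`, `PARSERS`, and `load` are hypothetical stand-ins, and the checks inside the toy parsers are placeholders for real parsing.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-ins for real parsers: each takes text and either
# returns a parsed result or raises ValueError.
def parse_obo(text: str) -> str:
    if not text.lstrip().startswith("format-version:"):
        raise ValueError("OBO: missing format-version header")
    return "obo-ontology"

def parse_rdfxml(text: str) -> str:
    if "<rdf:RDF" not in text:
        raise ValueError("RDF/XML: no rdf:RDF root element")
    return "rdfxml-ontology"

PARSERS: Dict[str, Callable[[str], str]] = {
    "obo": parse_obo,
    "rdfxml": parse_rdfxml,
}

def load(text: str, input_format: Optional[str] = None) -> Tuple[Optional[str], List[str]]:
    """Return (result, error_log). With input_format set, only that parser runs."""
    names = [input_format] if input_format else list(PARSERS)
    errors: List[str] = []
    for name in names:
        try:
            return PARSERS[name](text), errors
        except ValueError as exc:
            # Every parser that fails contributes its own message to the log.
            errors.append(str(exc))
    return None, errors
```

With no format given, a broken OBO file collects one error per parser tried; with `input_format="obo"`, the log contains only the OBO parser's message, which is the one the user actually needs.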

cthoyt avatar Aug 05 '22 12:08 cthoyt

These parsing messages are a nightmare. I agree, and I would even like to go a bit further and whittle down the error message in the stack trace a bit. I think this is a good idea @cthoyt and should not be too hard to implement. I won't do it right now, but if pressure mounts on this ticket I might be convinced to do it.

matentzn avatar Aug 05 '22 14:08 matentzn

Another reason to prioritize: junk input, such as the HTML we get back from a misconfigured PURL, is frequently valid input for some parser, so parsing silently passes and produces an empty ontology when it should fail. A lot of people have been tripped up by this. If we were designing from scratch, I would force specifying an input type and only iterate through all parsers if explicitly requested.
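
As a rough illustration of guarding against this outside of ROBOT, a caller could sniff the downloaded payload before handing it to any parser. The sketch below is a heuristic only; `looks_like_html` and its `probe_size` parameter are hypothetical names, not part of ROBOT or OWLAPI.

```python
def looks_like_html(data: bytes, probe_size: int = 512) -> bool:
    """Heuristic: does the start of the payload look like an HTML page?"""
    # Drop a UTF-8 byte order mark and leading whitespace, then lowercase
    # the first few hundred bytes for a case-insensitive comparison.
    head = data[:probe_size].lstrip(b"\xef\xbb\xbf \t\r\n").lower()
    return head.startswith(b"<!doctype html") or b"<html" in head
```

A payload that starts with an XML declaration followed by `rdf:RDF` is not flagged, while a typical error page returned by a web server is. A check like this only catches one kind of junk input; it does not replace failing on an empty parse result.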

cmungall avatar Aug 09 '22 16:08 cmungall

@cthoyt Can you please look at PR #1056 and let us know if it works for you?

jamesaoverton avatar Oct 01 '22 14:10 jamesaoverton

Indeed that PR looks like it will get the job done.

cthoyt avatar Oct 01 '22 14:10 cthoyt

@cthoyt: @balhoff discovered that the effect of limiting the parsers used is transitive to the imports. I didn't think of that, but it's just the way OWLAPI implements it. So is the feature still useful to you?

jamesaoverton avatar Dec 07 '22 15:12 jamesaoverton

@jamesaoverton does this mean if there's a mismatch then it will explode? I think it will still be useful to just get the base ontology information.

cthoyt avatar Dec 07 '22 15:12 cthoyt

My understanding is that if you ask for RDF/XML then only that parser will be used, for the ontology and all of its imports. So if an import is in Turtle (or whatever) that import will fail to parse.

jamesaoverton avatar Dec 07 '22 15:12 jamesaoverton

So if the import failing to parse doesn't cause the entire job to break, then this is fine for me.

cthoyt avatar Dec 07 '22 15:12 cthoyt

We definitely want this to fail fast: if O1 imports O2 and O2 is not parseable, then loading O1 should fail.

I think if you want to avoid import closures, this needs to be configured separately, but that seems like a separate ticket.
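
The behavior discussed above (the parser restriction applying to the whole import closure, with loading failing as soon as one import cannot be parsed) can be modeled with a toy Python sketch. This is not how OWLAPI is implemented; `Document`, `ParseError`, and `load_closure` are hypothetical names, and "parsing" is reduced to checking a format label.

```python
from typing import Dict, List, NamedTuple, Set

class Document(NamedTuple):
    fmt: str             # serialization of this document, e.g. "rdfxml"
    imports: List[str]   # names of directly imported documents

class ParseError(Exception):
    pass

def load_closure(name: str, docs: Dict[str, Document], allowed: Set[str]) -> List[str]:
    """Load `name` and its imports depth-first; raise on the first unparseable one."""
    loaded: List[str] = []

    def visit(current: str) -> None:
        if current in loaded:
            return  # already loaded, also guards against import cycles
        doc = docs[current]
        # The same set of allowed formats applies to every document in the closure.
        if doc.fmt not in allowed:
            raise ParseError(f"{current}: no allowed parser for format {doc.fmt!r}")
        loaded.append(current)
        for imported in doc.imports:
            visit(imported)

    visit(name)
    return loaded
```

In this model, if O1 is RDF/XML and imports a Turtle-serialized O2, then loading O1 with only `"rdfxml"` allowed raises `ParseError` for O2 instead of returning a partial result.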

cmungall avatar Dec 07 '22 19:12 cmungall