Proposal: Use YAML for .skel files
This would be a huge change, so I don't necessarily expect it to gain traction, but I think it would make everyone's lives better to use YAML instead of XML for .skel files. There are a few reasons I believe this:
(1) We use an awful lot of machinery just to parse XML files. In addition to using TinyXML2, we also have a ton of functions for interpreting the content of each element. But YAML has a C++ library called yaml-cpp which has a very nice API and which would not require any extra machinery on our end to read in everything we need. This library is cross-platform. I've already been using YAML and this library to construct and save things like Eigen data structures, so I'm confident that it would serve our needs.
(2) It is common for users to manually write, read, and manipulate robot model files. XML is a terrible and ugly format which only barely passes as human-readable. YAML on the other hand is extremely human readable. I think this would be especially advantageous since our skel format is a custom format of our own creation, so there does not exist any support right now for automatically generating .skel files, meaning everyone currently needs to generate the files by hand.
Another -- perhaps more reasonable -- option would be to deprecate the .skel format and come up with a new name for files that describe DART Worlds and Skeletons. This would allow us to maintain backwards compatibility.
I don't use .skel files, but I have thought for a while that if I were to create a new format, I would certainly use YAML. Users usually have experience with it (or with the related JSON) and it is rather nice to read. As Grey pointed out, the libraries available can just be included in the DART shipped code, so no dependency would be added.
PS.- Disclaimer: I am rather fond of .json, which also has a jsoncpp C++ Open Source library, so my opinion is surely biased.
I am a fan of YAML and already use yaml-cpp extensively in my own code. I think it's generally a great idea.
However, I do have a few reservations:
- YAML is complicated. YAML supports two types of tags, aliases, the merge key construct, multiple documents in one file, and all sorts of other goodies. Different parsing libraries handle different subsets of these features and implement them in subtly different ways. I've even run into issues with two versions of yaml-cpp treating local tags differently.
- Serialization is tricky. Because of this complexity, loading a file and immediately serializing it back to YAML often dramatically changes the appearance of the file. The output is often needlessly verbose because the emitter has no good way of choosing between block and flowed elements, e.g. it will happily print every element of a matrix on a new line.
- yaml-cpp API incompatibilities. Much of the yaml-cpp API changed between version 0.3 and 0.5. It's not, in general, possible to write code that works in both versions of the library. It's not possible to install both versions at once on Ubuntu because libyamlcpp-dev (version 0.5) conflicts with libyaml-cpp0.3-dev (version 0.3). This is quite a mess because parts of ROS still require the 0.3 API.
JSON is very simple and does not have any of these issues. However, I think it is a poor choice for .skel files because, as @mxgrey said, they are typically written by hand. JSON is very tedious to write by hand because, e.g. it forbids trailing commas and has no comments.
If our only concern is the machinery needed to parse SKEL files with TinyXML2, then we could consider using Boost.PropertyTree. It provides an API similar to that of yaml-cpp.
In terms of the file format itself, YAML definitely looks neater than XML, especially when the files have to be written by hand. I don't have an objection to using YAML if we are fine with the reservations @mkoval pointed out.
JSON does not seem like a good choice to me either, for the same reasons given above.
@mkoval has brought up a lot of really good points to consider, and I've been giving them some thought.
- YAML is complicated
Correct me if I'm wrong, but it looks to me like YAML has the potential for being complicated, but we can choose to use it in a simple way. To be honest, the things that you linked look mostly like gibberish to me at the moment, which does support your point about YAML being complicated. Are these all features that we necessarily need to be concerned about, or can we just choose not to use any of them?
- Serialization is tricky.
I imagine we can write a simple templated function for Eigen structures that will dump the contents of an arbitrary Eigen structure into YAML in a clean way. All of the data that would warrant careful formatting would be Eigen structures anyway, so that single templated function ought to be able to cover most of our formatting needs.
I haven't actually looked into implementing this yet, so let me know if it might not be as straightforward as I'm imagining.
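To make the idea concrete, here is a minimal sketch of such a templated function. It is only an illustration under stated assumptions: `matrixToYaml` and `DemoMatrix` are hypothetical names, not part of DART or yaml-cpp, and the function writes the YAML text directly through a string stream rather than going through any YAML library. It assumes only that the matrix type exposes `rows()`, `cols()`, and `operator()(i, j)`, which Eigen dense matrices do. Each matrix row is emitted as one bracketed list on its own line.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of DART or yaml-cpp). Writes any matrix-like
// object that exposes rows(), cols() and operator()(i, j) as a YAML sequence
// with one bracketed row per line, e.g. "- [1, 0, 0]".
// Values use the default stream precision (6 significant digits); a real
// serializer would probably want to raise that.
template <typename Matrix>
std::string matrixToYaml(const Matrix& m, const std::string& indent = "")
{
  std::ostringstream out;
  for (decltype(m.rows()) i = 0; i < m.rows(); ++i)
  {
    out << indent << "- [";
    for (decltype(m.cols()) j = 0; j < m.cols(); ++j)
    {
      if (j > 0)
        out << ", ";
      out << m(i, j);
    }
    out << "]\n";
  }
  return out.str();
}

// Tiny row-major stand-in for an Eigen matrix, so that the sketch compiles
// without Eigen installed. Purely illustrative.
struct DemoMatrix
{
  std::size_t nRows;
  std::size_t nCols;
  std::vector<double> data;

  std::size_t rows() const { return nRows; }
  std::size_t cols() const { return nCols; }
  double operator()(std::size_t i, std::size_t j) const
  {
    return data[i * nCols + j];
  }
};
```

Calling `matrixToYaml` on a 3 x 3 identity `DemoMatrix` yields three lines of the form `- [1, 0, 0]`. The optional `indent` argument is there so the result can be nested under a key in a larger document.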
- yaml-cpp API incompatibilities.
This is probably the most concerning thing to me. API versioning and compatibility issues can be a dealbreaker. However, it looks like the yaml-cpp license is very open, and would permit us to include a copy in the DART repo. We could hide it behind a dart::yaml namespace to avoid namespace collisions. I could understand if someone objects to this idea on the grounds that it would be overkill, but it seems preferable to the potential nightmare of API/ABI incompatibilities.
I don't know of any other markup formats that would allow a user to easily "script" a skeleton by hand, and I think YAML would be extremely well suited for this. But if anyone has any other suggestions, I'm definitely open to it. XML has always seemed like a rather ugly thing to have to handle manually, and I'd very much like to replace it if a better alternative exists.
Correct me if I'm wrong, but it looks to me like YAML has the potential for being complicated, but we can choose to use it in a simple way. To be honest, the things that you linked look mostly like gibberish to me at the moment, which does support your point about YAML being complicated. Are these all features that we necessarily need to be concerned about, or can we just choose not to use any of them?
For better or for worse, all of these features are part of the YAML spec. They're mostly handled by the YAML library (e.g. yaml-cpp), but they occasionally leak through. For example, YAML files may contain cycles. This complicates parsing, since a naïve parser may get stuck in an infinite loop.
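As a rough illustration of the cycle problem, here is a sketch of how a loader could guard against it. The `Node` type below is a hypothetical stand-in for whatever tree a YAML library hands back, not yaml-cpp's `YAML::Node`, and `hasCycle` is an invented name. The idea is a depth-first walk that remembers which nodes are on the current path, so a node that merely appears twice through an alias is accepted, while a node that contains itself is reported.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical in-memory node, standing in for a parsed YAML node.
// An alias can make a child pointer refer back to one of its own ancestors.
struct Node
{
  std::vector<const Node*> children;
};

// Depth-first walk. onPath holds the nodes between the root and the node
// currently being visited; meeting one of them again means there is a cycle.
bool hasCycleFrom(const Node* node, std::unordered_set<const Node*>& onPath)
{
  if (node == nullptr)
    return false;
  if (!onPath.insert(node).second)
    return true; // node is already on the current path
  for (const Node* child : node->children)
  {
    if (hasCycleFrom(child, onPath))
      return true;
  }
  onPath.erase(node); // leaving this node, so it is no longer an ancestor
  return false;
}

// Convenience wrapper: returns true if any node reachable from root
// contains itself, directly or indirectly.
bool hasCycle(const Node& root)
{
  std::unordered_set<const Node*> onPath;
  return hasCycleFrom(&root, onPath);
}
```

A loader that walks the tree recursively could run a check like this first and reject the file, instead of recursing forever. For heavily shared trees this simple version may revisit subtrees many times; a second set of fully explored nodes would avoid that.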
I imagine we can write a simple templated function for Eigen structures that will dump the contents of an arbitrary Eigen structure into YAML in a clean way. All of the data that would warrant careful formatting would be Eigen structures anyway, so that single templated function ought to be able to cover most of our formatting needs.
I am not sure how to tell recent versions of yaml-cpp (> 0.5) to switch between block and flow style. It appears that the latest master added a SetStyle method to YAML::Node to enable this. However, this function does not exist in the version of yaml-cpp shipped with Ubuntu 14.04 and is labeled with: "WARNING: This API might change in future releases."
Setting block vs flow has a huge effect on readability. Suppose you have a 3 x 3 identity matrix. I would write the matrix in YAML like this:
- [1, 0, 0]
- [0, 1, 0]
- [0, 0, 1]
In that case, the outer list is block and the inner lists are flowed. This is what it looks like with everything flowed:
[[1, 0, 0], [0, 1, 0], [0, 0, 1]]
And, even worse, this is what it looks like with everything block. I believe this is the default in yaml-cpp:
-
  - 1
  - 0
  - 0
-
  - 0
  - 1
  - 0
-
  - 0
  - 0
  - 1
yaml-cpp API incompatibilities.
This is actually less of an issue than I realized. There was a major API change in yaml-cpp between versions < 0.5, which was the default in Ubuntu 12.04, and > 0.5, which is the default in Ubuntu 14.04. The API should be relatively stable now.
This is a moot point. In my experience, it's nearly impossible to build DART on Ubuntu 12.04 anyway: it requires installing an updated version of GCC (for C++11 support) and several system libraries (e.g. Assimp, CCD, FCL). It seems safe to assume, at least on Linux, that the user will have access to yaml-cpp >= 0.5.