[WIP] Add infrastructure for state serialization
This is the first crack at State serialization, which is not yet complete. This contains the necessary infrastructure to provide toXmlElement() and fromXmlElement() support for general types and (most important) for AbstractValue and Value<T>.
See TestXml.cpp for test cases forAbstractValue and Value<T>. State.toXmlElement() and State.fromXmlElement() exist and will work if all the contained AbstractValue types have those methods, or if they have stream extraction/insertion operators from which workable defaults an be created.
No State serialization tests are committed here yet. Those need to get written and debugged. @chrisdembia, if you want to work on this have a look and then we can discuss.
Fantastic! I'll start looking at this soon.
Unfortunately, std::is_null_pointer<T> is C++14, not C++11.
std::is_null_pointer<T> is C++14, not C++11
Removed check for now. Should pass on Linux and OSX now; still one mysterious test failure on Windows.
I investigated the Windows failure. It appears to be a problem with the Visual Studio linker not correctly generating a single copy of each Value<T> class so that its static members are shared by all. Hence it tries to re-register the deserialization method for Value<double> and triggers an error message. It would be easy to remove the check and ignore subsequent registration (in which case all the tests will pass), but I am worried about what will happen on deserialization because dynamic_cast may not recognize the two Value<double> objects as being the same type. I'll leave it for now as a reminder that further investigation is needed; there may be a workaround that will ensure the linker works properly.
Ah I know very little about that stuff, but would extern help?
Ah I know very little about that stuff, but would extern help?
Can't be extern because Value<T> is templatized and thus has to be defined in headers. The linker then has a nasty job to perform to ensure that every compilation unit that has instantiated Value<SomeType> for the same template argument uses the same static data members -- I'm depending on that to ensure that each type is registered only once for deserialization.
Maybe it is desirable to prefer a free function (that is, template specialization of toXmlElement()) over a member function) if both exist.
No, as explained in the Unified Call Syntax doc you linked to, the member function is to be preferred because it was provided by the class designer.
Drawing a parallel to std::begin() and std::end() ...
Coincidentally that is also discussed in the Unified Call Syntax doc, where it was mostly described as a hack for range-for; it is definitely not a general principle and it didn't seem that they were particularly happy with it. (That's Stroustrup and Sutter so I'm inclined to take them seriously!)
One potential issue with preferring the member function over a free function: say today there is a type X that I want to serialize but I don't have control over it, so I write a free function. Then next year, the author of X decides to add a toXmlElement() member function to that class. Now, the functionality has silently changed.
Now, the functionality has silently changed.
Hmmm ... any time someone changes the code of a class you are using the behavior can change! Hopefully the author took seriously the need to serialize the class and now you won't have to. Otherwise I recommend sending a nasty email!
I suppose that's true.
@klshrinidhi seemed to be able to get rid of the third argument!
One way to get rid of the name toXmlElementHelper is to rename toXmlElement to toXmlElementImpl, to signify this is the implementation of a function that does the conversion, freeing the use of the name toXmlElement to the interface.
I have started working on a TestStateXml test for Simbody, and will open a PR for that eventually.
Separately, I gave my first shot at writing out an OpenSim Model's State to XML. I got the following error:
26: Test failed due to exception: SimTK Exception thrown at String.h:353:
26: Error detected by Simbody method String::String(T): Type T=SBModelVars has no stream insertion operator<<(T) and there is no specialized String(T) constructor.
26: (Required condition '!"no stream insertion operator"' was not met.)
I think this means (1) SBModelVars does not have a toXmlElement() yet, and (2) the error for when toXmlElement() is not implemented for T does not indicate that the issue was a missing toXmlElement(). Actually, I don't understand how this was a run-time error instead of a compile-time one.
- It's complaining about the stream insertion operator because that is used by default to serialize when there is no toXmlElement() available. Actually it is more contorted than that because it first falls back to
writeUnformatted()which is how we getinfandnanto come out right among other things. It iswriteUnformatted()that falls back to stream insertion. (Links are to the relevant code.) - Not sure whether the above contortions are a good idea, but they do save writing a lot of redundant toXmlElement() methods. Better ideas?
- The reason it is a run time error is that there are many uses of
Value<T>that don't require serializing and I didn't want to make people provide serialization methods if they don't intend to serialize. The originalValue<T>did have this requirement (everyone had to provideoperator<<()) and it turned out to be a huge inconvenience and resulted in many indecipherable compilation errors.
One way to get rid of the name toXmlElementHelper is to rename toXmlElement to toXmlElementImpl, to signify this is the implementation of a function that does the conversion, freeing the use of the name toXmlElement to the interface.
I don't understand. toXmlElement() is the name in the interface at the moment. Please clarify.
Not sure whether the above contortions are a good idea, but they do save writing a lot of redundant toXmlElement() methods. Better ideas?
Thanks for explaining. I like idea of piggybacking off writeUnformatted() and the stream operator. I'm also fine with the run-time message (over a compile-time message). What about just catching such exceptions in the toXmlElement() function that calls writeUnformatted() to provide a more informative message?
I don't understand. toXmlElement() is the name in the interface at the moment. Please clarify.
I mean the interface for the implementor of toXmlElement() methods. I'm thinking of the following: The interface for converting an object to an XML would be SimTK::toXmlElement(obj) (a free method). It may or may not work, depending if obj supports serialization. For some objects that users are more likely to want to serialize, we provide a convenience member function, e.g. State::toXmlElement(). To provide the ability to serialize an object, you either implement toXmlElementImpl(const T&) or T::toXmlElementImpl().
auto xmlDoc = SimTK::toXmlElement(SimTK::State()); // this would work and call SimTK::toXmlElementImpl
Xml::Element SimTK::State::toXmlElement() {
return this->toXmlElementImpl();
}
Xml::Element SimTK::State::toXmlElementImpl() {
...
elem.appendNode(toXmlElement(...));
elem.appendNode(toXmlElement(...));
}
Let me know if I'm still not being clear. The gist is that if you want to obtain an Xml::Element, you call toXmlElement(). If you want to provide the ability to convert to Xml, you implement T::toXmlElementImpl(). The fuzzy area is member functions that you call to obtain an Xml::Element, since it's weird for a user to call a method toXmlElementImpl().
Alternatively, we could just rename toXmlElementHelper() to something that more clearly conveys the meaning findSomeWayToConvertThisThingToXmlForMe(). Like convertToXmlElement(), or createXmlElement() or formXmlElement().
I still like the idea of handling the potentially error-prone version and element name code in some central way. Here's a sketch of what I'm thinking:
class XmlSerialization {
public:
XmlSerialization(std::string type, int version, std::string attrName="");
template <typename T>
void appendThing(std::string name, T value);
Xml::Element getXmlElement() const;
};
...
Xml::Element Measure_Differentiate_Result::toXmlElement(const std::string& name) const {
static const int version = 1;
XmlSerialization xml("Measure_Differentiate_Result", version, name);
xml.appendThing("operand", operand);
xml.appendThing("operandDot", operandDot);
xml.appendThing("derivIsGood", derivIsGood);
return xml.getXmlElement();
I think there are a bunch of flaws with this proposal. For example, how would we require that people use the XmlSerialization class instead of writing their own serialization? Also, I haven't thought through how we would actually handle different versions, or how to combine this with the corresponding fromXmlElement() method. Also, perhaps the decentralized nature of the current serialization scheme is fine (where each class is in charge of doing it correctly). Anyway, it gets rid of some of the boilerplate.
It just sorta feels like the valuable information is "type name", "version", and a list of the members to serialize. It would be good if that were all one needed to specify.
Well, I'm more confused now! What I want to achieve is Xml::Element = state.toXmlElement() and state.fromXmlElement(element). Those are problematic because States contain AbstractValue objects, whose underlying Value<T> objects can be arbitrary existing types or user-defined types. That requires a user-written method for serializing type T that can be embedded in the serialization of Value<T>, and a corresponding user-written method that can be invoked when deserializing Value<T> objects. These should be member functions when possible (for general object-oriented code reasons), free functions otherwise.
I want those functions to be easy enough to write that there is a high probability of it being done successfully. Since user types will have members of arbitrary type (including other user types) the best way to do this would be recursively. So I'd like to be able to write a T.toXmlElement() method that invokes the TT.toXmlElement() methods of the data members of type TT. But that can't work exactly because we don't know if TT has a member, so I invoked the Helper there to figure out which method to invoke. @klshrinidhi's fix makes that more appealing since it gets rid of the mysterious third argument. It would be perfect I think if it could be named toXmlElement() so that a user writes the T.toXmlElement() method by calling a series of toXmlElement(TT) helper functions which map either to TT's member or a free function toXmlElement(TT). But I don't think that's possible because the helper and the free function would have the same name (is it?).
Anyway I could live with writing the member function or free function toXmlElement() in terms of toXmlElementHelper() with the final argument gone. I guess that's not so bad. Another alternative would to name the member function and helper function toXmlElement() but name the free functions toXmlElementFree() or something.
But I don't think that's possible because the helper and the free function would have the same name (is it?).
Even if it were possible, I think it might be confusing to use the same method name for so many things.
Anyway I could live with writing the member function or free function toXmlElement() in terms of toXmlElementHelper() with the final argument gone. I guess that's not so bad. Another alternative would to name the member function and helper function toXmlElement() but name the free functions toXmlElementFree() or something.
I think either of these are fine.
@sherm1 I am starting to add in the necessary toXmlElement() methods on other types. I was going to start working on SBInstanceVars but I'm not sure I actually know how to serialize this class. While some of the members seem like part of the state (mobilizerLockLevel, constraintIsDisabled), my gut says other members aren't actually part of the state, like bodyMassProperties.
class SBInstanceVars {
public:
Array_<MassProperties,MobilizedBodyIndex> bodyMassProperties;
Array_<Transform, MobilizedBodyIndex> outboardMobilizerFrames;
Array_<Transform, MobilizedBodyIndex> inboardMobilizerFrames;
Array_<Motion::Level, MobilizedBodyIndex> mobilizerLockLevel;
Vector lockedQs;
Vector lockedUs; // also used for udot
Array_<bool, MobilizedBodyIndex> prescribedMotionIsDisabled;
Vector particleMasses;
Array_<bool,ConstraintIndex> constraintIsDisabled;
Could you tell me which of these members should be serialized?
(SBInstanceVars) Could you tell me which of these members should be serialized?
It's a variable, so everything in it should be serialized. Body mass properties are instance variables, but there aren't currently any methods for setting them so they just keep their default values.
there aren't currently any methods for setting them so they just keep their default values.
They are all public members though.
They are all public members though
But in an internal struct -- can't be seen from the API. Serialization is agnostic about public and private since all data has to get written out in order to get it back later.
Okay thanks for clarifying.
For deserializing, the fromXmlElementHelper() calls a member function that takes an Xml element and fills in the this object from that Xml element. However, the member function for the PerSubsystemInfo class has an (unimplemented) fromXmlElement() function that is static and takes a PerSubsystemInfo& as an argument. Which do you think makes more sense for fromXmlElement() member functions? To fill in this, or to fill in a provided-by-ref object?
I think I debated that with myself and ended up deciding it was (slightly) better to fill in the this object. So probably the unimplemented method represents some earlier point in my self-debate but should be converted now. I'm not sure I had any great arguments either way, but I like the symmetry of being able to say thing.toXmlElement() and thing.fromXmlElement().
Okay; sounds good to me.
There is a small issue with the error message one gets when trying to deserialize a type that doesn't have a fromXmlElement() implementation. Since readUnformatted() and (tryConvertTo()) will be available for any T, then the fromXmlElement() that calls readUnformatted() will be used if fromXmlElement() doesn't exist for T. This is fine in many cases, since you check readOK and give an error.
template <class T> inline void
fromXmlElement(T& thing, Xml::Element& elt,
const std::string& requiredName="") {
if (!requiredName.empty()) {
const String& name = elt.getElementTag();
SimTK_ERRCHK2_ALWAYS(name==requiredName, "fromXmlElement<T>",
"Expected element name '%s' but got '%s'.", requiredName.c_str(),
name.c_str());
}
std::istringstream is(elt.getValue());
const bool readOK = readUnformatted(is, thing);
...
There is one case, though, in which this leads to a confusing error message, and it stems from requiredName having two different uses. Sometimes, the "requiredName" applies to the tag name (as above), and in other cases, it applies to an attributed named "name". So the issue arises when trying to deserialize a type whose "requiredName" is an attributed named "name", but fromXmlElement() is unimplemented and so the above generic fromXmlElement() gets invoked, and thinks that the requiredName should be the tag name. This situation results in the following error message (for PerSubsystemInfo:
Error detected by Simbody method fromXmlElement<T>: Expected element name '0' but got 'Subsystem'
The simplest solution to me sounds like having only one interpretation for requiredName; that is, as an attribute named "name." I think this is possible by simply making the generic toXmlElement() wrap another element around the output of writeUnformatted().
Yeah, that dual-use name thing is odd. A better alternative would be nice.