sofa icon indicating copy to clipboard operation
sofa copied to clipboard

[all] Remove pointer to itself in initData

Open alxbilger opened this issue 2 years ago • 6 comments

I wondered why it is necessary to pass the pointer to the Data to initialize it in the initData function. I tried to remove it, and nothing bad seems to happen. So, why?

Of course, the initData mechanism is questionable. But this is not the scope of this PR.

If it appears that passing the pointer is not necessary, this PR will have to add a compatibility mechanism and perhaps a depreciation.

The files to reviews are mainly:

  • BaseClass
  • BaseData
  • Base

By submitting this pull request, I acknowledge that
I have read, understand, and agree SOFA Developer Certificate of Origin (DCO).


Reviewers will merge this pull-request only if

  • it builds with SUCCESS for all platforms on the CI.
  • it does not generate new warnings.
  • it does not generate new unit test failures.
  • it does not generate new scene test failures.
  • it does not break API compatibility.
  • it is more than 1 week old (or has fast-merge label).

alxbilger avatar Jul 25 '22 11:07 alxbilger

[ci-build]

alxbilger avatar Jul 25 '22 11:07 alxbilger

maybe a historical explanation? @JeremieA @damienmarchal

hugtalbot avatar Jul 27 '22 09:07 hugtalbot

[ci-build]

alxbilger avatar Jul 27 '22 09:07 alxbilger

For historical explanations:

  • we need to gather the pointers to all Data in a Base instance, so there must be a function somewhere that is called with both the pointer to the Base* object and to the Data*.
  • a simple option could be to have the Base* pointer as a parameter of the Data constructor, however most compilers issue a warning if this is explicitly used when initializing member variables, as it is not fully constructed yet
  • however, compilers are happy with calling a method, which itself use this, hence the initial implementation of initData() (which was called dataField()) had the this pointer implicitely, but needed the Data* pointer to add if to the "fields" map, which is why it was provided as an explicit parameter to the method
  • later changes refactored this to add new features (such capturing which level of the class hierarchy is creating each Data), and the actual registration of the Data* in the list within Base was moved to the Data constructor, where the Data* pointer is available so the one from initData() became redundant. The newer similar design for initializing BaseLink did use an initLink() method without this pointer.

And regarding this change:

  • a useful aspect of having the pointer to the Data as the first parameter of initData() is to unambiguously know the type of the final Data<T>, so that there could be (relatively) good compile errors if the type of the provided default value is not matching it. I'm not sure how this PR is affecting / handling that (i.e. if you set a float value as a default to a Data<int>, how the gcc/clang/msvc error messages look like?)
  • one drawback in the current design is that there are sometimes bugs introduced by the multiple overloads of initData(), because of the ambiguity between the optional default value and the optional pair of bool flags at the end. In particular, for a Data<std::string>, calling initData(&d, "mydefaultvalue", "mydata", "help message") will incorrectly use the initData() version without default value but with one boolean flag, as "mydefaultvalue" is a const char* which requires a "user-defined" conversion to std::string, while "help message" is implicitely convertible to bool without any "user-defined" conversion, so preferred by the compiler. So if we break the compatibility to the current initData() API, I would strongly suggest to remove this issue, by removing the two optional bool flags which are seldom used anyway (and can be replaced by a call to setDisplayed() or setReadOnly()).
  • compared to when this code was originally written, c++ now allows to put initializers of member variables in the class declaration directly, which for Data members could be preferable to initializing them in the constructor (the name and help message would be visible in the header instead of the implementation, it does not need to be duplicated in case there are multiple constructors, and there would no longer be any unhelpful warnings on some compilers about mismatched initialization orderings). I'm not suggesting to do that in all the components in this PR (as the diff would be horible to review), but any new designs of initData should check that this is possible (last time I looked at that, gcc and msvc did not agree on if we needed to write Data<> d = initData(...) versus Data<> d { initData(...) } versus Data<> d = { initData(...) }, but it was a long time ago...)

JeremieA avatar Jul 27 '22 14:07 JeremieA

Thanks for the explanation @JeremieA.

For the record, this PR was just a preamble to open the discussion and was not totally finished.

One thing I noticed when preparing this PR is the ambiguities that you mentioned. As you suggested, my plan was to remove the optional parameters (or even implement a builder pattern). For the ambiguity on the type, I would like to experiment different things.

A few months ago, I suggested the initialization that you mentioned (https://github.com/sofa-framework/sofa/pull/2613), but it has been decided to continue working to find a better solution.

I'll continue to think about it.

alxbilger avatar Jul 27 '22 18:07 alxbilger

Hi all,

Thanks for the discussion.

My current understanding is that removing the whole InitData Mechanisme is:

  • possible,
  • desirable (would make the code more clear and maintainable)

This mean converting every

myData(initData(&myData, 0.1, "myData", "helpe)) 
/// into 
myData(this, 0.1, "myData", "helpe");  

About the use of "this" in a child constructor. My understanding is that it is considered safe for non virutal methods and attributes from the superclass (https://isocpp.org/wiki/faq/ctors#using-this-in-ctors)

I also tried a pure constructor based initialization with g++ and clang++ and got no warnings with -Wall and -pedantic.

About the ambiguity case cause by the char* to bool implicit conversion... I would say it would be much easier to fix when the initData machinary will be dropped.

EDIT: One more thing about the bool to specify the parameter... I think that using directly the enum 'named" value or the setXXXX method to set the parameter is far better than using bool.

damienmarchal avatar Sep 01 '22 14:09 damienmarchal