R3BRoot icon indicating copy to clipboard operation
R3BRoot copied to clipboard

Design Guidelines for R3BRoot

Open klenze opened this issue 1 year ago • 1 comments

I have recently read new code meant for inclusion in R3BRoot which violates what I consider important parts of the C++ core guidelines, like C.45 or C.2.

It is clear that we can not fully follow the guidelines everywhere, for example, "NR.5: Don’t use two-phase initialization" is already violated by FairSoft (if one assumes that three-phase initializations are also against the spirit of the guidelines), so there is little we can do to avoid them.

I would appreciate it if we could have a discussion about the merits of the C++CG, the "ROOT way" and various other schools of software design, with the desired outcome being a (somewhat) binding standard, for example, "Unless noted below, we follow the C++CG. Instead of CG XYZ.23, we prefer to do ... . ..."

Let me start this discussion with the following proposals: C.45: Don’t define a default constructor that only initializes data members; use in-class member initializers instead.

I prefer this over the R3BRoot style of having default constructors with endless initializer lists for the following reasons:

  • The reader of the header can be sure that every member variable will have a defined value for any constructor
  • The initialization order is automatic
  • It is shorter (especially if both the default constructor and the normal constructor set some pointer to nullptr).
  • The reader can quickly check that the default values are appropriate for the type.

--

Speaking of initialization, appropriate invalid values are:

  • for signed integer types: std::limits<T>::min() (Unless possibly valid)
  • for unsigned integers: std::limits<T>::max() (Unless possibly valid)
  • Pointers: nullptr
  • for floating point values: NAN (It is ok to mark an overflow value as infinity)

--

C2: Use class if the class has an invariant; use struct if the data members can vary independently

This also seems straightforward to me. Virtually none of our data classes have invariants, so they should be structs.

--

If we must have getters and setters, declare them in-line. Use auto as the return type for getters.

--

It is permissible to have non-trivial functionality in data classes/structs.

For example, rather than TofD calculation auto bot_tot=fmod(bot->GetTimeTrailing_ns() - bot->GetTimeLeading_ns() + c_range_ns, c_range_ns); appears in many places and should be moved to R3BTofdCalData::GetTot().

Or for an R3BCalifaClusterData, it would be preferable to have a method XYZVector GetPositionSmeared().

The fact that the data decays into a plain tuple of values when we open the root file without R3BRoot definitions should not stop us from treating it as a proper C++ object before.

--

In particular, this means that data constructors can do nontrivial stuff.

So instead of having a constructor which takes six arguments, the order of which nobody will ever remember, pass to the constructor the relevant const pointers/references, and let it query the objects and do its calculations on its own.

If you absolutely can not do that, consider only setting the channel id in the constructor and setting the other members after the object is constructed.

--

It is permissible to store redundant information to increase convenience. For example, the position of a cluster in califa is calculated as the position of the crystal with the highest energy within that cluster, so it could be fully described by the crystal id of that crystal. In practice, going from crystal id to XYZVector is painful, thus it is acceptable to redundantly store the position in R3BCalifaClusterData.

Likewise, fetching the trigger time for an R3BTofdCalData entry involves looking up the trigger id of that channel in a map and then looking in another TClonesArray. I would argue for just putting another double value into the class.

--

In R3BRoot, do not interface plain TClonesArray's unless you are writing generic code for interfacing TClonesArrays. Furthermore, having a task specific AddData method which takes the six arguments which it will pass on to the constructor is a lot of fragile interface with very little functionality.

--

Using store=!fOnline is terrible. If your class was instantiated, assume that the user is interested in the output. It might be ok to have the ability to set fTransientOutput to allow the user to specify that it only wants the output of your task for feeding another task.

"Online" instances should use a FairSink which works like /dev/null. (Setting /dev/null as the output file works until root helpfully decides that 100GB is big enough and switches to the next file.)

It would be preferable to have the fTransientOutput, SetTransientOutput() being part of FairTask.

--

TClonesArrays should be replaced by STL types as soon as possible.

--

To be continued.

klenze avatar Mar 31 '23 20:03 klenze