ginkgo
ginkgo copied to clipboard
Ensuring non-null arguments
The problem
It almost never makes sense to pass a null pointer to one of Ginkgo's methods. There's usually no way to continue and the best we can do is just throw an exception. However, we don't ensure this behavior anywhere yet, so passing a null pointer will result in a crash when we try to dereference it.
Usual solution
The proposed C++ way (Sutter's Mill, also suggested by @mhoemmen) is to use a reference (&
) instead of a pointer for objects that cannot be null (assuming we are not giving any onwership).
I must say I don't like this approach as it has some drawbacks (in the order of importance, so if you are convinced by statement n
, you can skip statements m > n
):
- How do you handle non-null objects when you do transfer ownership? The solution doesn't handle the case of non-null
uniqe_ptr
orshared_ptr
, just a plain pointer. - Plain pointer / reference, shared pointer, and unique pointer determine the ownership of the object, not the functionality of the underlying object. However, using a reference instead of a pointer changes the way we use the object (instead of
->
we need to start using.
). This non-uniformity makes it more difficult to write code that doesn't depend on the kind of ownership we have (and may be confusing for some inexperienced users).- The first example is the
gko::clone
utility proposed in #30. While we only need a single implementation, that internally callsobj->clone()
independent of what kind of ownership the object has, supporting references would require writing another version of the method that callsobj.clone()
instead. - Say I'm writing a complicated function:
And I realize that I might want to simplify the function and make two things reusable by separating them into functions, making "1st thing" a functionstd::unique_ptr<LinOp> x = /* something */; // do 1st thing with x (multiple statements, using x->something()) // do 2nd thing with x (multiple statement, using x->something())
void f1(LinOp &x)
, and the second thingvoid f2(LinOp &y)
. I didn't change the functionality, but I have to find and replace allx->something()
withx.something()
just because I want to assert thatx
cannot be null. - This is a revers situation than the previous example. Suppose I have a following function:
Then I figure out I have a bug, and that "thing 2" should work on the originalvoid f(LinOp &x) { // do thing 1 with x // do thing 2 with x }
x
, and "thing 1" should have a temporary copy of 'x'. I decide to addauto tmp = gko::clone(x)
, and usetmp
instead. Now I sure have to replace every occurrence ofx
withtmp
, but I also have to change how I call things ('.' with '->'), which can be a bit confusing.
- The first example is the
Proposed solution
Rather than mixing references/pointers just to express if the argument can be null, I would suggest adding a decorator that throws a runtime error if the object is null. Here is the pseudocode (that would probably have to be refined a bit to handle special cases, e.g. plain pointer):
template <typename Pointer>
class non_null {
public:
template <typename... Args>
non_null(Args...&& args) : p(std::forward<Args>(args)...) {
if (p == nullptr) {
throw std::runtime_error("Expected a non-null pointer");
}
}
auto get() const { p.get(); } // have to specialize this for plain pointer
auto operator *() const { return *p; }
auto operator ->() const { return p; }
operator Pointer&() const { return p; } // maybe explicit?
private:
mutable Pointer p;
};
We could also publicly inherit from Pointer
instead, to provide all the same methods as Ponter does (we have to be careful not to inherit from plain pointer).
Then we could use:
std::unique_ptr<LinOp> LinOp::generate(non_null<std::shared_ptr<const LinOp>> op);
void LinOp::apply(non_null<const LinOp *> b, non_null<LinOp *> x);
// currently don't have a function in Ginkgo that takes a unique pointer, so making it up:
void my_sink(non_null<std::unique_ptr<LinOp>> op);
I think it's pretty self documenting that all these expect a non-null pointer, maybe even more than by passing by reference, and we get a run-time check that the pointer is actually not null.
@gflegar wrote:
How do you handle non-null objects when you do transfer ownership?
Alas, there is no standard C++ library way of doing that at compile time. For doing this at run time, a class like your non_null
immediately came to mind.
I understand non_null
as alternative syntax for &
, to make the user experience consistent. This means it should be nonowning, which means it shouldn't hold a Pointer. I would prefer templating it on T
, just like shared_ptr
or unique_ptr
, rather than on the pointer type. It's a different kind of pointer. Behavior should be the same, regardless of the pointer type input.
I would prefer templating it on T, just like shared_ptr or unique_ptr, rather than on the pointer type. It's a different kind of pointer. Behavior should be the same, regardless of the pointer type input.
I'm not sure I understand. It shouldn't be a different kind of pointer. If it was, then
non_null<shared_ptr<T>>
would be a double pointer to T
, we don't want that, we want it to be a non-null pointer to T
. It's just a decorator around other pointers. We can achieve this either by composition or inheritance. Above is the composition version, here is the inheritance one:
template <typename Pointer, typename = void>
class non_null : public typename std::remove_cv<Pointer>::type {
public:
template <typename... Args>
non_null(Args...&& args) : Pointer(std::forward<Args>(args)...) {
if (*this == nullptr) {
throw std::runtime_error("Expected a non-null pointer");
}
}
};
// specialization for plain pointers (C++11, templates are uglier than in C++17)
template <typename Pointer>
class non_null<Pointer,
gko::void_t<typename std::enable_if<
std::is_pointer<Pointer>::value
>::type>> {
public:
using pointer = typename std::remove_cv<Pointer>::type;
using value_type = typename std::remove_pointer<Pointer>::type;
template <typename... Args>
non_null(Args...&& args): p(std::forward<Args>(args)...) {
if (p == nullptr) {
throw std::runtime_error("Expected a non-null pointer");
}
}
// implementations of get(), operator *() and operator ->()
}
private:
pointer p;
};
Both versions have similar interface, but I tend to like the inheritance one more. We do need a separate class specialization for plain pointers, but the version for smart pointers works like a charm, and propagates whatever interface Pointer
exposes.
There's of course a minor difference: a constant pointer (* const
) is expressed as non_null<const std::shared_ptr<T>>
in the first version, and as const non_null<std::shared_ptr<T>>
in the second version.
Alas, there is no standard C++ library way of doing that at compile time.
If you think about it, it's impossible to do at compile time anyway. You only know what the pointer points to at run time.
The void f(T&)
solution seems like it's doing it at copile time, but in fact, when you call it as f(*x)
, the *
operator is checking at runtime whether the pointer is null. References only delegate this to the "conversion" operator.
We could do the same here by not having the logic in the constructor (or just making it explicit), but instead have separate operators that convert Pointer
to non_null<Pointer>
and do the check. But that introduces an additional function to #30.
If you think about it, it's impossible to do at compile time anyway. You only know what the pointer points to at run time.
nullptr
is a constexpr, and new
(by default) must throw if allocation fails. Thus, in theory, anything that is not nullptr
and that comes from (default) new
, can be proven at compile time to be nonnull. In practice, though, you're right ;-) .
I favor use of references for nonnull input arguments, mainly because
- the function's signature explains the intent without further documentation ("must be nonnull"), and
- users who don't buy into your memory management wrapper can still use the function.
It's your library, though :-) . There's no serious performance issue here for us. We would end up using this library mostly for large sparse objects anyway.
How do you handle non-null objects when you do transfer ownership? The solution doesn't handle the case of non-null uniqe_ptr or shared_ptr, just a plain pointer.
I still don't see how references would help with this?
nullptr is a constexpr, and new (by default) must throw if allocation fails. Thus, in theory, anything that is not nullptr and that comes from (default) new, can be proven at compile time to be nonnull.
This is a good point. It's possible to add this compile-time checking to non_null
:
non_null(std::nullptr_t) = delete;
I must say I like keeping pointers. The decorator looks fairly good and easy enough to put in place without too many changes, in addition the last answer with the deleted constructor for std::nullptr_t
allows for compile time checking and errors which helps us.
It's maybe a personal preference or misunderstanding, but since all objects we use are dynamic memory I like to use pointers to represent them. Even though the memory is managed automatically, I think everyone (even people coming from other languages) get a good feeling for what happens behind them due to that. And mixing both uses is definitely clumsy.
@tcojean @hartwiganzt just want to remind you of this one. It's quite an important feature, which should not be that difficult to implement (there's a prototype implementation above), but constitutes quite a big interface change (true change, not an addition) from the user's perspective. This means that if we do not do it now, it has to wait until Ginkgo 2.0.0.
Somewhat related is also #179.