Are alpaka objects copy/moveable?

Open bernhardmgruber opened this issue 2 years ago • 9 comments

During a PR review (#1564) I saw that a change might affect whether the OpenACC device class is copy/movable, with @jkelling stating that there might not be any tests for that. There is a claim in the documentation that buffer objects are reference counted, but I could not find anything about other kinds of objects (platforms, devices, queues, etc.).

Question: are all alpaka objects supposed to be reference counted and freely copy/movable?

Independent of the answer, we should document that more clearly and add tests for it.

bernhardmgruber, Jan 11 '22 19:01

I would like to clarify (because I always misunderstand when @bernhardmgruber puts it like he did here): The referenced issue affects copy/move assignment only, not copy/move in general. I do not know if copy/move is tested somewhere, but assignment of at least Dev* types is clearly not.
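The difference between construction and assignment can be checked at compile time with standard type traits. The sketch below is only an illustration: `DevWithConstMember` and `DevRegular` are hypothetical stand-ins, not alpaka types, and a const data member is just one common way to lose assignment by accident. It is not claimed to be the cause in the referenced PR.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a device handle; NOT an alpaka type.
struct DevWithConstMember {
    explicit DevWithConstMember(int idx) : index(idx) {}
    int const index; // const member: implicit copy/move assignment are deleted
};

// Same handle without the const member: all four operations are available.
struct DevRegular {
    explicit DevRegular(int idx) : index(idx) {}
    int index;
};

// Compile-time check that a type supports copy/move construction and assignment.
template <typename T>
constexpr bool is_freely_copy_movable =
    std::is_copy_constructible_v<T> && std::is_move_constructible_v<T> &&
    std::is_copy_assignable_v<T> && std::is_move_assignable_v<T>;

// Construction still works with the const member, assignment does not.
static_assert(std::is_copy_constructible_v<DevWithConstMember>);
static_assert(std::is_move_constructible_v<DevWithConstMember>);
static_assert(!std::is_copy_assignable_v<DevWithConstMember>);
static_assert(!std::is_move_assignable_v<DevWithConstMember>);
static_assert(is_freely_copy_movable<DevRegular>);
```

A test suite could apply a check like `is_freely_copy_movable` to every user-facing type once the intended semantics are settled.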

jkelling, Jan 11 '22 20:01

I don't see a reason for our objects to not be move-assignable. The semantics are clear here IMO.

Copy-assignable is a different matter. We should probably take the time and define the behaviour clearly if we haven't done this so far. For a device and a platform the behaviour is IMO clear, but what about a queue? Does a copy create a new queue with the same properties or just refer to the old queue?

j-stephan, Jan 12 '22 14:01

So far I understood or assumed that copying a queue, event, buffer, etc. gives an object that references the same queue, event, buffer, etc. as the original one.

fwyzard, Jan 13 '22 17:01

My initial goal was that the objects on the public interface are basically handles to the underlying resource owned by the lib/framework. Internally they hold reference-counted implementations or, in the case of CUDA, sometimes only handles to the CUDA objects. In the end this means that all alpaka objects passed out to the user can be copy/move-constructed/assigned by doing a plain copy. As a result of a copy, no new resource is created. The ref-counted impls should already be non-copyable. At least this once was the intention.
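A minimal sketch of such handle semantics, assuming a `std::shared_ptr` to a non-copyable implementation. `QueueImpl`, `QueueHandle` and `same_resource` are hypothetical names chosen for illustration and do not reflect alpaka's actual classes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>

// Hypothetical resource implementation; NOT alpaka's actual class.
// It is non-copyable, mirroring the intent described above.
class QueueImpl {
public:
    QueueImpl() = default;
    QueueImpl(QueueImpl const&) = delete;
    QueueImpl& operator=(QueueImpl const&) = delete;

    void enqueue() { ++m_pending; }
    std::size_t pending() const { return m_pending; }

private:
    std::size_t m_pending = 0;
};

// Hypothetical user-facing handle: copying it copies only the shared_ptr,
// so no new resource is created.
class QueueHandle {
public:
    QueueHandle() : m_impl(std::make_shared<QueueImpl>()) {}

    void enqueue() { m_impl->enqueue(); }
    std::size_t pending() const { return m_impl->pending(); }

    // True if both handles refer to the same underlying resource.
    bool same_resource(QueueHandle const& other) const { return m_impl == other.m_impl; }

private:
    std::shared_ptr<QueueImpl> m_impl;
};

static_assert(!std::is_copy_constructible_v<QueueImpl>);
static_assert(std::is_copy_constructible_v<QueueHandle> && std::is_copy_assignable_v<QueueHandle>);
static_assert(std::is_move_constructible_v<QueueHandle> && std::is_move_assignable_v<QueueHandle>);
```

With this layout, work enqueued through a copy is visible through the original handle, because both share one `QueueImpl`.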

We could also switch from these handle semantics to value semantics, where copies would in most cases be forbidden (queues, buffers, devices, ...). In this case, we would have to decide whether we:

  1. still give shared_ptrs to the user, which would be very similar to the current semantics but makes it much more explicit that the user gets handles
  2. or directly give the plain objects to the user, who would then be responsible for moving them into their preferred memory management tool or storing them as plain objects
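The two options could look roughly like this. It is a sketch under assumptions: `UniqueQueue`, `make_shared_queue` and `make_queue` are hypothetical names, not alpaka API.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical move-only queue with value semantics; NOT an alpaka type.
class UniqueQueue {
public:
    explicit UniqueQueue(int id) : m_id(id) {}
    UniqueQueue(UniqueQueue const&) = delete;
    UniqueQueue& operator=(UniqueQueue const&) = delete;
    UniqueQueue(UniqueQueue&&) noexcept = default;
    UniqueQueue& operator=(UniqueQueue&&) noexcept = default;

    int id() const { return m_id; }

private:
    int m_id;
};

static_assert(!std::is_copy_constructible_v<UniqueQueue>);
static_assert(std::is_move_constructible_v<UniqueQueue>);

// Option 1: the library hands out a shared_ptr, so sharing is explicit.
inline std::shared_ptr<UniqueQueue> make_shared_queue(int id) {
    return std::make_shared<UniqueQueue>(id);
}

// Option 2: the library hands out the plain object; the caller decides how to store it.
inline UniqueQueue make_queue(int id) {
    return UniqueQueue{id};
}
```

Under option 2 a caller could, for example, move the returned object into a `std::unique_ptr` or keep it as a plain member.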

BenjaminW3, Jan 23 '22 10:01

I prefer to stay with the originally intended way of shallow copies. We "just" need to document this better. As part of our documentation rework we could maybe come up with something like an alpaka specification where we clearly define these fundamentals; right now this is more like an oral tradition.

j-stephan, Jan 25 '22 11:01

A related topic that I found unclear is which interfaces are (supposed to be) const. For example, is submitting work to a queue a const operation for the queue object?

fwyzard, Jan 25 '22 11:01

Off the top of my head, const methods should be:

  • side-effect free - should not lead to any externally visible state change
  • atomic or internally synchronized - should be safe to be called from multiple threads in parallel
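As an illustration of these two criteria, here is a minimal sketch using a hypothetical `CountingQueue` (not an alpaka type). The const accessor changes no visible state and takes a lock, while the operation that changes the queue length is non-const.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical queue used only to illustrate the two criteria; NOT an alpaka type.
class CountingQueue {
public:
    // Non-const: changes externally visible state (the queue length).
    void enqueue() {
        std::lock_guard<std::mutex> lock(m_mutex);
        ++m_pending;
    }

    // Const: no visible state change, and internally synchronized so that
    // concurrent calls from several threads are safe.
    std::size_t pending() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_pending;
    }

private:
    mutable std::mutex m_mutex; // mutable so that const members can lock it
    std::size_t m_pending = 0;
};
```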

> For example, is submitting work to a queue a const operation for the queue object?

Given my first point, I would say no, because you can most likely see that the queue length changed.

BenjaminW3, Jan 25 '22 19:01

OK, I agree that this definition makes sense.

So, this correctly fails to compile:

template <typename TQueue, typename TBuf>
void test(TQueue const& queue, TBuf const& buf) {
  alpaka::memset(queue, buf, 0x00);
}

However, alpaka reference-counted objects can easily work around their const-ness, and it's pretty easy to write functions that take const arguments and do non-const operations:

template <typename TQueue, typename TBuf>
void test(TQueue const& queue, TBuf const& buf) {
  auto non_const_queue = queue;
  auto non_const_buf = buf;
  alpaka::memset(non_const_queue, non_const_buf, 0x00);
}
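The same effect can be reproduced without alpaka. The sketch below assumes a hypothetical reference-counted handle `SharedBuf` built on `std::shared_ptr`; it only illustrates that constness of such a handle is shallow.

```cpp
#include <cassert>
#include <memory>

// Hypothetical reference-counted buffer handle; NOT alpaka's buffer type.
class SharedBuf {
public:
    explicit SharedBuf(int value) : m_data(std::make_shared<int>(value)) {}

    void fill(int value) { *m_data = value; } // non-const: modifies the shared data
    int value() const { return *m_data; }

private:
    std::shared_ptr<int> m_data;
};

// Takes the handle by const reference, yet still modifies the shared data
// by going through a non-const copy of the handle.
inline void clear_through_copy(SharedBuf const& buf) {
    // buf.fill(0);       // would not compile: fill() is non-const
    SharedBuf copy = buf; // shallow copy, shares m_data with buf
    copy.fill(0);
}
```

After `clear_through_copy(buf)` returns, the caller's handle observes the new value even if it was declared const.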

fwyzard, Jan 26 '22 06:01

> However, alpaka reference-counted objects can easily work around their const-ness, and it's pretty easy to write functions that take const arguments and do non-const operations:

Good point. This is inevitable with the implicitly shared semantics, unless one is not allowed to copy, contradicting the initial premise.

This appears to be an argument for making all user-facing member functions of implicitly shared objects const. That never requires a const_cast or similar, as they hold a pointer to a non-const shared object. One could additionally implement a const view for each such type that implements only the parts of the interface that do not modify the shared object.
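A rough sketch of that idea, with hypothetical names (`SharedQueueState`, `SharedQueue`, `ConstQueueView`) that are not part of alpaka. Thread safety is deliberately left out to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical shared state; NOT an alpaka type.
struct SharedQueueState {
    std::size_t pending = 0;
};

// Handle whose user-facing member functions are all const. No const_cast is
// needed, because a const shared_ptr<T> still points to a non-const T.
class SharedQueue {
public:
    SharedQueue() : m_state(std::make_shared<SharedQueueState>()) {}

    void enqueue() const { ++m_state->pending; }
    std::size_t pending() const { return m_state->pending; }

private:
    friend class ConstQueueView;
    std::shared_ptr<SharedQueueState> m_state;
};

// Read-only view: exposes only the operations that do not modify the shared state.
class ConstQueueView {
public:
    explicit ConstQueueView(SharedQueue const& queue) : m_state(queue.m_state) {}

    std::size_t pending() const { return m_state->pending; }

private:
    std::shared_ptr<SharedQueueState const> m_state; // pointer to const state
};
```

A function that should only observe the queue would then take a `ConstQueueView`, which has no `enqueue()` to call, whether directly or through a copy.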

jkelling, Jan 26 '22 10:01