cpp-proposals-pub icon indicating copy to clipboard operation
cpp-proposals-pub copied to clipboard

mdarray: Add container_to_accessor customization point(s)

Open mhoemmen opened this issue 3 years ago • 8 comments

Problem: Conversion to mdspan assumes default_accessor

As of P1684R3, mdarray has two conversions to mdspan.

  • operator mdspan()
  • to_mdspan(const OtherAccessor&)

Both of these assume that default_accessor applies to the container, and that default_accessor is convertible to the returned mdspan type's accessor type. This is acceptable for the use case of viewing a contiguous container's memory. Users can view the mdarray in a particular context with a special accessor (that e.g., uses atomic_ref or restrict) by calling to_mdspan with an accessor of the desired type. However, this design has the following issues.

  1. It strips away any compile-time access restrictions that the container's operator[] might have.
  2. It only works if the result of the container's data() method is convertible to ElementType*, and represents an array of at least required_span_size() elements.

Regarding (1): An example is that the container might contain a GPU device allocation that is not accessible from normal host code. The container's operator[] might have annotations that make calling it from host code a compile-time error. Even if we have an mdspan "device_only_accessor" with the analogous feature, it's still possible with the current mdarray design to convert an mdarray to an mdspan with default_accessor. This would strip away the protection that the container offers.

Regarding (2): This is not the same syntactically as a "contiguous container," but is semantically close. It prevents use of containers to express things like remote memory allocations (in the PGAS sense: e.g., NVSHMEM or MPI 1-sided). We can express access to remote memory allocations and other non-raw-pointer-things with mdspan, but we can't with mdarray.

Solution: Container-to-accessor customization point(s)

Add to P1684 a container_to_accessor customization point that defines a custom mapping from container to accessor. mdarray's default mapping would use default_accessor, but users could override this for custom container types (i.e., container types not in the Standard Library). Any conversion of mdarray to mdspan would use this container-to-accessor mapping. operator mdspan() would use it to determine the container's accessor type, to see if it can be converted to the left-hand-side's accessor type. to_mdspan(const OtherAccessor& other) member function would use this in the same way (asking whether the container's accessor can be converted to other).

Container-to-accessor mapping is one to one

This design relies on a one-to-one container-to-accessor mapping. In particular, users would need a custom container to express "you can't access this memory on host"; they could not just use a custom allocator with std::vector. We think this is correct. For instance, users cannot change the compile-time behavior of std::vector::operator[] with a custom allocator.

Overloads for const and nonconst container reference

This is actually two customization points:

  • container_to_accessor(container_type&) for use when the mdarray is called in a nonconst context, and
  • container_to_accessor(const container_type&) for use when the mdarray is called in a const context.

The difference could matter for performance. For example, read-only access to remote memory could be cached without coherence. mdarray would use the appropriate overload based on context, so users would only need to define container_to_accessor(const container_type&) to cover both cases.

Customization point syntax

Here is our intent.

  1. Users may not override the container-to-accessor mapping for existing container types in the Standard Library.
  2. If users override the container-to-accessor mapping for a custom container type, then mdarray will use this overridden mapping. In that case, it is a compile-time error (Mandates) if the mapping is needed for nonconst mdarray, but the user only provides the mapping for const container_type&.
  3. Otherwise, if the container's .data() method is convertible to ElementType* (or const ElementType* if called in a const context), then the default container-to-accessor mapping is default_accessor<ElementType> (resp. default_accessor<const ElementType>).

This behaves like ranges::swap: a single function-ish thing for mdarray to call, that can dispatch either to user-defined behavior or default behavior. The question, then, is what syntax to use to express this. The phrase "customization point" strongly suggests [customization.point.object] (CPO), or at least the intent to inhibit argument-dependent lookup. WG21 hasn't given clear guidance yet on whether CPOs, tag_invoke, or some not-yet-ratified syntax is the approved way to implement customization points. It's not so important to us what syntax to use, as long as users can do something simple, like define functions container_to_accessor(container_type&) and container_to_accessor(const container_type&) for us to ADL-find.

Earlier versions of P1684 had contemplated a "ContainerPolicy" template parameter that would define both the container type, and its associated accessor type. This would prevent the need for a customization point. However, this design has several issues.

  1. It would complicate common use cases, for which the default container-to-accessor mapping suffices.
  2. It would make the name of the mdarray (with its template arguments) less transparent.
  3. It could give the incorrect impression that mdarray uses the accessor policy to access the container. mdarray always uses the container's operator[] directly; it does not access the container indirectly through an accessor.

An advantage of the customization point design is that if users never convert an mdarray to mdspan, then a container-to-accessor mapping need not even exist.

mhoemmen avatar Aug 09 '22 22:08 mhoemmen

Another advantage is that no implementation of the customization point is needed for the common case of container_type::data() returning element_type* .

crtrott avatar Aug 10 '22 06:08 crtrott

I'm curious what you think about the use case for when you have restrictions on your data access which may not have a one-to-one mapping to the underlying container.

Put in another way the one-to-one mapping says that all of the access properties of the container's data are encoded in it's type.

For argument's sake let's say the container is some wrapper around a shared_ptr like thing that doesn't encode the destructor as part of the type (so destructor could call free or cudaFree). As the programmer/at compile time we may know that that container is allocated on host or device memory, but the type doesn't say anything about it.

To me this seems like a case where we would like to "imbue additional semantics to what is otherwise an ordinary container".

Keryell & Falcou 2016 has some other examples where you may want to always access the data in a certain way that may not be encoded in a container's type. Some of these examples are dealt with using the LayoutPolicy.

Here is another possibly bad example (a type wrapper on top of mdarray might be better design). Let's say I have double data that lives in some coordinate system. The coordinate system does not have a one-to-one mapping with the underlying container type. But, the corresponding coordinate system should always be considered when trying to operate on the data since data in coordinate system A is not necessarily directly comparable to data in coordinate system B.

jacobmerson avatar Aug 11 '22 20:08 jacobmerson

@jacobmerson Excellent questions!

For argument's sake let's say the container is some wrapper around a shared_ptr like thing that doesn't encode the destructor as part of the type (so destructor could call free or cudaFree). As the programmer/at compile time we may know that that container is allocated on host or device memory, but the type doesn't say anything about it.

It feels like this would be better expressed through a container wrapper or "adapter" that encodes the memory space at compile time. In fact, I would argue that it's harder to write a container that "type-erases" the memory space, than a container that encodes the memory space in one or more types. Here is an example of a dynamically allocated array container that uses unique_ptr to store the array.

template<class T, class StaticAlloc, class StaticDealloc, class ExecutionPolicy = std::execution::sequential_policy>
class DynamicArray {
public:
  DynamicArray(std::unique_ptr<T, StaticDealloc>&& data, size_t the_size) : 
    data_(std::move(data)), size_(the_size)
  {
    std::uninitialized_value_construct_n(ExecutionPolicy{}, data_.get(), size_);
  }
  DynamicArray(const DynamicArray& rhs) : 
    data_(StaticAlloc{}(rhs.size_), StaticDealloc{}), size_(rhs.size_)
  {
    std::uninitialized_copy_n(ExecutionPolicy{}, rhs.get(), rhs.size_, data_.get());
  }
  DynamicArray& operator=(const DynamicArray& rhs) 
  {
    if(this != &rhs) {
      std::unique_ptr<T[], StaticDealloc> new_data(StaticAlloc{}(rhs.size_), StaticDealloc{});
      std::uninitialized_copy_n(ExecutionPolicy{}, rhs.get(), rhs.size_, new_data.get());
      std::swap(data_, new_data);
      size_ = rhs.size_;
    }
  }
  DynamicArray(DynamicArray&& rhs) : 
    data_(std::move(rhs.data_)), size_(rhs.size_)
  {}
  DynamicArray& operator=(DynamicArray&& rhs)
  {
    if(this != &rhs) {
      data_ = std::move(rhs.data_);
      size_ = rhs.size_;
    }
  }
  ~DynamicArray() = default;

  size_t size() const { return size_; }
  const T* data() const { return data_.get(); }
  T* data() { return data_.get(); }
  const T& operator[](size_t k) const { return data_[k]; }
  T& operator[](size_t k) { return data_[k]; }

private:
  std::unique_ptr<T[], StaticDealloc> data_;
  size_t size_ = 0;
};

I can "type-erase" this, for example by exploiting C++23 static operator() (see P1169).

struct A {
  static std::function<T*(size_t)> f_;
  static T* operator()(size_t N) { return f_(N); }
};
struct D {
  static std::function<T*(size_t)> f_;
  static T* operator()(size_t N) { return f_(N); }
};
A::f_ = &my_malloc;
D::f_ = &my_free;
DynamicArray<T, A, D> a(std::unique_ptr<int[], D>(A(42)));

Why would I do this, though? It's actually harder to type-erase the allocation and deallocation functions, than it is just to encode them in the type. C++ gives the right design cues: both unique_ptr and vector encode the (de)allocator into the type as a template parameter.

If somebody gave me a hypothetical TypeErasedDynamicArray, I would use it in mdarray by wrapping it in a way that re-encodes the allocator. That would again make the mapping from container to accessor one to one.

Let's say I have double data that lives in some coordinate system. The coordinate system does not have a one-to-one mapping with the underlying container type.

It feels like this would be better expressed through the type of the elements (the data), rather than through a one-to-many mapping of container type to accessor types. Containers of incompatible element types should have different container types.

mhoemmen avatar Aug 11 '22 20:08 mhoemmen

I had a discussion about this with colleagues today. There are three ways one could resolve this issue.

  1. container-to-accessor mapping customization point (originally proposed here)
  2. Template mdarray on accessor, and actually use accessor for access
  3. Template mdarray on accessor, and only use accessor for conversion to mdspan

We've discussed (1) above. Here are some arguments for either (2) or (3).

a. It would permit use of different accessors with the same container, analogously to how mdspan can have different accessors (e.g., aligned or not) with the same data_handle_type. This would solve the problem of containers with "type-erased allocation" (see the above discussion). b. A separate customization point would add complexity. Making the accessor a template parameter of mdarray would make it more obvious how to customize conversion to mdspan.

The main issue with (2) is that the container already has operator[] (it's a container!), so the accessor is superfluous. However, supplying an accessor that does the following:

return accessor_.access(std::advance(container_.begin(), offset));

would solve the problem of the definition of "contiguous container" (which is defined by the properties of its iterators, not by having a .data() method). It would also work for containers without operator[] (even std::list, for example). It would increase the inlining depth, but implementations could skip over the accessor in the default case, via the as-if rule. The mdarray would need to store the accessor, but could use [[no_unique_address]] for accessors without state.

There are a few issues with (3). First, users might be surprised that their Accessor template argument is only used for conversion to mdspan, not actually for mdarray::operator[]. Second, the argument would never be used if the mdarray is never converted to mdspan.

Here are some other approaches.

  • Make mdarray a container, instead of a container adapter. Its Container template parameter could then be replaced with Allocator and (an optional) Accessor.
  • Forget this whole thing. If you want a container with custom accessor, write your own.

That last option is attractive, because we've already standardized mdspan as the view vocabulary type. Different containers can interact through this common view type. Thus, it's not so bad if container types proliferate. Users of "small" container types may want "small container features" for the rank-1 and rank-2 cases, like swizzling and overloaded arithmetic operators. If we just let them have their existing classes, we won't have to try to satisfy all possible use cases.

mhoemmen avatar Aug 12 '22 00:08 mhoemmen

Thanks for the writeup and very detailed response. I agree on pretty much all of the points you bring up.

Like you say, the type erased allocator/deallocator case seems a bit far fetched that someone would go through the trouble. I haven't seen this in the wild.

I like the symmetry to mdspan that option 2 provides but like you say this adds much complexity and may have a performance impact. Option 3 seems like a tight coupling of mdarray to mdspan and doesn't seem great.

Option 1 (proposed here) seems reasonable if there is a high degree of certainty that existing containers types are enough to encode (with the 1-to-1 map) all access properties that would be wanted when viewing the data. Or that it seems reasonable to create new container types for any that don't currently have that 1-to-1 map.

Keep up the great work! I'm super excited about having reasonable multidimensional array support in C++.

jacobmerson avatar Aug 12 '22 04:08 jacobmerson

@jacobmerson Thanks for the feedback! : - ) An important thing to remember is that mdspan can view the elements of any multidimensional container type we could imagine. mdspan is the common vocabulary type. Thus, it's not so bad if mdarray doesn't satisfy all possible use cases. We'll have to trade off between convenience, simplicity, and generality.

mhoemmen avatar Aug 12 '22 14:08 mhoemmen

My thinking always was that the vast majority of accesses happen via mdspan not mdarray. You use mdarray to create the original data, but all your functions actually take mdspan. I don't see real value in accessor_policy on mdarray because its immutable, and accessor_policy was always meant as the "adapt the way you access existing data stuff". If we give mdarray both accessor_policy and container, then we need to make sure these things actually are meant to go with each other. For example the initial motivator to add this "I by default will return a custom policy" was to NOT type-erase access restrictions of the underlying allocator with respect to device/host. But that means there anyway is already a tight coupling of accessor policy and container type. So having both as template parameters introduces non-orthogonal things, you need to make them fit with each other in potentially subtle ways. Not having container customizable is kinda shitty since we want to support at a minimum both the "std::array" and "std::vector" like underlying storage. For all these reasons I am either for option 1, or for keeping the status quo: i.e. you must be able to get a default_accessor out of it, and you can request another accessor, but it goes via the intermediate step of default_accessor ...

crtrott avatar Aug 12 '22 17:08 crtrott

@crtrott wrote:

... there anyway is already a tight coupling of accessor policy and container type. So having both as template parameters introduces non-orthogonal things, you need to make them fit with each other in potentially subtle ways.

I agree -- this is a critical point.

As long as mdarray is a container adapter, I would argue for using the container's operator[] to access its elements. Containers come with two access methods.

  1. The container's operator[] (where access would do return container_[offset])
  2. Iterators (or pointers, a special case) (where access would do *std::advance(container_.begin(), offset))

Iterators (if they are at least forward iterators) form a view, so they would be more suited for mdspan. (By comparison, the Standard Algorithms operate on iterators, not containers.) Using iterators to access the mdarray's elements would suggest making mdarray a "container plus an mdspan that views it" (a design we had considered). However, that would pessimize the small container case. The main use cases for mdarray function parameters are small mdarrays.

mhoemmen avatar Aug 14 '22 05:08 mhoemmen