oneDPL icon indicating copy to clipboard operation
oneDPL copied to clipboard

Removing reliance on copy assignment iterator for zip_iterator::operator+=

Open danhoeflinger opened this issue 1 year ago • 6 comments

zip_iterator::operator+= currently relies on the copy assignment operator of its zipped iterator elements.
This PR shifts that to instead rely upon the same operator+= of the zipped iterator elements as described by the specification:

"The arithmetic operators of zip_iterator update the source iterators of a zip_iterator instance as though the operation were applied to each of these iterators."

In order to maintain support for wrapping a sycl_iterator with a zip_iterator, this PR also adds the following operators to sycl_iterator: ++, --, ++(int), --(int), +=, -=, >, <=, >= . At least += is required to be able to use sycl_iterator within a zip_iterator with the operator+=. It makes sense to add all of these operators together.

The PR also augments the description of the "unspecified type" in the documentation to reflect its operator support. The intention would be to also adjust the oneDPL specification accordingly for this "unspecified type".


Additional context / motivation: This change is important because we are considering changes in the future which may impact the copy-assignability of some transform_iterators.

If we remove transform_iterator custom copy assignment operator in the future, it will rely on implicitly defined copy assignment (which will be deleted for c++17 lambdas). This means that if we make that future change, an iterator of type zip_iterator<transform_iterator<it,lambda>> would result in a build error if its operator+= was used due to this lack of copy assignment operator for the transform_iterator<it,lambda>. This PR removes that potential build error.

danhoeflinger avatar Mar 07 '24 18:03 danhoeflinger

These additions LGTM. Do we need a documentation update as well listing the newly supported operators in sycl_iterator?

mmichel11 avatar Mar 31 '24 15:03 mmichel11

These additions LGTM. Do we need a documentation update as well listing the newly supported operators in sycl_iterator?

Yes, we can update the guide with this on the supported operations provided by the "unspecified type".

danhoeflinger avatar Apr 01 '24 02:04 danhoeflinger

@mmichel11 The operator support of the "unspecified type" is actually defined by the specification, so if we are updating the guide, we also need to look at updating the specification.

Also according to the oneDPL specification: "The arithmetic operators of zip_iterator update the source iterators of a zip_iterator instance as though the operation were applied to each of these iterators."

Prior to this PR, our implementation was not following the spec, because instead of using operator+= of the source iterators to implement zip_iterator::operator+=, we are using copy assignment and operator+(difference_type). My guess that this was done to accommodate sycl_iterator as a source iterator, as zip_iterator is used within oneDPL for some implementations (and can therefore wrap a "unspecified type").

I propose we update both the specification and guide to add support by the "unspecified type" of the following operators: ++, --, ++(int), --(int), +=, -=, <, >, <=, >=.

It seems that in practice we already support <, even though it is not in the specification.

Tagging @akukanov @rarutyun for their thoughts on this.

danhoeflinger avatar Apr 01 '24 18:04 danhoeflinger

I looked through the implementations of the new operators and they seem good to me.

Since in practice the sycl_iterator class provided operator< without it being mentioned in the spec, is it necessary to update the spec to include the new operators introduced by this PR? That is, would there be a problem with extending the functionality of the "undefined type" without explicitly requiring those operators to be present in the spec?

Perhaps the line you found in the spec ("The arithmetic operators of zip_iterator update the source iterators of a zip_iterator instance as though the operation were applied to each of these iterators") implies that the "undefined type" has at least the same operators as zip_iterator. Or maybe it is meant to mean that if the source iterator has an arithmetic operator defined (e.g., operator+=), then the zip_iterator will call the source iterator's arithmetic operator. It is a little unclear to me.

adamfidel avatar Apr 01 '24 20:04 adamfidel

@adamfidel Thanks for taking a look.

Since in practice the sycl_iterator class provided operator< without it being mentioned in the spec, is it necessary to update the spec to include the new operators introduced by this PR? That is, would there be a problem with extending the functionality of the "undefined type" without explicitly requiring those operators to be present in the spec?

Its a good question... and I had a similar thought. The reason I think we should add it to the specification is because we depend on that functionality being there for the "unspecified type" by the combination of (1) using zip_iterator to wrap input data in our implementation for some algorithms in oneDPL, and (2) (after the PR) using those operators for source iterators of a zip_iterator.

I'm not sure its necessarily required, because its unlikely that someone would mix part of this implementation of the oneDPL specification with another implementation of oneDPL.

Perhaps the line you found in the spec ("The arithmetic operators of zip_iterator update the source iterators of a zip_iterator instance as though the operation were applied to each of these iterators") implies that the "undefined type" has at least the same operators as zip_iterator. Or maybe it is meant to mean that if the source iterator has an arithmetic operator defined (e.g., operator+=), then the zip_iterator will call the source iterator's arithmetic operator. It is a little unclear to me.

The line I quote from the spec about zip_iterator probably should not be used to imply anything about the "unspecified type". The spec doesn't mention that they may be nested, or that oneDPL accepts data which nests a oneapi::dpl::begin(sycl_buf) within a zip_iterator. However, since we do choose to utilize zip_iterator in our implementation around input data, we impose this extra requirement on the "unspecified type" beyond the specification.

One option I did consider was using a constexpr check to take the best route to the implementation: if operator+= is available, use that, otherwise try to use copy assignment and operator+(difference_type) . With the specification as it is, it would be preferable to just pass the operators through always in my opinion. However, that is still an option, and it could be a way to avoid changing the specification.

danhoeflinger avatar Apr 01 '24 21:04 danhoeflinger

To summarize an offline conversation about this PR: The oneDPL spec is quite sparse in its language defining custom iterators like zip_iterator, and about what requirements exist on the iterators they are composed of. This needs to be improved. With such improvement, and a more established relationship between operators of the base iterator(s) of our custom iterators, the existing implementation is likely valid. Also, it provides us more options for support of this functionality via multiple avenues based on what is available to us in the base type.

Additionally, we have decided that it is unlikely we will fully remove the custom copy assignment operator of transform_iterator in the immediate future. This reduces the immediate motivation for supporting the functionality without copy assignment.

For now, I'm marking this as a draft pending discussions and improvements to the specification. It is likely preferable to improve the specification for custom iterator requirements, as opposed to adding operators to our "unspecified type".

danhoeflinger avatar Apr 12 '24 15:04 danhoeflinger

With

  1. The changes to transform_iterator in #1445, where we decided to keep copy assignment even when functors are not copy assignable (by skipping the assignment of the functor),
  2. Further discussion of the return type of oneapi::dpl::begin/end, and what we want it to provide (a minimal set of functionality) I am closing this PR.

Technically, we are still deviating from the specification for zip_iterator here, in that we are using + operator and copy assignment rather than += operator, when += is called on a zip_iterator.

We could instead check if += is available in the base types, and use it when it is available, only falling back to this implementation when it is not to accommodate types like sycl_iterator. This would still deviate from the specification, but in a more palatable way.

danhoeflinger avatar Jun 02 '25 14:06 danhoeflinger