[Doc] Adding known limitation for std::vector<bool>
std::vector<bool> has an optimization to a bitfield rather than individual booleans. For host policies, this will simply not work for parallel algorithms (without some extra processing step / copy, which will kill performance).
On icpx and with device policies it works:
- With the default allocator,
std::vector<bool>::iteratorworks as input to algorithms using a device policy, but only because of the copy to initialize thesycl::buffer. - With
usm_allocator,std::vector<bool>::iteratoralso works, but only because it is treated as if it were a host_iterator, and copied into asycl::buffer. After #1438, this should still work by treating it as a host_iterator and copying to asycl::buffer, because the implementation does not distinguish iterator types based on allocator.
Since this depends on the implementation details of std::vector<bool>, it is safest to just label it undefined behavior, even if it happens to currently work for device policies.
I have tried to find out if the standard says anything about using parallel algorithms with vector<bool>. There is nothing directly said in this regard, and the only indirect restriction I found is that vector<bool> is not required to avoid data races (see [container.requirements.dataraces]) while using parallel execution policies requires the caller to avoid data races (see [algorithms.parallel.exec]). So the standard prohibits modifying vector<bool> in parallel algorithms, but seemingly does not prevent using it for read-only input data.
So, instead of calling it a known limitation, I would add some note to the subsection "Difference with Standard C++ Parallel Algorithms". Also it makes sense to clarify in https://github.com/oneapi-src/oneDPL/edit/main/documentation/library_guide/parallel_api/pass_data_algorithms.rst (subsections "Use Unified Shared Memory" and "Use Host-Side std::vector") that vector<bool> will not work.
In the future, we can perhaps consider checking where our implementation breaks with vector<bool> and see if it can be improved.
I have tried to find out if the standard says anything about using parallel algorithms with
vector<bool>. There is nothing directly said in this regard, and the only indirect restriction I found is thatvector<bool>is not required to avoid data races (see [container.requirements.dataraces]) while using parallel execution policies requires the caller to avoid data races (see [algorithms.parallel.exec]). So the standard prohibits modifyingvector<bool>in parallel algorithms, but seemingly does not prevent using it for read-only input data.
Thanks for these pointers to the specification, this makes sense. I haven't done testing to confirm this, but I believe our implementation likely then complies to the standard.
So, instead of calling it a known limitation, I would add some note to the subsection "Difference with Standard C++ Parallel Algorithms". Also it makes sense to clarify in https://github.com/oneapi-src/oneDPL/edit/main/documentation/library_guide/parallel_api/pass_data_algorithms.rst (subsections "Use Unified Shared Memory" and "Use Host-Side std::vector") that
vector<bool>will not work.
With the spec sections you point to, I believe it is neither a "known limitation" nor a "difference from the standard". We can do some testing to confirm this though.
Despite this, I do I think it would still be helpful to users to draw attention to the fact that std::vector<bool>::iterator should not be used as output data in oneDPL (at least for for C++ Standard Execution Policies) as it is not protected against dataraces, as that may not be clear to those who are less familiar with the details of std::vector<bool>.
I think it probably doesn't belong on the page you mention though, because that is specific to device execution policies, where std::vector<bool>::iterator will actually work both as input and output due to the copy into a sycl::buffer.
I will try to find a better place to put such a note, and update this PR.
In the future, we can perhaps consider checking where our implementation breaks with
vector<bool>and see if it can be improved.
Agreed.
@akukanov I've updated the PR to instead add to the input data page of the guide, with a note about std::vector<bool> (and remove it as a known limitation). Lets wait to merge until I can confirm some things through testing about host-side support of std::vector<bool> as read-only input data.