Flakiness with `clEnqueueCopyImage`
I am seeing flaky failures on several platforms (ARM and a few Intel GPUs) when running some of the CTS copy_images tests.
The failures are always caused by incorrect data in the output buffer.
I am wondering whether we are missing something in `cvk_command_image_image_copy::build_batchable_inner`:
cl_int cvk_command_image_image_copy::build_batchable_inner(
    cvk_command_buffer& cmdbuf) {
    VkImageSubresourceLayers srcSubresource =
        prepare_subresource(m_src_image, m_src_origin, m_region);
    VkOffset3D srcOffset = prepare_offset(m_src_image, m_src_origin);
    VkImageSubresourceLayers dstSubresource =
        prepare_subresource(m_dst_image, m_dst_origin, m_region);
    VkOffset3D dstOffset = prepare_offset(m_dst_image, m_dst_origin);
    VkExtent3D extent = prepare_extent(m_src_image, m_region);
    VkImageCopy region = {srcSubresource, srcOffset, dstSubresource, dstOffset,
                          extent};
    vkCmdCopyImage(cmdbuf, m_src_image->vulkan_image(), VK_IMAGE_LAYOUT_GENERAL,
                   m_dst_image->vulkan_image(), VK_IMAGE_LAYOUT_GENERAL, 1,
                   &region);
    return CL_SUCCESS;
}
The other copy commands (image to buffer, buffer to image, image init) each issue a `vkCmdPipelineBarrier`. Should we have one here as well?
Adding the following barrier after the copy did not fix the issue:
    VkMemoryBarrier memoryBarrier = {
        VK_STRUCTURE_TYPE_MEMORY_BARRIER, nullptr, VK_ACCESS_TRANSFER_WRITE_BIT,
        VK_ACCESS_MEMORY_WRITE_BIT | VK_ACCESS_MEMORY_READ_BIT};
    vkCmdPipelineBarrier(cmdbuf, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
                         VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, 0, 1,
                         &memoryBarrier, 0, nullptr, 0, nullptr);
Right, the plot thickens. How reproducible is it?
It fails often enough to be easily reproducible. Depending on the test and the GPU, the failure rate ranges from 10% to more than 50% of runs.
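To put a firmer number on the rate for a given test and GPU, something like the following could be used. It is only a minimal sketch: `failure_rate` and the `run_once` callable are hypothetical names, not part of clvk or the CTS, and `run_once` is assumed to wrap a single test run and return true when it passes.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical helper (not part of clvk or the CTS): calls `run_once`
// `iterations` times and returns the fraction of runs that failed.
// Assumption: `run_once` returns true when a single test run passed.
double failure_rate(const std::function<bool()>& run_once,
                    std::size_t iterations) {
    if (iterations == 0) {
        return 0.0; // nothing was run, so report no failures
    }
    std::size_t failures = 0;
    for (std::size_t i = 0; i < iterations; ++i) {
        if (!run_once()) {
            ++failures;
        }
    }
    return static_cast<double>(failures) / static_cast<double>(iterations);
}
```

Running this with a few hundred iterations before and after a candidate change (such as an added barrier) should make it easier to tell whether the change actually moved the failure rate or whether a handful of passing runs was just luck.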