LearnWebGPU icon indicating copy to clipboard operation
LearnWebGPU copied to clipboard

Buffer overflow/alignment issues in step 34

Open petar-andrejic opened this issue 6 months ago • 1 comments

Thanks for the nice tutorial, it's been very easy to follow. Just a small issue I noticed so far regarding buffer alignment:

Step 34 has a potential buffer overflow if the size of the index buffer is not already a multiple of four. Specifically, if the size ends up being padded, writing to the buffer will read in adjacent memory to indexData without any warning, as it is now strictly larger in size. It would probably be best to handle this properly by padding the CPU side of things as well, and warn readers that this is necessary.

Minimal working example of the issue, using dawn:

#include <iostream>

#include <webgpu/webgpu_cpp.h>

int main() {
    wgpu::InstanceDescriptor instance_desc{
        .features{
            .timedWaitAnyEnable = true,
        },
    };
    wgpu::Instance instance = wgpu::CreateInstance(&instance_desc);

    wgpu::RequestAdapterOptions adapter_opts{};
    wgpu::Adapter adapter;
    wgpu::Future requestAdapterFuture = instance.RequestAdapter(
        &adapter_opts, wgpu::CallbackMode::WaitAnyOnly,
        [&](wgpu::RequestAdapterStatus, wgpu::Adapter _adapter, char const*) {
            adapter = _adapter;
        });
    instance.WaitAny(requestAdapterFuture, UINT64_MAX);

    wgpu::Device device;
    wgpu::DeviceDescriptor device_desc{};
    wgpu::Future requestDeviceFuture = adapter.RequestDevice(
        &device_desc, wgpu::CallbackMode::WaitAnyOnly,
        [&](wgpu::RequestDeviceStatus, wgpu::Device _device, char const*) {
            device = _device;
        });
    instance.WaitAny(requestDeviceFuture, UINT64_MAX);

    wgpu::Queue queue = device.GetQueue();

    constexpr size_t itemSize = 3;

    wgpu::BufferDescriptor buffer_desc{
        .usage = wgpu::BufferUsage::CopyDst | wgpu::BufferUsage::MapRead,
        .size = itemSize * sizeof(uint16_t),
        .mappedAtCreation = false,
    };
    buffer_desc.size = (buffer_desc.size + 3) & ~3;
    assert(buffer_desc.size == 8);
    wgpu::Buffer buffer = device.CreateBuffer(&buffer_desc);

    struct SomeData {
        uint16_t cpuData[3] = {0, 1, 2};
        uint16_t someOtherData[5] = {3, 4, 5, 6, 7};
    };

    SomeData sd;

    queue.WriteBuffer(buffer, 0, sd.cpuData, buffer.GetSize());
    wgpu::Future mapFuture =
        buffer.MapAsync(wgpu::MapMode::Read, 0, buffer.GetSize(),
                        wgpu::CallbackMode::WaitAnyOnly,
                        [](wgpu::MapAsyncStatus status, char const* msg) {
                            if (status != wgpu::MapAsyncStatus::Success) {
                                std::terminate();
                            }
                        });
    instance.WaitAny(mapFuture, UINT64_MAX);
    uint16_t const* result =
        static_cast<uint16_t const*>(buffer.GetConstMappedRange());

    for (size_t i = 0; i < itemSize + 1; i++) {
        std::cout << "result[" << i << "] = " << size_t(result[i]) << '\n';
    }
}

The output is

result[0] = 0
result[1] = 1
result[2] = 2
result[3] = 3

since the buffer overshot the extent SomeData.cpuData and ended up reading data from SomeData.someOtherData as well.

petar-andrejic avatar Aug 18 '24 09:08 petar-andrejic