rtp-llm
Buffer overflow at CudaAttentionOpTest::selfAttentionOpTest
Definition: std::vector<void*> block_pointers(batch_size * 2 * maxBlocksPerSeq, nullptr);
... auto kv_cache = device_->allocateBuffer( {DataType::TYPE_UINT64, {(size_t)batch_size, maxBlocksPerSeq}, AllocationType::HOST}, {});
Copy size larger than dst size: std::memcpy(kv_cache->data(), block_pointers.data(), block_pointers.size() * sizeof(void*));
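Source size: block_pointers.size() * sizeof(void*) = batch_size * 2 * maxBlocksPerSeq * sizeof(void*)
Destination size: kv_cache size = batch_size * maxBlocksPerSeq * sizeof(unsigned long)
Since TYPE_UINT64 elements are 8 bytes, on a platform with 8-byte pointers the copy writes twice as many bytes as kv_cache holds.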
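The size mismatch can be reproduced and guarded against outside the test. The following is a minimal sketch, not rtp-llm code: `checkedCopy`, `srcBytes`, and `dstBytes` are hypothetical helpers introduced here for illustration, and plain byte counts stand in for the buffer returned by `allocateBuffer`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of rtp-llm): copy only if the source fits
// into the destination, otherwise throw instead of overflowing.
inline void checkedCopy(void* dst, std::size_t dst_bytes,
                        const void* src, std::size_t src_bytes) {
    assert(dst != nullptr && src != nullptr);
    if (src_bytes > dst_bytes) {
        throw std::length_error("copy size larger than destination size");
    }
    std::memcpy(dst, src, src_bytes);
}

// Byte size of block_pointers as declared in the test:
// batch_size * 2 * maxBlocksPerSeq pointers.
inline std::size_t srcBytes(std::size_t batch_size,
                            std::size_t max_blocks_per_seq) {
    return batch_size * 2 * max_blocks_per_seq * sizeof(void*);
}

// Byte size of a {batch_size, maxBlocksPerSeq} buffer of 64-bit elements,
// which is the shape passed to the allocation in the test.
inline std::size_t dstBytes(std::size_t batch_size,
                            std::size_t max_blocks_per_seq) {
    return batch_size * max_blocks_per_seq * sizeof(std::uint64_t);
}
```

With 8-byte pointers, `srcBytes(b, m)` is exactly twice `dstBytes(b, m)` for any non-zero `b` and `m`, so `checkedCopy` would throw where the raw `std::memcpy` overflows. One possible fix, assuming the factor of 2 in `block_pointers` is intended, is to allocate the destination with a matching `2 * maxBlocksPerSeq` dimension; whether that is the right shape for `kv_cache` is for the maintainers to confirm.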
Hi, thank you for reporting this bug. We are working on a fix now. By the way, how did you find it?