SPSCQueue
SPSCQueue copied to clipboard
Assertion failed
Hi!
I've been using SPSCQueue for some time and just updated to the latest version and now hit this:
SPSCQueue.hpp:188: void Rigtorp::SPSCQueue<T, Allocator>::pop() [with T = PacBio::CCS::Chunk; Allocator = std::allocator<PacBio::CCS::Chunk>]: Assertion `writeIdx_.load(std::memory_order_acquire) != readIdx' failed.
Any advice? Thank you!
To reproduce:
#include <complex>
#include <array>
#include "SPSCQueue.h"
int main() {
rigtorp::SPSCQueue< int > queue(16);
for (int j = 0; j < 6; ++j) {
queue.emplace(0);
}
for (int j = 0; j < 3; ++j) {
queue.pop();
}
}
If I dump indexes after each operation this way
template<typename T>
void dump(T& q) {
std::cout << q.writeIdx_ << " " << q.writeIdxCache_ << " " << q.readIdx_ << " " << q.readIdxCache_ << std::endl;
}
It is clear this is caused by the diverged non-atomic caches and the real values:
1 0 0 0 # push
2 0 0 0
3 0 0 0
4 0 0 0
5 0 0 0
6 0 0 0
6 0 1 0 # pop
6 0 2 0
6 0 3 0
6 0 3 0
If ensure the equality between the caches and the indexes in the end of the pop
call, the problem disappears. Sure it's just a dirty fix because I wouldn't like to dig into the logic.
readIdxCache_ = nextReadIdx;
writeIdxCache_ = writeIdx_;
@galchinsky On which CPU are you running? For example, Apple Silicon CPU behaves a bit differently than Intel, as I could experiment (SPSCQueue was not involved).
@Philippe91 it's a simple single-threaded example with a few push and a few pop operations, I don't think CPU can make a difference here.
I submitted a PR to fix this, see PR "Bug fix for #35".
I submitted a PR to fix this, see PR "Bug fix for #35".
This solves the problem, apparently.
This being said, the logic behind the cached indexes system introduced by @rigtorp is challenging to grasp in my mind. It gives me an insecure feeling, reinforced by the presence of this bug, introduced 18 months ago, and to which the author has never reacted since it was reported ten months ago.
I am considering reverting to the implementation prior 3a0507a193b0573d87823759d5c8c1230598bfbc.
I just moved to Facebook's queue, because who knows are there any other bugs like that
I have the same problem.
#include <iostream>
#include "SPSCQueue.h"
using namespace rigtorp;
int main(int argc, char* argv[]) {
SPSCQueue<int> q(10000);
q.push(1);
q.pop();
return 0;
}
Output: Assertion failed: writeIdx_.load(std::memory_order_acquire) != readIdx, file D:\Programs\Test\test1\SPSCQueue.h, line 184
If add q.front();
before q.pop();
, the code will run properly.
And function front()
has some bug.
RIGTORP_NODISCARD T* front() noexcept {
auto const readIdx = readIdx_.load(std::memory_order_relaxed);
if (readIdx == writeIdxCache_) {
writeIdxCache_ = writeIdx_.load(std::memory_order_acquire);
if (writeIdxCache_ == readIdx) {
return nullptr;
}
}
return &slots_[readIdx + kPadding];
}
If you run the pop()
function first, readIdx_
will change from 0 to 1, while the initial value of writeIdxCache_
is 0, so they will be unequal and writeIdxCache_
will not update until readIdx_
reaches the end of the array and becomes 0 again. During this time, the front()
function will always return an object regardless of whether the queue is empty.
It's just an assertion fail in the class destructor because writeIdxCache_
value doesn't reflect writeIdx_
as it is only updated in the *front()
member function when writeIdxCache_ == readIdx_.load(std::memory_order_relaxed)
and only if it is so writeIdxCache_
gets updated, hence the fault. The sane fix is to check if size() > 0
in the class destructor loop.
Because you weren't supposed to call *front()
when the size()
is zero and that's what the class destructor does, it failed to check bounds as it should be.
There seems to be a misunderstanding of the pre-conditions of calling pop()
. I updated the documentation here: https://github.com/rigtorp/SPSCQueue/commit/40661138be2245d7d52854e4fb3b7f6bb4bf02ad
This is a concurrent queue intended to be used by two threads. In that case the only way for the reader thread to know if there is more data to read is by calling front()
. Only once it returns a non-nullptr
can you call pop()
.
This code invokes undefined behavior (and triggers the assert):
SPSCQueue<int> q(10000);
q.push(1);
q.pop();
This code correctly checks that the queue is not empty:
SPSCQueue<int> q(10000);
q.push(1);
if (q.front()) {
q.pop();
}
Since you are not reading front()
, you are not communicating any data to the other thread, so I don't understand why you even need the queue?
I don't remember the details (even where the assertion got triggered), but here are 3 chunks of code I had before moving to the facebook's queue, and checking for front after checking for emptiness or size doesn't seem obvious to me.
while (!queue.empty()) {
lapper.put_sample(*queue.front());
queue.pop();
}
another one
while (queue.size() > 4) {
queue.pop();
}
and
while (queue.front()) {
DBOUT("$" << queue.writeIdx_ << " " << queue.writeIdxCache_ << " " << queue.readIdx_ << " " << queue.readIdxCache_);
queue.pop();
}
Apparently, given there are debug outputs, it was the last one that triggered the assertion. Pushing was being done this way, in a separate thread:
bool success = queue.try_push(workbuf);
if (success) {
last_timestamp = buf.timestamp;
}
Yes, those first two are invalid and will break the queue invariants.
I really should never have used the front() and pop() APi from std::queue. Instead they should be a single transaction try_pop() that can return a value. Or a zero-copy version that visits the value in-place using a lambda.
On Mon, Sep 25, 2023 at 5:43 AM Dmitry Galchinsky @.***> wrote:
I don't remember the details (even where the assertion got triggered), but here are 3 chunks of code I had before moving to the facebook's queue, and checking for front after checking for emptiness or size doesn't seem obvious to me.
while (!queue.empty()) { lapper.put_sample(*queue.front()); queue.pop(); }
another one
while (queue.size() > 4) { queue.pop(); }
and
while (queue.front()) { DBOUT("$" << queue.writeIdx_ << " " << queue.writeIdxCache_ << " " << queue.readIdx_ << " " << queue.readIdxCache_); queue.pop(); }
Apparently, given there are debug outputs, it was the last one that triggered the assertion. Pushing was being done this way, in a separate thread:
bool success = queue.try_push(workbuf); if (success) { last_timestamp = buf.timestamp; }
— Reply to this email directly, view it on GitHub https://github.com/rigtorp/SPSCQueue/issues/35#issuecomment-1733414452, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABLO26TGO4YASZ6VM3UXY3X4FN55ANCNFSM5SFF52EQ . You are receiving this because you were mentioned.Message ID: @.***>