cache miss on bounded crossbeam_channel::Receiver try_recv
Hi,
I don't know whether this is expected behavior for atomics, but when I run perf on my code I see cache misses in the try_recv call. I have a bounded queue with 8 senders and 1 receiver. On the sender side I couldn't find any notable misses, so I'm raising this issue for the receiver side only.
My call:
loop {
    while let Ok(mut messages) = self.parsed_feed_receiver.try_recv() {
        ...
    }
    ...
}
There are other small misses, but they are negligible compared to these 3 lines. If you can help me figure this out, I'd appreciate it. Thanks in advance.
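For anyone who wants to reproduce the shape of this setup, here is a minimal sketch of 8 producers feeding one polling consumer through a bounded queue. It is an assumption-laden stand-in, not the code from this report: it uses `std::sync::mpsc::sync_channel` instead of `crossbeam_channel::bounded` so that it builds without external crates, and the function name `poll_bounded_queue`, the payload type, and the sizes are all hypothetical.

```rust
use std::sync::mpsc::{sync_channel, TryRecvError};
use std::thread;

/// Spawns `senders` producer threads that each push `per_sender` messages
/// into a bounded queue of size `capacity`, then drains the queue with a
/// polling `try_recv` loop. Returns the number of messages received.
/// (Hypothetical helper; std's bounded channel stands in for crossbeam's.)
fn poll_bounded_queue(senders: usize, per_sender: usize, capacity: usize) -> usize {
    let (tx, rx) = sync_channel::<Vec<u8>>(capacity);

    let mut handles = Vec::new();
    for id in 0..senders {
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            for _ in 0..per_sender {
                // `send` blocks while the bounded queue is full.
                tx.send(vec![id as u8]).unwrap();
            }
        }));
    }
    // Drop the original sender so the channel disconnects once every
    // producer thread has finished and dropped its clone.
    drop(tx);

    let mut received = 0;
    loop {
        match rx.try_recv() {
            Ok(_messages) => received += 1,
            // An empty poll still reads queue state that the producers
            // write, so it is a plausible place for misses to show up.
            Err(TryRecvError::Empty) => std::hint::spin_loop(),
            // Reported only after all senders are gone and the buffer is drained.
            Err(TryRecvError::Disconnected) => break,
        }
    }

    for handle in handles {
        handle.join().unwrap();
    }
    received
}

fn main() {
    println!("received {} messages", poll_bounded_queue(8, 1_000, 64));
}
```

Running a loop like this under perf should show whether the misses concentrate in the consumer's polling path, although the exact numbers will depend on the channel implementation and the machine.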
It seems that the access to the field (self.mark_bit) is causing a load from memory, so this patch may improve the situation: https://github.com/crossbeam-rs/crossbeam/compare/master...taiki-e/local-mark-bit
@taiki-e thanks for your quick response! I tried the new branch; unfortunately, it's still the same.
Hmm, since mfence does the serialization, the previously cached value should not be available for the load from r15+0x80 (i.e., self.tail). I thought the load from r15+0x190 was the access to self.mark_bit, and that it could therefore be omitted. However, the normal offset of self.mark_bit is actually 0x108...
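To make the offset reasoning above easier to follow, here is a small sketch of how 128-byte padding places a second field at 0x80 and pushes the remaining fields to 0x100 and beyond. The `Padded` wrapper and the `ToyChannel` struct are hypothetical and deliberately simplified; they are not crossbeam's real definitions, and the offsets below describe only this toy layout.

```rust
use std::sync::atomic::AtomicUsize;

/// Hypothetical cache-padding wrapper: 128-byte alignment rounds the size
/// of each wrapped value up to a full 128-byte block.
#[repr(C, align(128))]
struct Padded<T>(T);

/// Toy queue header for illustration only (NOT crossbeam's struct).
/// `repr(C)` keeps the fields in declaration order so offsets are predictable.
#[repr(C)]
struct ToyChannel {
    head: Padded<AtomicUsize>,
    tail: Padded<AtomicUsize>,
    cap: usize,
    mark_bit: usize,
}

/// Returns the byte offsets of `head`, `tail`, and `mark_bit` in `ToyChannel`.
fn field_offsets() -> (usize, usize, usize) {
    let ch = ToyChannel {
        head: Padded(AtomicUsize::new(0)),
        tail: Padded(AtomicUsize::new(0)),
        cap: 0,
        mark_bit: 0,
    };
    // Subtract the struct's base address from each field's address.
    let base = &ch as *const ToyChannel as usize;
    let head = &ch.head as *const _ as usize - base;
    let tail = &ch.tail as *const _ as usize - base;
    let mark_bit = &ch.mark_bit as *const _ as usize - base;
    (head, tail, mark_bit)
}

fn main() {
    let (head, tail, mark_bit) = field_offsets();
    println!("head: {:#x}, tail: {:#x}, mark_bit: {:#x}", head, tail, mark_bit);
}
```

With this layout, `head` and `tail` sit in separate 128-byte blocks, so a consumer that mostly updates `head` only touches the `tail` block when it has to check whether the queue is empty. That block is also the one the producers keep writing to.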