bytes
bytes copied to clipboard
Sync vs Unsync versions benchmarks on MIPS and ARM
Hi,
I have done some benchmarks based on @stbuehler fork with Unsync implementation (#200). My goal was to test it on MIPS and ARM processors. Generally, sync version shows comparable or better (someteims much better) results. But there are a couple of cases where unsync version is significantly better:
MIPSel (MediaTek MT7628AN ver:1 eco:2):
Unsync test deref_two ... bench: 27,566 ns/iter (+/- 93,022) test from_long_slice ... bench: 1,770 ns/iter (+/- 15,711) = 72 MB/s
Sync test deref_two ... bench: 49,816 ns/iter (+/- 89,548) test from_long_slice ... bench: 3,700 ns/iter (+/- 9,269) = 34 MB/s
ARMv6 (ARMv6-compatible processor rev 7 (v6l))
Unsync test split_off_and_drop ... bench: 2,755,606 ns/iter (+/- 129,117)
Sync test split_off_and_drop ... bench: 3,642,042 ns/iter (+/- 59,619)
Thanks for doing the benchmarks.
Could you clarify the action item for this issue?
This is more like a follow-up to #200 According to this benchmark, there are some specific cases on particular hardware, where current Bytes implementation may be optimized by possible using of unsync version. Do you think it makes sense to implement separate non-atomic version? Or is there any possibility to optimize current implementation for ARMv6 or MIPS?
A related, but more generally applicable idea is to redo BytesMut
as !Sync
and only put the buffer under atomic reference counting on .freeze()
. The rationale is that the contents of a BytesMut
are typically frequently mutated by a single thread or task accumulating input, but frozen buffers need to be handed off to processing threads with zero copying.
A consistent rework in this manner would need to slash mutating APIs on Bytes
to channel people into more performant ways of forming their buffers.
FYI, #269 has a design proposal allowing !Sync
instances without code forks or even feature flags.
@mzabaluev Looks great to me! @carllerche what do you think?
Generally, !Sync
types are pretty annoying to work with as the !Sync
aspect leaks. It's better, IMO, to have types be Sync
and have fns that require mutating the state to take &mut self
.
Generally, !Sync types are pretty annoying to work with as the !Sync aspect leaks.
It (as well as the propagation of !Send
) can be considered a design feature, if it can be tuned in and out of generically. I had a very good experience with tower-h2 where it basically instructed me that my service implementation was unsuitable for the executor I was trying to use due to lacking Send
, but it worked just fine with the current thread executor.
I use tokio-current-thread
heavily, where using !Sync
is just fine. I think it makes a lot of sense to provide the optional way to not pay the unnecessary price for Sync