rabbitmq-dotnet-client
rabbitmq-dotnet-client copied to clipboard
Do not create string objects from consumerTag, exchange and routingKey, or get them from a string cache
Proposed Changes
See issue #1231
Types of Changes
What types of changes does your code introduce to this project?
- [ ] Bug fix (non-breaking change which fixes issue #NNNN)
- [ ] New feature (non-breaking change which adds functionality)
- [X] Breaking change (fix or feature that would cause an observable behavior change in existing systems)
- [ ] Documentation improvements (corrections, new content, etc)
- [ ] Cosmetic change (whitespace, formatting, etc)
Checklist
- [X] I have read the
CONTRIBUTING.md
document - [X] I have signed the CA (see https://cla.pivotal.io/sign/rabbitmq)
- [ ] All tests pass locally with my changes
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have added necessary documentation (if appropriate)
- [ ] Any dependent changes have been merged and published in related repositories
I'd prefer to split off at least the confirmTag into it's own PR. (as it contains a lot of special code to handle things, simpler to review track errors)
The other two are so similar that they can remain in this one.
As mentioned in the issue, I think we should look into also providing new API that take ROM
@bollhals This is just a draft PR, which was created for the request of @bollhals Of cource once your team decides how to implement it, (if it helps to you) I could make another (separated) PRs. The comsumerTag and other chages are already separated to 2 commits.
I do not have a strong opinion on whether this should be two separate PRs. However,
As mentioned in the issue, I think we should look into also providing new API that take ROM for the exchangeName/RoutingKey
would be nice and can be included into this PR or a separate one.
I improved similarly to the .NET stream client here, avoiding tons of allocations.
We need to test the PR deeply, but it makes sense to me.
The problem here is the breaking changes.
@Gsantomaggio We are using this PR in our application since almost a year ago without any problem... Today I only updated it to the latest code. (nothing else changed in this PR)
@zgabi thank you! We appreciate you keeping this PR up-to-date. As you can see, we're making progress on version 7.0.
@lukebakken Could you please run the build again, there was a problem in the APIApproval file
Well, it's going to take some time to resolve conflicts with the current state of main
.
@zgabi - if you have time to start resolving conflicts, I'd appreciate it.
@zgabi @stebet I'm assuming that this code defeats the entire purpose of this PR:
https://github.com/rabbitmq/rabbitmq-dotnet-client/pull/1233/files#diff-1d0ab4919eef61c6b1cc4efcc8849c4fc04b84cf71adcf4f757855de8ec2bfa3R120-R123
But, I'm wondering what we can do to still support this information in tracing?
@zgabi PS please see this comment
You can write the following:
string routingKey = Encoding.UTF8.GetString(deliverEventArgs.RoutingKey.Span);
string exchange = Encoding.UTF8.GetString(deliverEventArgs.Exchange.Span);
So only string allocation is needed, no array.
And put the exchange line to line 136, so create only when needed in the if statement.
I'm not very familiar with the latest code.. is this Deliver executed every time when you deliver a message or only in some special case, e.g. when some tracing is enabled?
edit: added a new commit with this change
thanks for updating the code... you were faster :) .. sorry, I had to work today
Thank you @zgabi!!!!
@zgabi thanks for all of your work on this. I'm planning to merge it by the end of the day tomorrow (16 May) unless there are further changes needed.
Please see this comment: https://github.com/rabbitmq/rabbitmq-dotnet-client/issues/1231#issuecomment-2115566382
@zgabi @bollhals @michaelklishin take a look at this commit:
https://github.com/rabbitmq/rabbitmq-dotnet-client/pull/1233/commits/d4e88ca738219fbcdac09324e10b8ead38966c73
I would like to replace the usage of ReadOnlyMemory<byte>
and string
, when used for exchange and queue names, and routing keys, with the dedicated types in that commit. Thoughts?
Hi,
So the parameters of the of the HandleBasicDeliver will be alsoe ExchangeKey
and RoutingKey
?
For me it is ok.. I had a different branch/commit for this change (However they are using CachedStrings, but not so hard to separate them to the 2 new types):
https://github.com/zgabi/rabbitmq-dotnet-client/commits/routingkeyandexchange
This branch was an alternative to the "stringallocations" branch.
@zgabi I realize this PR now contains an enormous number of changes, but the vast majority of them are the following:
- Replace
string
withRoutingKey
,QueueName
,ExchangeName
orConsumerTag
, as appropriate. - Update APIs with the above
Here are the "big" changes -
- Consumer dispatchers now pass
RentedMemory
for the method and body part of a delivery, which prevents string allocations unless the handler explicitly triggers them in theAmqpString
instance - https://github.com/rabbitmq/rabbitmq-dotnet-client/pull/1233/files#diff-74d6657893b833f7aefa9d2a615ae0d7dc1c4cfefa6615461c1b1638ff4b17abR53-R59 -
This is one such place where casting an instance of an
AmqpString
subclass will result in a string allocation.
@zgabi if you wouldn't mind re-running your allocation tests that would be great. Or, let me know how you did them and I will run them. Thanks!
While this change and the effort from you @lukebakken are honerable, I think it is currently way too big to properly analyze and judge the impact of the change. And also it contains many unrelated changes, making it even harder to judge.
My proposal would be to split off any unrelated changes, to pass them individually.
Regarding the main topic of the PR:
- With the introduction of these types, it's much clearer what you're suppose to pass as a parameter. But this comes at a tradeoff as well:
- You always have to go through an indirection (the type) if you want / need it or not
- You need to allocate this intermediate type, which slows you down if you do not plan on reusing it.
As it is right now, I think it is further away from what it tried to acheive than where the main is right now ... :/
IMHO, having classes wrapping string / bytes is only useful if
- You should be reusing them / they're not supposed to be allocated just for this one call
- The library has a cache to avoid the allocation of it (e.g. Avoid allocating the exchange string + ExchangeName object for the delivery of messages, as they're often the same over and over again.)
Hi @bollhals thanks for your input.
The library has a cache to avoid the allocation of it (e.g. Avoid allocating the exchange string + ExchangeName object for the delivery of messages, as they're often the same over and over again.)
Well, that's exactly what this PR accomplishes with the AmqpString
/ExchangeName
types, so there's that at least 😄.
There is no cache, but upon delivery, ExchangeName
only points to the backing ReadOnlyMemory<byte>
of the incoming data. A string is only initialized from it if the user takes an action to do so. Once the delivery handler exits, the memory is released.
~~I will see about moving other changes to a different PR.~~
Actually, nearly all of the changes in this PR are due to the migration from string
to dedicated types for exchange name etc, so I don't see the value in a separate PR.
IMHO, having classes wrapping string / bytes is only useful if
@bollhals there is another benefit - I have had RabbitMQ support cases that boil down to the wrong string passed as the wrong parameter to a library call. Having dedicated types makes that kind of mistake much less likely.
IMHO, having classes wrapping string / bytes is only useful if
@bollhals there is another benefit - I have had RabbitMQ support cases that boil down to the wrong string passed as the wrong parameter to a library call. Having dedicated types makes that kind of mistake much less likely.
Sure, that's what I ment by stating to make it much clearer what to pass. But the tradeoff is the allocation of the objects and the indirection through another object.
It depends in the end what the goal is:
- Increasing performance? => I have a feeling that these additional objects & indirection made it slower for most real world usecases. (Perf tests needed to confirm)
- Increasing usability? => Fine, this surely helps, but then the PR Title should be changed, as the goal of not allocating string objects becomes a side thing.
I'll run some performance tests. I don't think I have access to memory profiling tools to see how this PR affects allocations ... @zgabi ?
@zgabi @bollhals I have force-reset the stringallocations
branch to 2320480c286d460d71d997661413594117c01c29, which is just before I went crazy with this PR. I saved my work on another branch.
- Increasing performance? => I have a feeling that these additional objects & indirection made it slower for most real world usecases. (Perf tests needed to confirm)
Sometimes, minimal local extra work/complexity avoids extra processing, allocations, GC work. It really needs to be measured.
Here is what I see when comparing allocations between the stringallocations
branch and the current main
(as of 1600PDT 2024-05-22), using the MassPublish
program to send 64 batches of 1024 messages per batch.
Running from the command line, each branch appears to take about the same amount of time. Memory usage appears to be about the same.
@zgabi @bollhals @paulomorgado thoughts? Right now I don't really see a reason to include this PR in version 7 and would happily move it to version 8 (where taking allocations into account will be a first-class priority).
stringallocations
branch
main
branch
@lukebakken, I haven't looked at the stringallocations
branch in detail, but I have some generic considerations.
If the stringallocations
branch has 98,479 strings and the main
branch has 265,205 strings, where did those strings go?
What are those strings? Can those strings be pooled/cached?
One of the benefits of System.Text.Json
is working with UTF-8 instead of UTF-16 (the encoding of System.String
. System.Char
is 2 bytes long). And to truly benefit from that, there are APIs working on byte
alongside char
.
I find very worrying the existence of System.ArrayEnumerator
instances in the heap. This is an indication that they are being boxed. Coupling this with System.Linq.Enumerable
(a known performance killer), I'd say LONQ to Objects is being used and shouldn't.
What is System.Collections.Generic.LinkedList<System.UInt64>
being used for?
Hi @paulomorgado
If the stringallocations branch has 98,479 strings and the main branch has 265,205 strings, where did those strings go? What are those strings? Can those strings be pooled/cached?
That's the point of this PR, to reduce string allocations when this library handles basic.deliver
from the AMQP server. Exchange name and routing key used to be converted to strings, and this PR keeps them as ReadOnlyMemory<byte>
I find very worrying the existence of System.ArrayEnumerator instances in the heap. This is an indication that they are being boxed. Coupling this with System.Linq.Enumerable (a known performance killer), I'd say LONQ to Objects is being used and shouldn't.
That's easy to fix. Visual Studio "helpfully" suggested adding a .Cast<>
in a couple foreach
loops. I will open a new PR.
What is System.Collections.Generic.LinkedList<System.UInt64> being used for?
C:\Users\lbakken\development\rabbitmq\rabbitmq-dotnet-client_main [main ≡]
> git grep LinkedList
projects/RabbitMQ.Client/client/impl/ChannelBase.cs:63: private readonly LinkedList<ulong> _pendingDeliveryTags = new LinkedList<ulong>();
Hi @lukebakken
If the stringallocations branch has 98,479 strings and the main branch has 265,205 strings, where did those strings go? What are those strings? Can those strings be pooled/cached?
That's the point of this PR, to reduce string allocations when this library handles
basic.deliver
from the AMQP server. Exchange name and routing key used to be converted to strings, and this PR keeps them asReadOnlyMemory<byte>
As you pointed out, the only change is the reduced number of string
instances and not overall memory or CPU.
So, what's the benefit here?
Is it the tests that are not taking advantage of this?
C:\Users\lbakken\development\rabbitmq\rabbitmq-dotnet-client_main [main ≡] > git grep LinkedList projects/RabbitMQ.Client/client/impl/ChannelBase.cs:63: private readonly LinkedList<ulong> _pendingDeliveryTags = new LinkedList<ulong>();
So, it's kind of a queue, but you need to remove items from both ends and, sometimes, anywhere. Is that it?
@zgabi - if you have time, please read @paulomorgado's observations:
As you pointed out, the only change is the reduced number of string instances and not overall memory or CPU. So, what's the benefit here? Is it the tests that are not taking advantage of this?
I am running the MassPublish
program using the ".NET Object Allocation Tracking" performance monitor, and, as @paulomorgado notes, there does not seem to be a reduction in memory use.
At this point I'm getting the feeling like I'm testing this incorrectly, perhaps?
Reducing the allocations reduces the GC overhead only. You wrote that the running time and memory is almost the same... but is the CPU usage also the same? I think it should be bigger (because of multiple threads) when there are a lot of allocations.
I'm assuming the same test conditions, otherwise the tests are not comparable.