During a ResendRequest the request for messages from storage allows for unbounded memory usage

Open philipwhiuk opened this issue 2 years ago • 4 comments

When a resend request to infinity (or even just to a large capped number) is made, a huge number of messages can be fetched from storage into memory, which can crash the application due to lack of memory. It's not practical to mitigate this in the storage layer.

In addition, even if the messages fit in memory, without a fix for #271 this can lead to some fairly ugly behaviour where we keep trying to send messages to a session that has already disconnected us.

To Reproduce

  • Send several thousand messages (the volume needed depends on the application's memory profile)
  • Instruct the counter-party to send a ResendRequest for all of these messages

Expected behavior

We should fetch the messages in batches and then send them a batch at a time.
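The batching described above could look roughly like the following. This is only a minimal sketch under stated assumptions, not the actual fix: `RangeStore`, `resendInBatches` and `batchSize` are hypothetical names, and `RangeStore` is a simplified stand-in rather than QuickFIX/J's own store interface.

```java
import java.util.List;
import java.util.function.Consumer;

final class BatchedResend {

    /** Hypothetical stand-in for a message store; NOT the QuickFIX/J interface. */
    interface RangeStore {
        /** Returns the stored messages with sequence numbers in [from, to], in order. */
        List<String> fetch(int from, int to);
    }

    /**
     * Walks [beginSeqNo, endSeqNo] in windows of at most batchSize sequence
     * numbers, so only one window of messages is held in memory at a time.
     * Returns the number of messages handed to the sender.
     */
    static int resendInBatches(RangeStore store, int beginSeqNo, int endSeqNo,
                               int batchSize, Consumer<String> sender) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int sent = 0;
        int from = beginSeqNo;
        while (from <= endSeqNo) {
            // Clamp the window to the requested range (written to avoid int overflow).
            int to = (endSeqNo - from >= batchSize) ? from + batchSize - 1 : endSeqNo;
            List<String> batch = store.fetch(from, to);
            for (String message : batch) {
                sender.accept(message);
                sent++;
            }
            if (to == Integer.MAX_VALUE) {
                break; // the next window start would overflow
            }
            from = to + 1;
        }
        return sent;
    }
}
```

With this shape, peak memory depends on `batchSize` rather than on the size of the requested range, and the loop gives a natural place to check between batches whether the session is still connected.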

System information: N/A

Additional context

I've got a hot fix for some of this - will try to tidy it up and submit for review.

This will probably matter even more if we implement #621.

philipwhiuk avatar Jun 01 '23 08:06 philipwhiuk

As you noted in https://github.com/quickfix-j/quickfixj/issues/621#issuecomment-1572077072, it would probably be a good thing to have a config option that restricts the maximum number of messages that can be resent. The beginning of the range could then be skipped by sending a SequenceReset with NewSeqNo set accordingly.
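To make the idea concrete, here is a rough sketch of how the capped range might be computed. It is an illustration only: `ResendCap`, `Plan`, `plan` and `maxResend` are hypothetical names, not existing QuickFIX/J settings or classes, and it assumes an EndSeqNo of 0 means "up to the latest message" (the "infinity" convention in FIX 4.2 and later).

```java
final class ResendCap {

    /** Outcome of applying the cap to one ResendRequest (hypothetical type). */
    static final class Plan {
        final boolean gapFill;  // true if the start of the range is skipped
        final int newSeqNo;     // NewSeqNo for the SequenceReset; only meaningful when gapFill is true
        final int resendFrom;   // first sequence number actually resent
        final int resendTo;     // last sequence number actually resent

        Plan(boolean gapFill, int newSeqNo, int resendFrom, int resendTo) {
            this.gapFill = gapFill;
            this.newSeqNo = newSeqNo;
            this.resendFrom = resendFrom;
            this.resendTo = resendTo;
        }
    }

    /** Keeps the most recent maxResend messages and gap-fills everything before them. */
    static Plan plan(int beginSeqNo, int endSeqNo, int lastSentSeqNo, int maxResend) {
        if (maxResend <= 0) {
            throw new IllegalArgumentException("maxResend must be positive");
        }
        // Assumption: EndSeqNo 0 (or anything past the last sent message) means "up to the latest".
        int to = (endSeqNo == 0 || endSeqNo > lastSentSeqNo) ? lastSentSeqNo : endSeqNo;
        long requested = (long) to - beginSeqNo + 1;
        if (requested <= maxResend) {
            return new Plan(false, beginSeqNo, beginSeqNo, to); // within the cap: resend everything
        }
        int resendFrom = to - maxResend + 1;
        return new Plan(true, resendFrom, resendFrom, to);
    }
}
```

For example, `ResendCap.plan(1, 0, 5000, 1000)` would gap-fill up to NewSeqNo 4001 and resend 4001 through 5000. Messages skipped this way are never delivered to the counter-party, so whether that is acceptable depends on the application.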

chrjohn avatar Jun 16 '23 09:06 chrjohn

Yeah, a maximum resend request size is probably also a good idea. I'd suggest that there are several things you could do:

  1. Log out the counter-party (the session is in an invalid state, just as we do when the sequence number is too high)
  2. Cap the resend amount at the limit and gap-fill the rest
  3. Set no limit and rely on the batching

At some point I also need to look at throttling sends. When we fixed our batching (and we really did want to send them all), we effectively DDoSed our counter-party until they caught up. Ideally we could have throttled it so they stayed connected.
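One simple way to throttle would be to pause after each batch so the average rate stays under a configured limit. The sketch below is only an illustration of that idea: `ThrottledSend`, `pauseMillis`, `sendThrottled` and `maxPerSecond` are hypothetical names, and the pause ignores the time spent sending, so the real rate would be somewhat below the limit.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongConsumer;

final class ThrottledSend {

    /** Milliseconds to wait after sending messagesSent messages at maxPerSecond (rounded up). */
    static long pauseMillis(int messagesSent, int maxPerSecond) {
        if (maxPerSecond <= 0) {
            throw new IllegalArgumentException("maxPerSecond must be positive");
        }
        return (messagesSent * 1000L + maxPerSecond - 1) / maxPerSecond;
    }

    /**
     * Sends each batch, then asks the sleeper to pause before the next one.
     * The sleeper is injected so that real code can block the thread while
     * tests simply record the requested pauses.
     */
    static void sendThrottled(List<List<String>> batches, int maxPerSecond,
                              Consumer<String> sender, LongConsumer sleeper) {
        for (List<String> batch : batches) {
            for (String message : batch) {
                sender.accept(message);
            }
            sleeper.accept(pauseMillis(batch.size(), maxPerSecond));
        }
    }
}
```

In real use the sleeper would wrap something like `Thread.sleep`, and a token bucket would give smoother pacing than a fixed pause per batch.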

philipwhiuk avatar Jun 16 '23 13:06 philipwhiuk

https://github.com/quickfix-j/quickfixj/issues/778 could solve your problem.

wajncn avatar Mar 20 '24 13:03 wajncn

@wajncn actually, there is already a PR created by @philipwhiuk which should solve this: #643. Does this also work for you?

chrjohn avatar Jul 03 '24 13:07 chrjohn