pgwire::protocol::StateMachine::send_rows keeps a large amount of memory allocated

Open def- opened this issue 1 year ago • 5 comments

What version of Materialize are you using?

d8783cd45e (Pull Request #25313)

What is the issue?

We had high memory usage during explain plans before (https://github.com/MaterializeInc/materialize/issues/23451), but now I noticed send_rows keeping 800 MB allocated until environmentd is killed: https://buildkite.com/materialize/release-qualification/builds/436#018db189-6b84-4a39-a32a-49bee0a7a4da

Attachments: mz_2-2024-02-16_12 21 13.pb.gz, profile001

Is it expected that this memory is never reclaimed? In this case we didn't see an OoM, but I was trying to reproduce one and this stood out to me as possibly unexpected. The test only ran explain plans.

def- avatar Feb 16 '24 12:02 def-

Reproduced the OoM with a heap profile. The last profile still looked similar but had grown to 1800 MB.

Attachments: mz_2-2024-02-16_14 16 23.pb.gz, profile001

def- avatar Feb 16 '24 15:02 def-

Can you explain the workload that triggered this? Does sqlsmith continuously run a bunch of queries? Does the 800 MB stay around even after all connections are closed and no new ones are opened? This should be reproducible locally by running the sqlsmith mzcompose?

madelynnblue avatar Feb 16 '24 22:02 madelynnblue

Yes, sqlsmith runs a bunch of explain plans continuously for 6000 s, nothing else. Let me check whether bin/mzcompose --find sqlsmith run default --explain-only --runtime 6000 reproduces it, and whether the memory usage stays that high after nothing has been running against Mz for a while.

def- avatar Feb 16 '24 22:02 def-

Actually, when SQLsmith disconnects, the environmentd memory usage goes down again, so it's not a memory leak, but a lot of memory allocated to support the connection.

def- avatar Feb 17 '24 00:02 def-

That function holds on to the entire result set as a big Vec<Row>, so that absolutely tracks. We don't currently stream the results; they are sent as a giant blob from compute into the coordinator, then into the adapter/connection, which sends them over the network.

madelynnblue avatar Feb 17 '24 00:02 madelynnblue
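
To illustrate the difference the comment above describes, here is a minimal sketch comparing a fully buffered send with a batched, streaming one. Everything in it is an assumption for illustration: `Row` is a hypothetical stand-in (not Materialize's actual `Row` type), `send_buffered` and `send_streaming` are made-up names, and "sending" is reduced to counting bytes. It is not how send_rows is implemented; it only shows why peak memory follows the full result set size when nothing is streamed.

```rust
// Hypothetical stand-in for a result row; NOT Materialize's real `Row` type.
#[derive(Debug, PartialEq)]
pub struct Row(pub Vec<u8>);

/// Buffered approach: collect the entire result set before sending anything.
/// Returns (bytes sent, peak bytes held in memory at once).
pub fn send_buffered(rows: impl Iterator<Item = Row>) -> (usize, usize) {
    // The whole result set is resident here, analogous to one big Vec<Row>.
    let all: Vec<Row> = rows.collect();
    let peak: usize = all.iter().map(|r| r.0.len()).sum();
    let mut sent = 0;
    for row in &all {
        sent += row.0.len(); // stand-in for writing the row to the socket
    }
    (sent, peak)
}

/// Streaming approach: hold at most `batch` rows at a time.
/// Returns (bytes sent, peak bytes held in memory at once).
pub fn send_streaming(rows: impl Iterator<Item = Row>, batch: usize) -> (usize, usize) {
    assert!(batch > 0, "batch size must be positive");
    let mut buf: Vec<Row> = Vec::with_capacity(batch);
    let (mut sent, mut peak) = (0usize, 0usize);
    for row in rows {
        buf.push(row);
        if buf.len() == batch {
            let held: usize = buf.iter().map(|r| r.0.len()).sum();
            peak = peak.max(held);
            sent += held; // stand-in for flushing this batch to the socket
            buf.clear();
        }
    }
    // Flush whatever is left in the final, possibly partial, batch.
    let held: usize = buf.iter().map(|r| r.0.len()).sum();
    peak = peak.max(held);
    sent += held;
    (sent, peak)
}
```

With 1000 rows of 100 bytes each, both functions send the same 100,000 bytes, but the buffered version holds all 100,000 bytes at its peak while the streaming version with a batch size of 10 holds only 1,000.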

Closing because this is working as intended.

madelynnblue avatar Feb 21 '24 19:02 madelynnblue