go-control-plane
Memory retained after connections closed
What version of go-control-plane are you using?
- github.com/envoyproxy/go-control-plane v0.10.3-0.20221003170831-bf9fc1db9d0f
- google.golang.org/protobuf v1.28.0
- google.golang.org/grpc v1.45.0
What version of Go are you using (go version)?
go version go1.18.5 linux/amd64
What did you do?
I wrote a control plane using go-control-plane, which sends config resources to Envoy.
snapshot, err := cache.NewSnapshot(
    p.newSnapshotVersion(),
    map[resource.Type][]types.Resource{
        resource.ListenerType: x.ListenerContents(),
        resource.ClusterType:  x.ClusterContents(),
        resource.EndpointType: x.EndpointsContents(),
        resource.RouteType:    x.RoutesContents(),
    },
)
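The snapshot is then stored in the shared snapshot cache roughly as follows. This is a simplified continuation of the snippet above; snapshotCache, nodeID, and ctx are placeholder names for values from my server setup, not the exact code.

if err != nil {
    log.Fatalf("could not build snapshot: %v", err)
}
if err := snapshot.Consistent(); err != nil {
    log.Fatalf("snapshot is inconsistent: %v", err)
}
// Store the snapshot for this node; the xDS server serves it to the connected Envoys.
if err := snapshotCache.SetSnapshot(ctx, nodeID, snapshot); err != nil {
    log.Fatalf("could not set snapshot for node %s: %v", nodeID, err)
}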
I opened 190 Envoy proxy clients connecting to my control plane, and the memory usage of the control plane grew to 10G. (Memory usage is taken from the k8s metric container_memory_usage_bytes.)
Then I closed all Envoy proxy clients, but the memory usage remained high (4G).
What did you expect to see?
- The control plane uses less memory when serving 190 clients.
- The control plane releases memory when all Envoy clients are closed.
What did you see instead?
- The control plane uses 10G of memory when serving 190 Envoy clients, which is too much.
- The control plane did not release memory after I closed all the clients.
pprof
This is the pprof file captured after I closed all Envoy clients (4G memory usage):
pprof.gala.alloc_objects.alloc_space.inuse_objects.inuse_space.004.pb.gz
Can you please let us know if you were using SOTW or Delta xDS?
@alecholmez I'm using SOTW to return the config.
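For reference, the SOTW discovery services are registered on the gRPC server roughly like this (a sketch with placeholder names, not my exact setup):

import (
    clusterservice "github.com/envoyproxy/go-control-plane/envoy/service/cluster/v3"
    discoveryservice "github.com/envoyproxy/go-control-plane/envoy/service/discovery/v3"
    endpointservice "github.com/envoyproxy/go-control-plane/envoy/service/endpoint/v3"
    listenerservice "github.com/envoyproxy/go-control-plane/envoy/service/listener/v3"
    routeservice "github.com/envoyproxy/go-control-plane/envoy/service/route/v3"
    serverv3 "github.com/envoyproxy/go-control-plane/pkg/server/v3"
    "google.golang.org/grpc"
)

// registerXDSServices exposes the discovery services backed by the snapshot cache.
// srv comes from serverv3.NewServer(ctx, snapshotCache, callbacks).
func registerXDSServices(grpcServer *grpc.Server, srv serverv3.Server) {
    discoveryservice.RegisterAggregatedDiscoveryServiceServer(grpcServer, srv)
    endpointservice.RegisterEndpointDiscoveryServiceServer(grpcServer, srv)
    clusterservice.RegisterClusterDiscoveryServiceServer(grpcServer, srv)
    routeservice.RegisterRouteDiscoveryServiceServer(grpcServer, srv)
    listenerservice.RegisterListenerDiscoveryServiceServer(grpcServer, srv)
}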
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.
@bwangelme I took a second look at this; the pprof graph you dropped shows it's only using 105.75MB. Could something else in your management server be causing the high memory usage? We use this internally at greymatter and we don't see usage like that; it's generally sub-1G.
As mentioned by @alecholmez, those numbers do not match what we see from other users. Snapshots will not be cleaned from the cache unless explicitly removed, but that seems unrelated to these memory figures. It's also unclear where the memory is being used (within this library, or in the user code managing the configurations stored in this control plane).
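For what it's worth, explicit cleanup looks roughly like the sketch below (illustrative only; the library does not do this automatically, and the surrounding lifecycle tracking is up to the management server):

import (
    cachev3 "github.com/envoyproxy/go-control-plane/pkg/cache/v3"
)

// cleanupSnapshots removes cached snapshots for nodes that no longer have any
// open SOTW watches. Without ClearSnapshot, snapshots stay in the cache.
func cleanupSnapshots(snapshotCache cachev3.SnapshotCache) {
    for _, nodeID := range snapshotCache.GetStatusKeys() {
        info := snapshotCache.GetStatusInfo(nodeID)
        if info != nil && info.GetNumWatches() == 0 {
            snapshotCache.ClearSnapshot(nodeID)
        }
    }
}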
Hi, we finally identified the root cause of the issue.
source.ConfigSourceSpecifier = &core.ConfigSource_ApiConfigSource{
    ApiConfigSource: &core.ApiConfigSource{
        TransportApiVersion:       resource.DefaultAPIVersion,
        ApiType:                   core.ApiConfigSource_GRPC,
        SetNodeOnFirstMessageOnly: true,
        GrpcServices: []*core.GrpcService{{
            TargetSpecifier: &core.GrpcService_EnvoyGrpc_{
                EnvoyGrpc: &core.GrpcService_EnvoyGrpc{ClusterName: "xds_cluster"},
            },
        }},
    },
}
I set this GrpcServices config on EdsClusterConfig, which causes every Envoy cluster to open its own connection to the xDS service.
We have approximately 200 Envoy pods with roughly 400 clusters on each pod, all connecting to a single xDS service, which is on the order of 200 × 400 ≈ 80,000 xDS streams and results in high memory consumption on the xDS service pod.
Thanks for your reply.
Thanks for your reply. Yes, when not using ADS, Envoy uses a separate stream per cluster. Also, because the node is only sent on the first message, the go-control-plane library ends up keeping a copy of the node metadata per stream, which can be very impactful if a lot of data is included in it.
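To illustrate the ADS alternative, here is a rough sketch of an EDS config source that points at the aggregated stream instead of opening a per-cluster gRPC connection (the ads_config/bootstrap wiring on the Envoy side is deployment-specific and assumed here):

import (
    core "github.com/envoyproxy/go-control-plane/envoy/config/core/v3"
    resource "github.com/envoyproxy/go-control-plane/pkg/resource/v3"
)

// adsEdsConfig returns an EDS config source that reuses the single ADS stream
// configured in the Envoy bootstrap, rather than a per-cluster ApiConfigSource.
func adsEdsConfig() *core.ConfigSource {
    return &core.ConfigSource{
        ResourceApiVersion: resource.DefaultAPIVersion,
        ConfigSourceSpecifier: &core.ConfigSource_Ads{
            Ads: &core.AggregatedConfigSource{},
        },
    }
}

With ADS, each Envoy multiplexes all of its subscriptions onto one stream, so the control plane keeps one copy of the node metadata per proxy rather than one per cluster.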