Segmentation Fault after `assert failure: false. Details: attempted to add shared target SdsApi <NAME_OF_SECRET> to initialized init manager Server`
I've emailed [email protected] and was told to raise the issue here.
Please see the Segmentation Fault section later in this post for the stack trace.
Please also see: https://github.com/envoyproxy/go-control-plane/issues/581.
Issues
I've noticed a couple of things.
- When we implement `SecretDiscoveryServiceServer` from "github.com/envoyproxy/go-control-plane/envoy/service/secret/v3", sometimes `StreamSecrets` does not get called back at all.
- When `StreamSecrets` is called, it is called endlessly with a new stream each time; see the log output further down in this message.
When StreamSecrets is not called, our dynamically supplied secrets unsurprisingly stay under dynamic_warming_secrets instead of dynamic_active_secrets. For context, a rough sketch of the handler we implement is shown below.
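This is a minimal sketch against the go-control-plane v3 API rather than our exact code: the `Server` struct, the `ClientSecrets` map, and the per-response versioning are simplified placeholders; the real implementation is in the sds.go linked in the next section.

```go
package sds

import (
	"time"

	corev3 "github.com/envoyproxy/go-control-plane/envoy/config/core/v3"
	tlsv3 "github.com/envoyproxy/go-control-plane/envoy/extensions/transport_sockets/tls/v3"
	discoveryv3 "github.com/envoyproxy/go-control-plane/envoy/service/discovery/v3"
	secretv3 "github.com/envoyproxy/go-control-plane/envoy/service/secret/v3"
	"google.golang.org/protobuf/types/known/anypb"
)

// Server is a hand-rolled SDS implementation holding the generic secrets to serve.
type Server struct {
	secretv3.UnimplementedSecretDiscoveryServiceServer
	ClientSecrets map[string]string // secret name -> value (placeholder shape)
}

// StreamSecrets answers every DiscoveryRequest on the stream with the full set
// of generic secrets, echoing back the requested type URL.
func (s *Server) StreamSecrets(stream secretv3.SecretDiscoveryService_StreamSecretsServer) error {
	for {
		req, err := stream.Recv() // blocks until Envoy sends a request on this stream
		if err != nil {
			return err
		}

		resources := make([]*anypb.Any, 0, len(s.ClientSecrets))
		for name, value := range s.ClientSecrets {
			secret := &tlsv3.Secret{
				Name: name,
				Type: &tlsv3.Secret_GenericSecret{
					GenericSecret: &tlsv3.GenericSecret{
						Secret: &corev3.DataSource{
							Specifier: &corev3.DataSource_InlineString{InlineString: value},
						},
					},
				},
			}
			res, err := anypb.New(secret)
			if err != nil {
				return err
			}
			resources = append(resources, res)
		}

		if err := stream.Send(&discoveryv3.DiscoveryResponse{
			VersionInfo: time.Now().UTC().Format(time.RFC3339),
			TypeUrl:     req.GetTypeUrl(),
			Resources:   resources,
		}); err != nil {
			return err
		}
	}
}
```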
Code
Our code is public, so here are the relevant definitions (a simplified wiring sketch follows the list):
- StreamSecrets: https://github.com/kubeshop/kusk-gateway/blob/mbana-oauth-issue-401-sds/internal/envoy/sds/sds.go#L72
- Where we register the SecretDiscoveryServiceServer: https://github.com/kubeshop/kusk-gateway/blob/mbana-oauth-issue-401-sds/internal/envoy/manager/envoy_config_manager.go#L183
- Configuring the cache, etc.: https://github.com/kubeshop/kusk-gateway/blob/mbana-oauth-issue-401-sds/internal/envoy/manager/envoy_config_manager.go#L56
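For orientation, the wiring boils down to something like the following: a snapshot cache plus xDS server for the regular resources, with our own SDS implementation registered for the secret service. This is a stripped-down sketch, not our exact code; the port and the nil logger/callbacks are placeholders.

```go
package manager

import (
	"context"
	"net"

	clusterservice "github.com/envoyproxy/go-control-plane/envoy/service/cluster/v3"
	discoverygrpc "github.com/envoyproxy/go-control-plane/envoy/service/discovery/v3"
	listenerservice "github.com/envoyproxy/go-control-plane/envoy/service/listener/v3"
	routeservice "github.com/envoyproxy/go-control-plane/envoy/service/route/v3"
	secretservice "github.com/envoyproxy/go-control-plane/envoy/service/secret/v3"
	cachev3 "github.com/envoyproxy/go-control-plane/pkg/cache/v3"
	serverv3 "github.com/envoyproxy/go-control-plane/pkg/server/v3"
	"google.golang.org/grpc"
)

func run(ctx context.Context, sds secretservice.SecretDiscoveryServiceServer) error {
	// Snapshot cache that backs the regular xDS resources (listeners, clusters, routes, ...).
	snapshotCache := cachev3.NewSnapshotCache(false, cachev3.IDHash{}, nil)
	xdsServer := serverv3.NewServer(ctx, snapshotCache, nil)

	grpcServer := grpc.NewServer()
	discoverygrpc.RegisterAggregatedDiscoveryServiceServer(grpcServer, xdsServer)
	listenerservice.RegisterListenerDiscoveryServiceServer(grpcServer, xdsServer)
	clusterservice.RegisterClusterDiscoveryServiceServer(grpcServer, xdsServer)
	routeservice.RegisterRouteDiscoveryServiceServer(grpcServer, xdsServer)

	// Our hand-rolled SDS implementation handles the secret service instead of the cache.
	secretservice.RegisterSecretDiscoveryServiceServer(grpcServer, sds)

	lis, err := net.Listen("tcp", ":18000") // placeholder port
	if err != nil {
		return err
	}
	return grpcServer.Serve(lis)
}
```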
Log of StreamSecrets being called multiple times (note the stream pointer, e.g. stream=&{0xc000f19c70}, is different each time):
2022-08-12T09:19:30Z | sds.go:74: SecretDiscoveryServiceServer.StreamSecrets: exiting method
2022-08-12T09:19:30Z | sds.go:92:
2022-08-12T09:19:30Z | sds.go:93: SecretDiscoveryServiceServer.StreamSecrets: calling stream.Recv - stream=&{0xc000f19380}, len(s.ClientSecrets)=2
2022-08-12T09:19:30Z | sds.go:116: SecretDiscoveryServiceServer.StreamSecrets: request.TypeUrl=type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret, len(s.ClientSecrets)=2
2022-08-12T09:19:30Z | sds.go:153: SecretDiscoveryServiceServer.StreamSecrets: stream.Send(response) sent - responses=[<*>version_info:"2022-08-12T09:19:30Z" resources:{[type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret]:{name:"hmac_secret" generic_secret:{secret:{inline_bytes:"f9eckuGEcUNxAqKT0uK8OyM2Se01ukVLPHsiSoTh2X8="}}}} type_url:"type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret" <*>version_info:"2022-08-12T09:19:30Z" resources:{[type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret]:{name:"client_secret" generic_secret:{secret:{inline_string:"Z6MX7NreJumWLmf6unsQ5uiEUrTBxfNtqG9Vy5Kjktnvfj-_fRCBO9EU1mL1YzAJ"}}}} type_url:"type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret"]
2022-08-12T09:19:30Z | sds.go:154:
2022-08-12T09:19:30Z | sds.go:74: SecretDiscoveryServiceServer.StreamSecrets: exiting method
2022-08-12T09:19:30Z | sds.go:92:
2022-08-12T09:19:30Z | sds.go:93: SecretDiscoveryServiceServer.StreamSecrets: calling stream.Recv - stream=&{0xc000f19c70}, len(s.ClientSecrets)=2
2022-08-12T09:19:30Z | sds.go:116: SecretDiscoveryServiceServer.StreamSecrets: request.TypeUrl=type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret, len(s.ClientSecrets)=2
2022-08-12T09:19:30Z | sds.go:153: SecretDiscoveryServiceServer.StreamSecrets: stream.Send(response) sent - responses=[<*>version_info:"2022-08-12T09:19:30Z" resources:{[type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret]:{name:"client_secret" generic_secret:{secret:{inline_string:"Z6MX7NreJumWLmf6unsQ5uiEUrTBxfNtqG9Vy5Kjktnvfj-_fRCBO9EU1mL1YzAJ"}}}} type_url:"type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret" <*>version_info:"2022-08-12T09:19:30Z" resources:{[type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret]:{name:"hmac_secret" generic_secret:{secret:{inline_bytes:"f9eckuGEcUNxAqKT0uK8OyM2Se01ukVLPHsiSoTh2X8="}}}} type_url:"type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret"]
What we expect to see in /config_dump when StreamSecrets is being called:
{
  "@type": "type.googleapis.com/envoy.admin.v3.SecretsConfigDump",
  "dynamic_active_secrets": [
    {
      "name": "client_secret",
      "version_info": "2022-08-12T09:22:18Z",
      "last_updated": "2022-08-12T09:22:18.140Z",
      "secret": {
        "@type": "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret",
        "name": "client_secret",
        "generic_secret": {
          "secret": {
            "inline_string": "[redacted]"
          }
        }
      }
    },
    {
      "name": "hmac_secret",
      "version_info": "2022-08-12T09:22:18Z",
      "last_updated": "2022-08-12T09:22:18.386Z",
      "secret": {
        "@type": "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret",
        "name": "hmac_secret",
        "generic_secret": {
          "secret": {
            "inline_bytes": "W3JlZGFjdGVkXQ=="
          }
        }
      }
    }
  ]
}
Don't worry about the exposed secrets; they'll be removed soon.
Logs or Debug Information
In addition, here are dumps taken from the admin endpoint (/config_dump) for the working and non-working cases, i.e., when the StreamSecrets gRPC method is invoked on the go-control-plane and when it is not. Notice how, in the broken case, the secrets sit under dynamic_warming_secrets.
- config_dump-broken.json: https://gist.github.com/mbana/61305292ddb9fd83e260a0125893f6ca
- logs-broken.log: https://gist.github.com/mbana/a4636cf5e96a035db618f7c37b8dc275
- config_dump-working.json: https://gist.github.com/mbana/e9b7b29ed7c1be032aca867304d86d60
- logs-working.log: https://gist.github.com/mbana/2881bc1873090d8d82649ba95625d789
If there's anything further I can do to help get to the cause of this issue, please let me know.
Envoy Compiled From 7af7608b022c5e3ae2ad110b4d94aa7506a643d7
Edit: I went a step further and compiled Envoy from source and made a Docker image of it:
Dockerfile
FROM docker.io/ubuntu:22.04
COPY envoy-static /usr/local/bin/envoy
COPY envoy-static /usr/bin/envoy
ENTRYPOINT ["/usr/local/bin/envoy"]
Build Steps
$ git remote -vv
origin [email protected]:envoyproxy/envoy.git (fetch)
origin [email protected]:envoyproxy/envoy.git (push)
$ git rev-parse HEAD
7af7608b022c5e3ae2ad110b4d94aa7506a643d7
$ bazel/setup_clang.sh
$ echo "build --config=clang" >> user.bazelrc
$ echo "build --copt=-fno-limit-debug-info" >> user.bazelrc
$ bazel build --jobs=32 -c fastbuild envoy
$ cp bazel-bin/source/exe/envoy-static .
$ docker build --tag ttl.sh/kubeshop/envoy:24h --file ./Dockerfile .
$ docker push ttl.sh/kubeshop/envoy:24h
The image is available at ttl.sh/kubeshop/envoy:24h (docker run --rm -it ttl.sh/kubeshop/envoy:24h). Note: this image is only available for ~24 hours from the time this post was edited (2022-08-12T13:47:31+00:00).
Segmentation Fault
I noticed that Envoy crashes:
[2022-08-12 13:35:16.253][7][critical][assert] [source/common/init/manager_impl.cc:36] assert failure: false. Details: attempted to add shared target SdsApi client_secret to initialized init manager Server
[2022-08-12 13:35:16.253][7][critical][backtrace] [./source/server/backtrace.h:104] Caught Aborted, suspect faulting address 0x7
[2022-08-12 13:35:16.253][7][critical][backtrace] [./source/server/backtrace.h:91] Backtrace (use tools/stack_decode.py to get line numbers):
[2022-08-12 13:35:16.253][7][critical][backtrace] [./source/server/backtrace.h:92] Envoy version: 7af7608b022c5e3ae2ad110b4d94aa7506a643d7/1.24.0-dev/Modified/DEBUG/BoringSSL
[2022-08-12 13:35:16.272][7][critical][backtrace] [./source/server/backtrace.h:96] #0: Envoy::SignalAction::sigHandler() [0x5579a06c8b28]
[2022-08-12 13:35:16.273][7][critical][backtrace] [./source/server/backtrace.h:98] #1: [0x7f086bc6a520]
AsyncClient 0x550cbfc8ec00, stream_id_: 15115583590745907833
&stream_info_:
StreamInfoImpl 0x550cbfc8edc8, protocol_: 1, response_code_: null, response_code_details_: via_upstream, attempt_count_: 1, health_check_request_: 0, route_name_: upstream_info_:
UpstreamInfoImpl 0x550cbfc81570, upstream_connection_id_: 0
Http2::ConnectionImpl 0x550cbf8d7dc8, max_headers_kb_: 60, max_headers_count_: 100, per_stream_buffer_limit_: 268435456, allow_metadata_: 0, stream_error_on_invalid_http_messaging_: 0, is_outbound_flood_monitored_control_frame_: 0, dispatching_: 1, raised_goaway_: 0, pending_deferred_reset_streams_.size(): 0
&protocol_constraints_:
ProtocolConstraints 0x550cbf8d7e48, outbound_frames_: 0, max_outbound_frames_: 10000, outbound_control_frames_: 0, max_outbound_control_frames_: 1000, consecutive_inbound_frames_with_empty_payload_: 0, max_consecutive_inbound_frames_with_empty_payload_: 1, opened_streams_: 1, inbound_priority_frames_: 0, max_inbound_priority_frames_per_stream_: 100, inbound_window_update_frames_: 4, outbound_data_frames_: 4, max_inbound_window_update_frames_per_data_frame_sent_: 10
Number of active streams: 1, current_stream_id_: 1 Dumping current stream:
stream:
ConnectionImpl::StreamImpl 0x550cbf50bd40, stream_id_: 1, unconsumed_bytes_: 0, read_disable_count_: 0, local_end_stream_: 0, local_end_stream_sent_: 0, remote_end_stream_: 0, data_deferred_: 1, received_noninformational_headers_: 1, pending_receive_buffer_high_watermark_called_: 0, pending_send_buffer_high_watermark_called_: 0, reset_due_to_messaging_error_: 0, cookies_: pending_trailers_to_encode_: null
absl::get<ResponseHeaderMapPtr>(headers_or_trailers_): null
Dumping corresponding downstream request for upstream stream 1:
UpstreamRequest 0x550cbfdfcd80
[2022-08-12 13:35:16.273][7][critical][backtrace] [./source/server/backtrace.h:104] Caught Segmentation fault, suspect faulting address 0x0
[2022-08-12 13:35:16.273][7][critical][backtrace] [./source/server/backtrace.h:91] Backtrace (use tools/stack_decode.py to get line numbers):
[2022-08-12 13:35:16.273][7][critical][backtrace] [./source/server/backtrace.h:92] Envoy version: 7af7608b022c5e3ae2ad110b4d94aa7506a643d7/1.24.0-dev/Modified/DEBUG/BoringSSL
[2022-08-12 13:35:16.273][7][critical][backtrace] [./source/server/backtrace.h:96] #0: Envoy::SignalAction::sigHandler() [0x5579a06c8b28]
[2022-08-12 13:35:16.273][7][critical][backtrace] [./source/server/backtrace.h:98] #1: [0x7f086bc6a520]
[2022-08-12 13:35:16.292][7][critical][backtrace] [./source/server/backtrace.h:96] #2: Envoy::Http::ResponseDecoderWrapper::dumpState() [0x55799f4d23f8]
[2022-08-12 13:35:16.311][7][critical][backtrace] [./source/server/backtrace.h:96] #3: Envoy::Http::Http2::ClientConnectionImpl::dumpStreams() [0x55799fae8b21]
[2022-08-12 13:35:16.330][7][critical][backtrace] [./source/server/backtrace.h:96] #4: Envoy::Http::Http2::ConnectionImpl::dumpState() [0x55799fae850a]
[2022-08-12 13:35:16.348][7][critical][backtrace] [./source/server/backtrace.h:96] #5: Envoy::Event::DispatcherImpl::onFatalError() [0x5579a031cd35]
[2022-08-12 13:35:16.367][7][critical][backtrace] [./source/server/backtrace.h:96] #6: Envoy::FatalErrorHandler::callFatalErrorHandlers() [0x5579a06cf35a]
[2022-08-12 13:35:16.386][7][critical][backtrace] [./source/server/backtrace.h:96] #7: Envoy::SignalAction::sigHandler() [0x5579a06c8b8a]
[2022-08-12 13:35:16.386][7][critical][backtrace] [./source/server/backtrace.h:98] #8: [0x7f086bc6a520]
Our FatalActions triggered a fatal signal.
Segmentation fault (core dumped)
Can anyone see anything useful in this stack trace? The failing assertion is particularly interesting, but I don't know the Envoy code base well enough to tell whether this is an issue or not:
[2022-08-12 13:35:16.253][7][critical][assert] [source/common/init/manager_impl.cc:36] assert failure: false. Details: attempted to add shared target SdsApi client_secret to initialized init manager Server
It's not clear where and when secrets are being used. Could you provide more info on that? cc @adisuissa
> It's not clear where and when secrets are being used. Could you provide more info on that? cc @adisuissa
Are you referring to how I create the secrets? If so, here is the part of the code where it's being done:
- https://github.com/kubeshop/kusk-gateway/blob/mbana-oauth-issue-401-sds/internal/envoy/sds/sds.go#L198
- https://github.com/kubeshop/kusk-gateway/blob/mbana-oauth-issue-401-sds/internal/envoy/sds/sds.go#L229
- https://github.com/kubeshop/kusk-gateway/blob/mbana-oauth-issue-401-sds/internal/envoy/config/envoy_configuration.go#L220: Here a snapshot is generated that includes the secrets added in the steps above (a simplified sketch of this step follows below).
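For illustration, the snapshot step amounts to something like the sketch below. It is a hedged simplification, not the code linked above: nodeID, version, and the secret values are placeholders, and the hmac value would really be bytes rather than a string.

```go
package config

import (
	"context"

	corev3 "github.com/envoyproxy/go-control-plane/envoy/config/core/v3"
	tlsv3 "github.com/envoyproxy/go-control-plane/envoy/extensions/transport_sockets/tls/v3"
	"github.com/envoyproxy/go-control-plane/pkg/cache/types"
	cachev3 "github.com/envoyproxy/go-control-plane/pkg/cache/v3"
	resourcev3 "github.com/envoyproxy/go-control-plane/pkg/resource/v3"
)

// genericSecret builds an inline generic secret resource of the kind the
// oauth2 filter consumes via SDS.
func genericSecret(name, value string) types.Resource {
	return &tlsv3.Secret{
		Name: name,
		Type: &tlsv3.Secret_GenericSecret{
			GenericSecret: &tlsv3.GenericSecret{
				Secret: &corev3.DataSource{
					Specifier: &corev3.DataSource_InlineString{InlineString: value},
				},
			},
		},
	}
}

// setSecrets generates a snapshot carrying the secrets and pushes it into the cache.
func setSecrets(ctx context.Context, c cachev3.SnapshotCache, nodeID, version string) error {
	snap, err := cachev3.NewSnapshot(version, map[resourcev3.Type][]types.Resource{
		resourcev3.SecretType: {
			genericSecret("client_secret", "[redacted]"),
			genericSecret("hmac_secret", "[redacted]"),
		},
		// Listeners, clusters, routes, etc. go into the same snapshot.
	})
	if err != nil {
		return err
	}
	return c.SetSnapshot(ctx, nodeID, snap)
}
```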
Also, if you look at the "broken" and "working" /config_dumps, you will see that we've configured the envoy.filters.http.oauth2 filter, which uses these secrets to populate the token_secret (really a client_secret in OAuth2 terms) and hmac_secret fields:
- config_dump-broken.json: https://gist.github.com/mbana/61305292ddb9fd83e260a0125893f6ca
- config_dump-working.json: https://gist.github.com/mbana/e9b7b29ed7c1be032aca867304d86d60
To answer your question, if you search in the above files for client_secret, you'll find that's where they are "defined".
What would be ideal is if the envoy.filters.http.oauth2 filter provided an option to specify the secrets as an inline_string or inline_bytes, but unfortunately it doesn't, so one has to resort to writing a secret discovery service (SDS) server. We can't hold the secrets in a file because the envoy.filters.http.oauth2 filter we use needs to be defined dynamically by the user.
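For completeness, this is roughly how the filter's secrets end up pointing at the SDS server: each of token_secret and hmac_secret is an SdsSecretConfig whose config source names the gRPC cluster that fronts our control plane. The cluster name "sds_cluster" is a placeholder, and this is a hedged sketch rather than our exact configuration code.

```go
package config

import (
	corev3 "github.com/envoyproxy/go-control-plane/envoy/config/core/v3"
	tlsv3 "github.com/envoyproxy/go-control-plane/envoy/extensions/transport_sockets/tls/v3"
)

// sdsSecretConfig references a named secret served over gRPC by the cluster
// that points at our control plane's SDS endpoint.
func sdsSecretConfig(secretName, sdsClusterName string) *tlsv3.SdsSecretConfig {
	return &tlsv3.SdsSecretConfig{
		Name: secretName,
		SdsConfig: &corev3.ConfigSource{
			ResourceApiVersion: corev3.ApiVersion_V3,
			ConfigSourceSpecifier: &corev3.ConfigSource_ApiConfigSource{
				ApiConfigSource: &corev3.ApiConfigSource{
					ApiType:             corev3.ApiConfigSource_GRPC,
					TransportApiVersion: corev3.ApiVersion_V3,
					GrpcServices: []*corev3.GrpcService{{
						TargetSpecifier: &corev3.GrpcService_EnvoyGrpc_{
							EnvoyGrpc: &corev3.GrpcService_EnvoyGrpc{
								ClusterName: sdsClusterName,
							},
						},
					}},
				},
			},
		},
	}
}

// The oauth2 filter's credentials then reference these configs for its
// token_secret and hmac_secret fields, e.g.
//   sdsSecretConfig("client_secret", "sds_cluster")
//   sdsSecretConfig("hmac_secret", "sds_cluster")
```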
Hope that all makes sense, if not, please let me know.
https://github.com/envoyproxy/envoy/issues/22625 might be related; it seems like secret subscription management in the oauth2 filter is broken.
Hello, @kyessenov, hope you're well.
- Is there an ETA for this?
- Are there unit tests in the code base that exercise SDS and ensure that it polls/streams for secrets? If so, where should I look for those?
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.
@kyessenov, hello, please keep this open until a fix arrives for it.
Many thanks.