co-ordinated dump collection and transport strategy
Is this a BUG or FEATURE REQUEST?: feature request
Did you review istio.io/help and existing issues to identify if this is already solved or being worked on?: yes
Bug: N
Feature Request: Y
Describe the feature: I want to understand / discuss the strategy for a support-document collection mechanism in Istio deployments.
Use case: an application (such as a Node.js web app) deployed through Istio. Support personnel should be able to:
- collect core dumps, heap dumps, and any other logs (such as console logs) that are produced on an unexpected crash
- collect core dumps and heap dumps on demand, when a need is perceived
- do either on a per-pod basis or on a cluster basis
- retrieve these documents through i) a shared volume, ii) a well-known port, or iii) an exposed API endpoint
While some of this capability has to live in the runtimes, I expect the sidecar and Mixer would need to coordinate as well to deliver it end to end (a runtime-side sketch follows below). What is the existing capability in this direction, and what are the gaps? (Happy to get engaged if need be.)
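For the crash case, a minimal runtime-side sketch, assuming a Node.js app and a dump directory mounted at `/var/dumps` (both the path and the `DUMP_DIR` variable are illustrative conventions, not anything Istio defines today):

```ts
import * as v8 from "v8";

// Assumed convention: a volume (e.g. an emptyDir) mounted at /var/dumps
// so the dump outlives the crashed container.
const DUMP_DIR = process.env.DUMP_DIR ?? "/var/dumps";

process.on("uncaughtException", (err: Error) => {
  const stamp = new Date().toISOString().replace(/[:.]/g, "-");
  // Built into Node >= 11.13: synchronously writes a V8 heap snapshot.
  v8.writeHeapSnapshot(`${DUMP_DIR}/heap-${stamp}.heapsnapshot`);
  console.error(`unexpected crash: ${err.stack}`);
  process.exit(1); // let Kubernetes restart the pod; the dump stays on the volume
});
```

This only covers the runtime's share; how the sidecar learns that a dump was produced, and how it is transported off the node, is the coordination question above.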
it would be interesting to support that kind of standardization; it's a good extension to telemetry/observability and health checks
I don't think it would happen for 1.0 but something to consider for long term roadmap - cc @mandarjog
Thanks @ldemailly. As I see it, the questions we need answered are:
- When to collect data?
  - upon an unexpected program state (such as a crash); what else?
  - upon a user trigger; how would such a trigger be configured in the control plane? (sketched below)
  - upon a precondition in the application state; what could those preconditions be?
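As one illustration of a user trigger, the runtime could react to a signal that an operator (or, eventually, the control plane) delivers. This is only a sketch of the idea: SIGUSR2 is an assumption (Node reserves SIGUSR1 for its debugger), and `process.report` requires a Node release with diagnostic reports available:

```ts
// On-demand trigger: SIGUSR2 asks the runtime to produce a dump.
process.on("SIGUSR2", () => {
  // Writes a JSON diagnostic report (stacks, heap stats, platform info)
  // to the current working directory and returns the filename.
  const file = process.report?.writeReport();
  console.log(`diagnostic report written: ${file}`);
});
```

An operator could fire this today with `kubectl exec <pod> -c app -- kill -USR2 1`; how the control plane would fan such a trigger out per-pod or cluster-wide is exactly the open question.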
- How to collect data?
  - some runtimes (such as Java) have inherent mechanisms to produce standard dumps (core|heap|text), but not all do?
  - some operating systems inhibit (core) dump collection from a second process (such as a hardened Ubuntu kernel); a probe for this is sketched below
  - when it comes to standardization, how should the roles be split between Envoy and the runtime?
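To make the OS constraint concrete, a collector could at least probe the kernel's core-dump posture before attempting collection. A Linux-only sketch (the paths are standard procfs entries; whether the sidecar or the app should own this check is itself an open question):

```ts
import { readFileSync } from "fs";

// Read a procfs entry, tolerating containers where /proc is restricted.
function probe(path: string): string {
  try {
    return readFileSync(path, "utf8").trim();
  } catch {
    return "<unreadable>";
  }
}

// Where the kernel writes core dumps (may be a pipe to a host-side handler).
console.log("core_pattern:", probe("/proc/sys/kernel/core_pattern"));
// Yama ptrace scope: values > 0 restrict dump collection by a second process.
console.log("ptrace_scope:", probe("/proc/sys/kernel/yama/ptrace_scope"));
```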
- Where does the data go?
  - local memory / disk of the application is transient; is there a gap between the crash and the container being recycled?
  - a shared volume; does Envoy have access to one?
  - a well-known port; who exposes it: the app, Envoy, or Mixer? (one option is sketched below)
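To make the "well-known port" option concrete, here is a minimal sketch of a dump-serving listener inside the app container. The port (9990) and directory (`/var/dumps`) are assumptions, not a proposed convention:

```ts
import * as http from "http";
import { createReadStream, readdirSync } from "fs";
import { join, basename } from "path";

const DUMP_DIR = "/var/dumps"; // assumed dump volume, as in the earlier sketch

http
  .createServer((req, res) => {
    if (req.url === "/dumps") {
      // List available dump files as JSON.
      res.setHeader("Content-Type", "application/json");
      res.end(JSON.stringify(readdirSync(DUMP_DIR)));
    } else if (req.url?.startsWith("/dumps/")) {
      // Stream a single dump; basename() blocks path traversal.
      const file = join(DUMP_DIR, basename(req.url));
      createReadStream(file)
        .on("error", () => { res.statusCode = 404; res.end(); })
        .pipe(res);
    } else {
      res.statusCode = 404;
      res.end();
    }
  })
  .listen(9990);
```

Whether this listener belongs in the app, in Envoy, or behind Mixer is the open question above; the sketch only shows how small the data-plane piece could be.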