Add custom path-specific metrics proposal
# Custom Path-Specific Metrics for gVisor
## Summary
This proposal introduces path-specific metrics to gVisor's existing metrics system, enabling fine-grained monitoring of filesystem access patterns for specific directories or mount points.
## Motivation
gVisor's current metrics provide excellent general visibility into filesystem operations but lack the granularity needed to understand application behavior at the path level. This makes it difficult to:
- Monitor access patterns to specific directories (e.g., mounted volumes, shared storage)
- Track filesystem usage by application components
- Debug performance bottlenecks in specific parts of the filesystem hierarchy
- Implement security monitoring for sensitive paths
## Proposed Solutions
The proposal outlines two complementary approaches:
- Full Syscall Configuration: Complete flexibility to specify both paths and specific syscalls to track
- Simplified Path-Only Configuration: Streamlined approach focusing on common read/write metrics
Both solutions use a `--path-metrics-config` flag with YAML configuration files, making them easy to deploy and maintain while providing the observability needed for production environments.
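For illustration, a configuration file for the proposed `--path-metrics-config` flag might look like the following. The schema, keys, and paths here are hypothetical, not a settled format:

```yaml
# Hypothetical --path-metrics-config schema (illustrative only).
paths:
  # Full syscall configuration: track specific syscalls on a path.
  - path: /mnt/shared
    syscalls: [read, write, openat]
  # Path-only configuration: defaults to common read/write metrics.
  - path: /data/cache
```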
## Benefits
- Enhanced Observability: Path-level insights into application filesystem behavior
- Performance Optimization: Identify filesystem bottlenecks and access patterns
- Security Monitoring: Track access to sensitive directories
- Flexible Configuration: Support for both simple and advanced monitoring scenarios
- Minimal Performance Impact: Only track configured paths and syscalls
## Implementation Approach
The proposal is designed to be:
- Non-intrusive to existing GVisor functionality
- Configurable and optional (disabled by default)
- Extensible for future enhancements
- Compatible with existing metrics infrastructure
This enhancement would significantly improve gVisor's observability capabilities while maintaining its security and performance characteristics.
Note: This is a proposal document for discussion. I'm interested in feedback on the approach and would be happy to collaborate on the implementation if there's interest from the maintainers.
This proposal is trying to do a lot of things at once, in my opinion. Some thoughts:
> Implement security monitoring for sensitive paths
I believe this use-case is better suited for runtime monitoring tools (see https://gvisor.dev/blog/2022/08/01/threat-detection/).
I am also skeptical about introducing a YAML configuration file via `--path-metrics-config`, as I think this implies introducing a YAML parser into runsc.
> Debug performance bottlenecks in specific parts of the filesystem hierarchy
If the goal is to debug various filesystem types implemented in gVisor, I think pprof should be sufficient since it will be able to break down CPU time spent in different code paths, for example.
+1 that the use-case of monitoring specific files or directories for access should be part of the gVisor runtime monitoring system, not the metrics system.
If there is something to expand in the metrics system, perhaps it could be an `fstype` string dimension that reflects the type of the filesystem on which an operation is being performed, like tmpfs or overlayfs (as that has performance/monitoring implications), rather than the mountpoint (which is a user-specified value and thus not directly mappable to a specific aspect of gVisor code).
Thank you for the thoughtful feedback! You've raised excellent points that help refine this proposal. Let me address each concern:
- Runtime Monitoring vs. Metrics: We benchmarked the runtime monitoring solution under high-syscall workloads and concluded that it can consume quite a lot of CPU, which we are sensitive to and cannot spare. An alternative we are considering: implement a rate limiter for runtime monitoring, where a rate limit given in the config.json file caps the number of points streamed to the server. What do you think of that over this solution? The rate limiter would be implemented in the `seccheck` directory; when a rate limit is set in the config, it is activated and rate-limits the syscall points accordingly. It could be implemented as a token bucket (https://en.wikipedia.org/wiki/Token_bucket).
- YAML Configuration Concerns: Could the YAML-parsing concern be mitigated by using something similar to the parser for the runtime monitoring JSON config? For example, change the YAML file to JSON and reuse the same framework as the runtime monitoring JSON parser.
- Filesystem Type Metrics: @EtiennePerot's suggestion about fstype dimensions is excellent and addresses a core need. It would be incredibly useful for the team in understanding tmpfs (RAM-backed) vs. overlayfs (disk-backed) performance characteristics. Could you tell us how exactly to determine the fstype for a syscall?
TL;DR:
- Implement a rate limiter for runtime monitoring instead of this solution
- Mitigate the YAML configuration concerns by using a JSON config similar to runtime monitoring's
- How to determine the fstype for a syscall to introduce these metrics?
The runtime monitoring system is currently designed as a real-time monitoring system mostly for threat detection and autonomous response to events happening within the sandbox, so one of its design goals is to be real-time, and this is also why it has high CPU usage. So rate-limiting would go against its current implementation. It may be possible to make it configurable to not act this way, i.e. to have it send event information in batches instead. That would increase its CPU efficiency, at the cost of losing real-time-ness. For your purposes of monitoring path-specific metrics, that seems like a worthwhile tradeoff.
> could you tell us how exactly to determine the fstype for a syscall?
No silver bullet. In the gVisor metrics system (not runtime monitoring), there are already some metrics like:
https://github.com/google/gvisor/blob/d6ba9944e431c768b028f010e0992bd0829207c0/pkg/sentry/vfs/file_description.go#L642
This counts the number of reads across all file descriptors, but it is not filesystem-implementation-specific, so the fstype cannot be determined there. There isn't a much better mechanism than moving such metrics down into the `FileDescriptionImpl` implementations' `PRead` methods instead.
The metric system does have support for fields, so this could remain a single `fs/reads` metric but with a new `fstype` string field. To create a metric with fields, set the `Fields` field in `metric.Uint64Metadata`.
https://github.com/google/gvisor/blob/d6ba9944e431c768b028f010e0992bd0829207c0/pkg/metric/metric.go#L225-L232
But again, for your use-case you likely want to add instrumentation in the runtime monitoring subsystem, not in metrics. The metrics system is high-performance but very limited; for example, string fields must pre-declare all of their possible values, so it would be impossible to use it for path-specific field values as those would only be known at runtime.
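To illustrate the pre-declared-values constraint described above, here is a self-contained Go sketch that mimics the shape of a counter with a string field. This is a conceptual model only, not the gVisor `metric` package API:

```go
package main

import "fmt"

// fieldMetric mimics a counter with a string field whose allowed values must
// all be declared at metric-creation time. This is exactly why open-ended
// values like file paths cannot be used as field values: they are only known
// at runtime.
type fieldMetric struct {
	counts map[string]uint64
}

// newFieldMetric pre-declares every legal field value up front.
func newFieldMetric(allowedValues ...string) *fieldMetric {
	m := &fieldMetric{counts: make(map[string]uint64)}
	for _, v := range allowedValues {
		m.counts[v] = 0
	}
	return m
}

// Increment bumps the counter for one field value; undeclared values are
// rejected rather than silently creating new time series.
func (m *fieldMetric) Increment(fieldValue string) error {
	if _, ok := m.counts[fieldValue]; !ok {
		return fmt.Errorf("field value %q was not pre-declared", fieldValue)
	}
	m.counts[fieldValue]++
	return nil
}

func main() {
	// fstype works as a field because the set of filesystem types is fixed.
	reads := newFieldMetric("tmpfs", "overlayfs", "gofer")
	_ = reads.Increment("tmpfs")
	err := reads.Increment("/home/user/secret") // path-like values fail
	fmt.Println(reads.counts["tmpfs"], err != nil) // 1 true
}
```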
Hi @EtiennePerot, how do you feel about this solution: we accept a time interval in the JSON config, store the syscall points in memory during each interval, and once the interval has elapsed send all the points together, i.e. in batches. Would a PR that implements this be acceptable for merging into the open-source gVisor repository? I can submit one soon for your review.
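A minimal self-contained Go sketch of the interval-based batching described above; all names are hypothetical and there are no gVisor dependencies:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// batcher buffers event points in memory and flushes them together once per
// interval, trading real-time delivery for fewer, larger sends.
type batcher struct {
	mu    sync.Mutex
	buf   []string
	flush func([]string) // sink that receives each completed batch
}

// add records one event point; it never blocks on the sink.
func (b *batcher) add(point string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.buf = append(b.buf, point)
}

// flushNow sends whatever has accumulated so far as a single batch.
func (b *batcher) flushNow() {
	b.mu.Lock()
	points := b.buf
	b.buf = nil
	b.mu.Unlock()
	if len(points) > 0 {
		b.flush(points)
	}
}

// run flushes on every tick of the configured interval until stop is closed,
// draining any remaining points on shutdown.
func (b *batcher) run(interval time.Duration, stop <-chan struct{}) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			b.flushNow()
		case <-stop:
			b.flushNow()
			return
		}
	}
}

func main() {
	b := &batcher{flush: func(points []string) {
		fmt.Printf("batch of %d: %v\n", len(points), points)
	}}
	b.add("openat")
	b.add("read")
	b.add("write")
	b.flushNow() // prints: batch of 3: [openat read write]
}
```

In the real subsystem, the configured time interval would drive `run`, and the sink would be the existing code path that streams points to the monitoring server.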
That sounds good to me, so long as this is configurable and that the default behavior is still the current real-time behavior (no batching), so that existing users of runtime monitoring which rely on its real-time-ness maintain such behavior. cc @fvoznika for confirmation.