cloudberry
Introduce extension perfmon to monitor database cluster
Fixes #ISSUE_Number
What does this PR do?
This PR introduces the perfmon extension, designed to monitor the Cloudberry Database.
Key Features:
- Cluster-wide Metrics Collection: Gathers performance metrics from each machine in the database cluster.
- Real-time Query Monitoring:
  - Displays active queries with detailed execution plans.
  - Tracks progress of each query plan node (similar to EXPLAIN ANALYZE for running queries).
  - Captures performance metrics (CPU, memory, spill files) across multiple segments.
- Query History: Provides a historical record of executed queries.
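As a rough illustration of what collecting metrics "across multiple segments" can involve, the sketch below rolls per-segment samples up into per-query totals. Everything in it is hypothetical: the `SegmentSample` and `QueryTotals` types, their field names, and the choice to sum CPU and spill files while taking the maximum for memory are assumptions made for illustration, not the schema or logic used by perfmon.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SegmentSample:
    """Hypothetical per-segment sample for one running query."""
    segment_id: int
    cpu_seconds: float
    memory_bytes: int
    spill_files: int


@dataclass(frozen=True)
class QueryTotals:
    """Hypothetical cluster-wide view of the same query."""
    cpu_seconds: float
    peak_segment_memory_bytes: int
    spill_files: int


def roll_up(samples: Iterable[SegmentSample]) -> QueryTotals:
    """Combine per-segment samples into one summary.

    Assumption: CPU time and spill files are additive across segments,
    while memory is reported as the highest value seen on any segment.
    """
    samples = list(samples)
    return QueryTotals(
        cpu_seconds=sum(s.cpu_seconds for s in samples),
        peak_segment_memory_bytes=max(
            (s.memory_bytes for s in samples), default=0
        ),
        spill_files=sum(s.spill_files for s in samples),
    )


if __name__ == "__main__":
    print(roll_up([
        SegmentSample(0, 1.5, 100, 2),
        SegmentSample(1, 0.5, 300, 0),
    ]))
```

Reporting the per-segment peak rather than a sum is one possible design choice; it highlights the segment closest to its memory limit, which is often the one that matters when diagnosing skew.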
Components:
- gpmmon (Shared Library): Added to shared_preload_libraries for background operation. Runs as a background worker process that collects metrics and stores them in the gpperfmon database.
- gpmon (Shared Library): Added to shared_preload_libraries for query monitoring. Implements hooks to capture runtime query data and provides the pg_query_state() function to inspect live query execution status.
- gpsmon (Binary): Deployed per-instance to collect system-level metrics (CPU, memory, etc.).
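Because both shared libraries must appear in shared_preload_libraries, enabling the extension usually means appending to an existing value rather than overwriting it. The helper below is a minimal sketch of that merge, assuming the setting is a plain comma-separated list; `merge_preload_libraries` is a hypothetical name and is not part of this PR. Only the library names `gpmmon` and `gpmon` come from the description above.

```python
from __future__ import annotations


def merge_preload_libraries(current: str, required: list[str]) -> str:
    """Return a shared_preload_libraries value that includes `required`.

    Existing entries keep their order; missing libraries are appended,
    and entries that are already present are not duplicated.
    """
    # Split the existing comma-separated value, dropping blanks and padding.
    libs = [name.strip() for name in current.split(",") if name.strip()]
    for name in required:
        if name not in libs:
            libs.append(name)
    return ",".join(libs)


if __name__ == "__main__":
    # 'gpmmon' and 'gpmon' are the library names given in this PR description;
    # 'pg_stat_statements' stands in for any library that is already loaded.
    print(merge_preload_libraries("pg_stat_statements", ["gpmmon", "gpmon"]))
```

The resulting string still has to be applied with whatever configuration tooling the cluster uses, and a change to shared_preload_libraries only takes effect after the database is restarted.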
Type of Change
- [ ] Bug fix (non-breaking change)
- [x] New feature (non-breaking change)
- [ ] Breaking change (fix or feature with breaking changes)
- [ ] Documentation update
Breaking Changes
Test Plan
- [ ] Unit tests added/updated
- [x] Integration tests added/updated
- [x] Passed make installcheck
- [ ] Passed make -C src/test installcheck-cbdb-parallel
Impact
Performance:
User-facing changes:
Dependencies:
Checklist
- [ ] Followed contribution guide
- [ ] Added/updated documentation
- [ ] Reviewed code for security implications
- [ ] Requested review from cloudberry committers