[reboot]: Replace documentation to reflect the new reboot features

Open sthaha opened this issue 7 months ago • 2 comments

Kepler Documentation Update Plan

Overview

This document outlines the documentation updates required to align with the Kepler v0.10.0+ rewrite. Each item can be converted into a GitHub issue for tracking.


🚨 Critical Updates (Must Fix Immediately)

1. Configuration Documentation Complete Rewrite

File: docs/usage/general_config.md
Status: 🔴 Complete Rewrite Required
Description: Current docs describe complex operator-based configuration with 40+ environment variables. New system uses simple CLI flags + YAML hierarchy.
Action Items:

  • Replace entire content with new configuration guide structure
  • Document CLI flags table (35+ new flags)
  • Document YAML configuration hierarchy
  • Add configuration examples for common scenarios
  • Remove all references to operator-based configuration

Source: Use <kepler>/docs/configuration/configuration.md as template

2. Metrics Documentation Complete Rewrite

File: docs/design/metrics.md
Status: 🔴 Complete Rewrite Required
Description: Current docs describe complex component-based metrics (core, uncore, package, DRAM, GPU). New system uses simplified CPU-focused metrics with zone-based organization.
Action Items:

  • Replace with auto-generated metrics documentation
  • Update metric names and structure
  • Remove component-specific breakdowns
  • Add new zone-based organization
  • Update metric types and labels

Source: Use <kepler>/docs/metrics/metrics.md as template

3. Architecture Documentation Complete Rewrite

File: docs/design/architecture.md
Status: 🔴 Complete Rewrite Required
Description: Current docs focus on eBPF-centric architecture. New system is service-oriented with clean separation of concerns.
Action Items:

  • Replace eBPF-centric content with service-oriented design
  • Document service interfaces (Service, Initializer, Runner, Shutdowner)
  • Add dependency injection patterns
  • Document device/resource/monitor/exporter layers
  • Add thread safety requirements
  • Update data flow diagrams

4. Helm Chart Documentation Complete Rewrite

File: docs/installation/kepler-helm.md
Status: 🔴 Complete Rewrite Required
Description: Current docs reference external repository and old configuration. New Helm chart is built-in with comprehensive configuration support.
Action Items:

  • Remove external repository references
  • Update installation commands to use manifests/helm/kepler/
  • Rewrite values table with new configuration structure
  • Update port references (9102 → 28282)
  • Add nested YAML configuration examples
  • Document serviceMonitor configuration

📊 Major Updates (Important for Adoption)

5. Installation Manifest Documentation Update

File: docs/installation/kepler.md
Status: 🟡 Major Updates Required
Description: Build system simplified, many deployment options obsolete.
Action Items:

  • Update build commands and Makefile targets
  • Remove obsolete deployment options
  • Update manifest generation process
  • Add new Docker Compose development workflow
  • Update access endpoints and ports

6. Deep Dive Documentation Rewrite

File: docs/usage/deep_dive.md
Status: 🔴 Major Rewrite Required
Description: Current content focuses on eBPF implementation details. New architecture is service-oriented.
Action Items:

  • Remove eBPF implementation details
  • Add service architecture deep dive
  • Document power attribution algorithm changes
  • Update data collection methods
  • Remove hardware counter emphasis
  • Add thread safety and concurrency patterns

7. Daemon Configuration Update

File: docs/usage/kepler_daemon.md
Status: 🟡 Major Updates Required
Description: Current docs show ConfigMap-based environment variables. New system uses service configuration.
Action Items:

  • Update configuration mounting approach
  • Replace environment variable examples
  • Add service-specific configuration
  • Update ConfigMap structure

8. Community Operator Documentation Update

File: docs/installation/community-operator.md
Status: 🟡 Updates Required
Description: Operator may need updates for new configuration system.
Action Items:

  • Verify operator compatibility with new configuration
  • Update operator CR specifications
  • Test and document new configuration integration

9. Kepler Operator Documentation Update

File: docs/installation/kepler-operator.md
Status: 🟡 Updates Required
Description: Same as community operator.
Action Items:

  • Update for new configuration system
  • Verify CR compatibility
  • Update examples and usage patterns

📚 Content Enhancement & Cleanup

10. Create Developer Documentation Section

Files: New docs/developer/ section
Status: 🆕 New Content Required
Description: New service-oriented architecture needs developer onboarding documentation.
Action Items:

  • Create docs/developer/ directory structure
  • Add pre-commit hooks guide (use <kepler>/docs/developer/pre-commit.md)
  • Add release workflow documentation (use <kepler>/docs/design/release.md)
  • Add service architecture guide
  • Add development environment setup (Docker Compose)
  • Add testing strategy documentation

11. Update Navigation Structure

File: mkdocs.yml
Status: 🟡 Updates Required
Description: Add new developer section and reorganize content.
Action Items:

  • Add developer section to navigation
  • Reorganize configuration under appropriate section
  • Update section hierarchies
  • Add new developer workflow guides

12. Local Cluster Documentation Update

File: docs/installation/local-cluster.md
Status: 🟡 Updates Required
Description: New Docker Compose development workflow available.
Action Items:

  • Add Docker Compose development section
  • Update local development setup
  • Add new build targets and commands
  • Update access endpoints

13. Contributing Documentation Update

File: docs/project/contributing.md
Status: 🟡 Minor Updates Required
Description: Update for new build system and development workflow.
Action Items:

  • Update build system references
  • Add pre-commit workflow
  • Update development environment setup
  • Add service development guidelines

14. Troubleshooting Documentation Update

File: docs/usage/trouble_shooting.md
Status: 🟡 Updates Required
Description: Kepler now depends less on eBPF, and troubleshooting patterns may have changed.
Action Items:

  • Update eBPF-related troubleshooting
  • Add service-specific troubleshooting
  • Update error patterns and solutions
  • Add new diagnostic commands

🔍 Verification & Minor Updates

15. Model Server Documentation Verification

Files: docs/kepler_model_server/*
Status: 🟢 Verification Required
Description: Verify compatibility with new architecture and update integration patterns.
Action Items:

  • Verify API compatibility
  • Test model server integration
  • Update integration examples if needed
  • Validate workflow documentation

16. Power Model Documentation Update

File: docs/design/power_model.md
Status: 🟡 Minor Updates Required
Description: Core concepts valid but implementation details changed.
Action Items:

  • Update implementation details
  • Verify model integration patterns
  • Update usage scenarios table
  • Check pre-trained model references

17. Energy Sources Documentation Update

File: docs/design/kepler-energy-sources.md
Status: 🟡 Minor Updates Required
Description: Energy reading concepts still valid but implementation simplified.
Action Items:

  • Update implementation details
  • Verify RAPL reading methods
  • Update energy source priorities
  • Check hardware requirements

18. Installation Strategy Update

File: docs/installation/strategy.md
Status: 🟢 Minor Updates Required
Description: Platform recommendations mostly still relevant.
Action Items:

  • Verify platform compatibility
  • Update requirement versions
  • Check deployment method recommendations

19. RPM Installation Update

File: docs/installation/kepler-rpm.md
Status: 🟡 Updates Required
Description: Build process may have changed.
Action Items:

  • Verify RPM build process
  • Update build commands
  • Test installation process
  • Update configuration references

🗑️ Content Removal/Archival

20. eBPF Documentation Archival

File: docs/design/ebpf_in_kepler.md
Status: 🔴 Archive/Remove
Description: Detailed eBPF implementation no longer central to architecture.
Action Items:

  • Move to historical documentation section OR
  • Remove entirely with deprecation notice
  • Update any references to this content
  • Create redirect if needed

📝 Notes

  • Each item above can be converted to a GitHub issue
  • Items marked 🔴 should be prioritized first
  • Items marked 🆕 are new content creation
  • Items marked 🟢 are verification/minor updates
  • Source files from <kepler>/docs/ can be used as templates
  • Auto-generated content (metrics) should use tooling from the rewrite project

sthaha · May 21 '25 00:05

In the last session, it was mentioned that Kepler is moving away from the eBPF approach. Is this reboot part of release 0.8.0?

muzakkir6207 · Jun 12 '25 09:06

@muzakkir6207, no, release 0.8.0 is the current Kepler release, which uses eBPF.

You can find the reboot releases on the project's releases page (https://github.com/sustainable-computing-io/kepler/releases). Look for everything that has "reboot". The latest one is 0.0.7-release.

The reboot project source and developer docs can be found at https://github.com/sustainable-computing-io/kepler/tree/reboot

Hope that helps.

sthaha · Jun 12 '25 12:06