Add GitHub self-hosted runners monitoring page to TorchCI HUD

Open Copilot opened this issue 6 months ago • 3 comments

This PR adds a new /runners/[org] page to the TorchCI HUD website that displays GitHub self-hosted runners for any organization, using the PyTorch bot credentials to query GitHub's API.

Features

API Endpoint (/api/runners/[org])

  • Fetches runners from GitHub's /orgs/{org}/actions/runners API endpoint
  • Uses PyTorch bot GitHub App authentication for organization-level access
  • Returns structured data with runner information: id, name, OS, status, busy state, and labels
  • Handles error cases such as missing permissions and invalid organizations
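
As a rough illustration of the response shaping and error handling described above, the sketch below maps one runner object from GitHub's list-runners response to the fields the endpoint returns. The names `RunnerInfo`, `toRunnerInfo`, and `describeApiError` are hypothetical and not taken from the PR; the error messages and the choice to treat an untyped label as custom are assumptions as well.

```typescript
// One runner as returned by GitHub's
// GET /orgs/{org}/actions/runners endpoint (fields used here only).
interface GitHubRunner {
  id: number;
  name: string;
  os: string;
  status: string; // "online" or "offline"
  busy: boolean;
  labels: { id?: number; name: string; type?: "read-only" | "custom" }[];
}

// Hypothetical shape of one entry in the /api/runners/[org] response.
interface RunnerInfo {
  id: number;
  name: string;
  os: string;
  status: string;
  busy: boolean;
  labels: { name: string; type: "read-only" | "custom" }[];
}

// Keep only the fields the page needs.
// Assumption: a label without a type is treated as a custom label.
function toRunnerInfo(raw: GitHubRunner): RunnerInfo {
  return {
    id: raw.id,
    name: raw.name,
    os: raw.os,
    status: raw.status,
    busy: raw.busy,
    labels: raw.labels.map((l) => ({ name: l.name, type: l.type ?? "custom" })),
  };
}

// Turn an HTTP status from the GitHub API into a readable message.
// The wording is illustrative only.
function describeApiError(status: number, org: string): string {
  if (status === 403) {
    return `Missing permission to list runners for "${org}"`;
  }
  if (status === 404) {
    return `Organization "${org}" not found or app not installed`;
  }
  return `GitHub API request failed with status ${status}`;
}
```

An API route could call `toRunnerInfo` on each element of the `runners` array in the GitHub response and fall back to `describeApiError` when the request fails.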

Runners Page (/runners/[org])

  • Clean, responsive interface matching GitHub's native runner listing UI
  • Data refreshes automatically every 30 seconds using SWR
  • Search and filter functionality by runner name, OS, or labels
  • Summary statistics showing total, online, and busy runner counts
  • Color-coded status indicators (🟢 Idle, 🟠 Busy, ⚫ Offline)
  • Label badges that distinguish between custom and read-only labels
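
The filtering, summary, and status logic listed above could look roughly like the following sketch. The helper names (`filterRunners`, `summarize`, `statusIndicator`) and the `Runner` type are hypothetical and do not come from the PR. Counting a runner as busy only while it is online is an assumption. A constant like `REFRESH_INTERVAL_MS` could be passed as SWR's `refreshInterval` option to get the 30-second refresh.

```typescript
interface Runner {
  id: number;
  name: string;
  os: string;
  status: "online" | "offline";
  busy: boolean;
  labels: { name: string; type: "read-only" | "custom" }[];
}

// Refresh period in milliseconds (30 seconds, as stated in the description).
const REFRESH_INTERVAL_MS = 30_000;

// Case-insensitive match against runner name, OS, or any label name.
function filterRunners(runners: Runner[], query: string): Runner[] {
  const q = query.trim().toLowerCase();
  if (q === "") {
    return runners;
  }
  return runners.filter(
    (r) =>
      r.name.toLowerCase().includes(q) ||
      r.os.toLowerCase().includes(q) ||
      r.labels.some((l) => l.name.toLowerCase().includes(q))
  );
}

// Total, online, and busy counts for the summary row.
// Assumption: only online runners are counted as busy.
function summarize(runners: Runner[]): { total: number; online: number; busy: number } {
  const online = runners.filter((r) => r.status === "online");
  return {
    total: runners.length,
    online: online.length,
    busy: online.filter((r) => r.busy).length,
  };
}

// Status text matching the color-coded indicators in the list above.
function statusIndicator(r: Runner): string {
  if (r.status !== "online") {
    return "⚫ Offline";
  }
  return r.busy ? "🟠 Busy" : "🟢 Idle";
}
```

Keeping these as pure functions, separate from the React component, makes them easy to unit test against the mock data.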

UI/UX Features

  • GitHub-style table layout with hover effects
  • Full light/dark theme compatibility using existing CSS custom properties
  • Responsive design that works on mobile devices
  • TypeScript type safety throughout
  • Follows existing TorchCI component patterns and styling

Screenshots

Light mode with search filtering: [screenshot: Light Mode]

Dark mode: [screenshot: Dark Mode]

Usage

Once deployed with proper GitHub App credentials:

  • Visit /runners/pytorch to monitor PyTorch organization runners
  • Use the search box to filter runners by name, OS, or labels
  • View periodically refreshed runner status and metadata
  • Toggle between light/dark themes

The implementation includes a demo endpoint (/runners/demo) with mock data for testing the UI without requiring GitHub credentials.

Technical Implementation

The page uses the existing getOctokit() pattern but extends it with a new getOctokitForOrg() function that authenticates at the organization level rather than repository level. This enables access to the organization's self-hosted runners API while maintaining the same security model used throughout the TorchCI application.
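
The body of getOctokitForOrg() is not shown in this description, so the following is only a sketch of how repository-level and organization-level authentication might differ: a GitHub App installation can be looked up either per repository or per organization, and these are separate REST paths. The helper names `installationLookupPath` and `orgRunnersPath` and the `InstallationTarget` type are hypothetical.

```typescript
// What the GitHub App installation is being resolved for.
type InstallationTarget =
  | { kind: "repo"; owner: string; repo: string }
  | { kind: "org"; org: string };

// REST path used to find the app installation for a target.
// GitHub exposes both /repos/{owner}/{repo}/installation
// and /orgs/{org}/installation.
function installationLookupPath(target: InstallationTarget): string {
  if (target.kind === "repo") {
    const owner = encodeURIComponent(target.owner);
    const repo = encodeURIComponent(target.repo);
    return `/repos/${owner}/${repo}/installation`;
  }
  return `/orgs/${encodeURIComponent(target.org)}/installation`;
}

// REST path for listing an organization's self-hosted runners,
// queried with a client authenticated for that organization.
function orgRunnersPath(org: string): string {
  return `/orgs/${encodeURIComponent(org)}/actions/runners`;
}
```

Under this assumption, an organization-level helper would resolve the installation through the `/orgs/{org}/installation` path and then use the resulting installation credentials to call the runners endpoint.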

All code follows existing TypeScript patterns and uses CSS modules for styling, consistent with the current HUD architecture.


Copilot, Aug 19 '25 18:08