PostTrainBench
PostTrainBench measures how well CLI agents such as Claude Code or Codex CLI can post-train base LLMs on a single H100 GPU within 10 hours.