NO-ISSUE add prow job analyzing claude agent

Open copejon opened this issue 1 month ago • 8 comments

Init agent that is capable of analyzing CI failures in prow. The agent's workflow focuses on a methodical approach to failure analysis, following these steps:

  1. Create a list of errors and failures found in the build.log.
  2. Characterize each error and failure based on context from the build log and use this to determine if the error is an infra issue, microshift runtime error, or a legitimate test failure.
  3. Investigate further depending on the nature of the error:
    • For legitimate test errors, analyze the test logs.
    • For runtime errors, download and analyze the sos report.
  4. Produce a report based on the findings of step 3.
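
The steps above can be sketched in Python as follows. This is only an illustration of the triage flow, not the agent's implementation: the agent characterizes errors from log context, whereas the regular expressions, the `Category` enum, and the function names here are all hypothetical stand-ins. The description does not name a follow-up for infra issues, so the sketch simply records that.

```python
import re
from enum import Enum
from typing import List, Tuple


class Category(Enum):
    INFRA = "infra issue"
    RUNTIME = "microshift runtime error"
    TEST = "legitimate test failure"


# Hypothetical keyword heuristics, assumed for illustration only.
ERROR_LINE = re.compile(r"\b(error|fail(ed|ure)?)\b", re.I)
INFRA_HINTS = re.compile(r"timed out|connection refused|no space left|quota", re.I)
RUNTIME_HINTS = re.compile(r"microshift.*(panic|crash|failed to start)", re.I)

# Step 3: the follow-up investigation depends on the category.
NEXT_STEP = {
    Category.INFRA: "no further investigation specified",
    Category.RUNTIME: "download and analyze the sos report",
    Category.TEST: "analyze the test logs",
}


def list_errors(build_log: str) -> List[str]:
    """Step 1: collect lines of the build log that look like errors or failures."""
    return [line.strip() for line in build_log.splitlines() if ERROR_LINE.search(line)]


def characterize(line: str) -> Category:
    """Step 2: classify one error line; anything unmatched is treated as a test failure."""
    if RUNTIME_HINTS.search(line):
        return Category.RUNTIME
    if INFRA_HINTS.search(line):
        return Category.INFRA
    return Category.TEST


def report(build_log: str) -> List[Tuple[str, str, str]]:
    """Step 4: one (line, category, next step) row per error found."""
    rows = []
    for line in list_errors(build_log):
        category = characterize(line)
        rows.append((line, category.value, NEXT_STEP[category]))
    return rows
```

Calling `report()` on the text of a build.log returns one row per detected error, which a human or a later step could render as the final report.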

To invoke the agent, pass the prow job's url to claude, e.g.

$ claude https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_microshift/5596/pull-ci-openshift-microshift-main-e2e-aws-tests-arm/1995881118070476800
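
For readers unfamiliar with the URL, here is a rough sketch of how its path segments could be pulled apart. The layout (`view/gs/<bucket>/pr-logs/pull/<org>_<repo>/<PR>/<job>/<build ID>`) is inferred only from the example above and is an assumption; other job types, such as periodics, may use a different layout. `ProwJobRef` and `parse_pr_job_url` are hypothetical names, not part of the agent or of prow.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class ProwJobRef:
    bucket: str
    org_repo: str
    pr_number: str
    job_name: str
    build_id: str


def parse_pr_job_url(url: str) -> ProwJobRef:
    """Split a PR job URL into its parts, assuming the layout of the example above."""
    parts = urlparse(url).path.strip("/").split("/")
    # Assumed layout: view/gs/<bucket>/pr-logs/pull/<org_repo>/<pr>/<job>/<build>
    if (
        len(parts) != 9
        or parts[:2] != ["view", "gs"]
        or parts[3:5] != ["pr-logs", "pull"]
    ):
        raise ValueError(f"unrecognized PR job URL layout: {url}")
    return ProwJobRef(
        bucket=parts[2],
        org_repo=parts[5],
        pr_number=parts[6],
        job_name=parts[7],
        build_id=parts[8],
    )
```

Applied to the URL above, this yields PR number `5596` and the job name `pull-ci-openshift-microshift-main-e2e-aws-tests-arm`.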

There's plenty of room for improvement here. For future contributions, consider:

  • Delegation: use sub-agents to perform specialized, lower-level analysis (sos-report agent, microshift source code agent, etc.). This is especially useful for scoping each agent's context to its task.
  • Additional workflow steps: e.g. after identifying a legitimate test failure, analyze the microshift code base (or the diff, for PRs) to determine where the error was introduced.
  • Honing suggested remediations: in this PR, the agent is not given much direction on HOW to fix errors, so it bases its recommendations only on the context it is given.

copejon avatar Nov 24 '25 16:11 copejon

Skipping CI for Draft Pull Request. If you want CI signal for your change, please convert it to an actual PR. You can still manually trigger a test run with /test all

openshift-ci[bot] avatar Nov 24 '25 16:11 openshift-ci[bot]

/test test-unit
/test verify

copejon avatar Nov 24 '25 16:11 copejon

@copejon hey, should this command be changed? $ claude https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_microshift/9999/pull-ci-openshift-microshift-release-4.20-metal-periodic-test/1234567894561234156

I tried to run it using @openshift-ci-analysis <job_url_name>

kasturinarra avatar Nov 25 '25 10:11 kasturinarra

/lgtm

kasturinarra avatar Nov 26 '25 07:11 kasturinarra

@kasturinarra That's my fault. The url in the description isn't for a real job. Will fix!

Also, this is structured as an agent. Just passing the url to claude (as long as claude is run in the project root) is enough to trigger the agent.

copejon avatar Dec 02 '25 17:12 copejon

/lgtm
/verified by manual-testing

ggiguash avatar Dec 15 '25 07:12 ggiguash

@ggiguash: This PR has been marked as verified by manual-testing.

In response to this:

/lgtm /verified by manual-testing

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci-robot avatar Dec 15 '25 07:12 openshift-ci-robot

/retitle NO-ISSUE: Claude Agent for analyzing prow jobs

ggiguash avatar Dec 15 '25 07:12 ggiguash

@copejon: This pull request explicitly references no jira issue.


openshift-ci-robot avatar Dec 15 '25 07:12 openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: copejon, ggiguash, kasturinarra

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • ~~OWNERS~~ [copejon,ggiguash,kasturinarra]

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

openshift-ci[bot] avatar Dec 15 '25 07:12 openshift-ci[bot]

@copejon: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci[bot] avatar Dec 15 '25 07:12 openshift-ci[bot]