Expose more information in clusterloader2 logs

Open wojtek-t opened this issue 6 years ago • 6 comments

There are a couple of things that we definitely need:

  • [x] more state about pods from a given controlling object (number of pending, waiting, checking if something was deleted, etc.). Mostly copying this logic: https://github.com/kubernetes/kubernetes/blob/master/test/utils/runners.go#L803
  • [x] pod-startup-time latency should output something similar to what we currently do (for debugging purposes)
  • [x] show more clearly where a given test finished:
W1112 13:18:48.029] I1112 13:18:48.029322    9960 clusterloader.go:127] Test testing/density/config.yaml ran successfully!"

is not very visible in those logs

  • [x] We are currently printing information about nodes that is extremely helpful for debugging (this is currently part of density). It would be useful to add that too (it should probably be part of the initialization of cluster loader)
  • [ ] You need to audit logs in measurements - a bunch of glog calls should actually be real failures and fail the test at the end (though not immediately). I can imagine this as something like: https://github.com/kubernetes/kubernetes/issues/66239#issuecomment-405255089, but also as a special measurement that collects errors internally (glogging them when they happen) and at the end fails the test if any errors were reported (this should be simpler than a separate flakes.txt file).
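To make the last item more concrete, here is a minimal sketch of the "collect errors, log them as they happen, fail at the end" idea. `ErrorCollector`, `Report`, `Reportf` and `Result` are hypothetical names made up for illustration, not existing clusterloader2 APIs, and the standard `log` package stands in for glog so the snippet is self-contained.

```go
package main

import (
	"fmt"
	"log"
	"sync"
)

// ErrorCollector is a hypothetical helper (not an existing clusterloader2 type).
// It logs every reported error right away and remembers it for the final verdict.
type ErrorCollector struct {
	mu   sync.Mutex
	errs []error
}

// Report logs err immediately and stores it; nil errors are ignored.
func (c *ErrorCollector) Report(err error) {
	if err == nil {
		return
	}
	log.Printf("measurement error: %v", err)
	c.mu.Lock()
	defer c.mu.Unlock()
	c.errs = append(c.errs, err)
}

// Reportf is a convenience wrapper that formats the error message first.
func (c *ErrorCollector) Reportf(format string, args ...interface{}) {
	c.Report(fmt.Errorf(format, args...))
}

// Result returns nil when nothing was reported, and otherwise one error
// summarizing everything collected, so the test fails only at the end.
func (c *ErrorCollector) Result() error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if len(c.errs) == 0 {
		return nil
	}
	return fmt.Errorf("%d error(s) reported during the test: %v", len(c.errs), c.errs)
}

func main() {
	c := &ErrorCollector{}
	c.Reportf("pod %s did not become ready", "example-pod")
	c.Reportf("failed to list nodes: %s", "timeout")
	// The test keeps running after each report; the verdict comes last.
	if err := c.Result(); err != nil {
		fmt.Println("test failed:", err)
	}
}
```

Measurements would call `Report` wherever they currently only log, and the test runner would check `Result` once after the last step. Since everything is kept in memory, no separate flakes.txt file is needed.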

I guess there may be more, but let's start with those.
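For the first item in the list, a rough sketch of the kind of per-controller pod summary I have in mind. It is only loosely modeled on the linked runners.go logic. The `Pod`, `Phase` and `Summary` types and the `Summarize` function are simplified stand-ins invented for this example, not the real Kubernetes API types, and the bucket names are assumptions.

```go
package main

import "fmt"

// Phase is a simplified stand-in for a pod phase (assumption, not the real API type).
type Phase string

const (
	PhasePending   Phase = "Pending"
	PhaseRunning   Phase = "Running"
	PhaseSucceeded Phase = "Succeeded"
	PhaseFailed    Phase = "Failed"
)

// Pod holds only the fields this sketch needs (hypothetical type).
type Pod struct {
	Name     string
	Phase    Phase
	NodeName string // empty means the pod has not been scheduled yet
	Deleting bool   // true when the pod is being deleted
}

// Summary counts pods of one controlling object per state.
type Summary struct {
	Running, Pending, Waiting, Terminating, Inactive, Unknown int
}

// Summarize puts every pod into exactly one bucket.
func Summarize(pods []Pod) Summary {
	var s Summary
	for _, p := range pods {
		switch {
		case p.Deleting:
			s.Terminating++
		case p.Phase == PhaseRunning:
			s.Running++
		case p.Phase == PhasePending && p.NodeName == "":
			s.Waiting++ // pending and not yet assigned to a node
		case p.Phase == PhasePending:
			s.Pending++ // scheduled, but not running yet
		case p.Phase == PhaseSucceeded || p.Phase == PhaseFailed:
			s.Inactive++
		default:
			s.Unknown++
		}
	}
	return s
}

// String renders the summary as a single log-friendly line.
func (s Summary) String() string {
	return fmt.Sprintf("%d running, %d pending, %d waiting, %d terminating, %d inactive, %d unknown",
		s.Running, s.Pending, s.Waiting, s.Terminating, s.Inactive, s.Unknown)
}

func main() {
	pods := []Pod{
		{Name: "a", Phase: PhaseRunning, NodeName: "node-1"},
		{Name: "b", Phase: PhasePending},
		{Name: "c", Phase: PhasePending, NodeName: "node-1"},
	}
	fmt.Println("Pods:", Summarize(pods))
}
```

Logging one such line per controlling object on every wait iteration would show at a glance whether pods are stuck unscheduled, stuck starting, or being deleted.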

/assign @krzysied

wojtek-t • Nov 12 '18 14:11

@kubernetes/sig-scalability-bugs

wojtek-t • Nov 12 '18 14:11

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot • Feb 10 '19 15:02

/remove-lifecycle stale

@krzysied - what's the status of this?

wojtek-t • Feb 11 '19 07:02

@wojtek-t The first 4 points are done. The last one is partially done: no errors fail the test immediately; however, there is no flakes.txt file.

krzysied • Feb 11 '19 09:02

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot • May 15 '19 20:05

/remove-lifecycle stale
/lifecycle frozen

wojtek-t • May 16 '19 06:05