
Performance regressions on blake

Open jewatkins opened this issue 2 years ago • 8 comments

There are a couple of regressions here that we need to investigate: https://sandialabs.github.io/ali-perf-data/ali/blake_nightly_data/Ali_PerfTestsBlake_02_01_2023.html

jewatkins avatar Feb 01 '23 21:02 jewatkins

@jewatkins do you think it's doable/interesting to add the first line of the commit message to the overlay window that appears when hovering over a point in the perf history graph? I know we can take the printed sha and go see what it was, but it feels like it would be much more immediate to understand (to some extent) what happened at any point in time...

To be clear, I'm thinking of upgrading the overlay window from

Date: 2022-11-02
Albany commit: f6db667
Trilinos commit: 409aaf2
Mean: ...
Ratio: ...

to something like

Date: 2022-11-02
Albany commit: f6db667
    Correcting path in script.
Trilinos commit: 409aaf2
    Merge Pull Request #11210 from gsjaardema/Trilinos/SEACAS-Nemesis-cmake-fix
Mean: ...
Ratio: ...

bartgol avatar Feb 01 '23 22:02 bartgol

Yeah, we could probably do that, and it makes sense. Maybe even a link would be useful. The commit ids are read/stored from simulation output, but the Python script that reads/stores the data could probably query the commit id and also store the first-line description. I'll add it as an issue.
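
A minimal sketch of what that lookup could look like, assuming the script has a local clone of each repository available. The function names (`commit_subject`, `overlay_text`) and the dictionary keys are hypothetical and not taken from the actual perf-data scripts; the only external call is `git log -1 --format=%s <sha>`, which prints the subject line of a commit.

```python
import subprocess


def commit_subject(sha, repo_dir="."):
    """Return the first line of the commit message for `sha`, or None on failure.

    Assumes `repo_dir` is a local clone that contains the commit (hypothetical setup).
    """
    try:
        result = subprocess.run(
            ["git", "-C", repo_dir, "log", "-1", "--format=%s", sha],
            capture_output=True,
            text=True,
            check=True,
        )
    except (subprocess.CalledProcessError, FileNotFoundError):
        # Unknown sha, missing clone, or git not installed: fall back to sha only.
        return None
    return result.stdout.strip() or None


def overlay_text(entry):
    """Build the overlay text, adding an indented subject line when one is stored.

    `entry` is a hypothetical record with keys such as 'date', 'albany_sha',
    'albany_subject', 'trilinos_sha', 'trilinos_subject'.
    """
    lines = [f"Date: {entry['date']}"]
    for name in ("Albany", "Trilinos"):
        key = name.lower()
        lines.append(f"{name} commit: {entry[key + '_sha']}")
        subject = entry.get(key + "_subject")
        if subject:
            lines.append(f"    {subject}")
    return "\n".join(lines)
```

Storing the subject next to the sha when the data is recorded, rather than querying it when the page is rendered, would keep the generated HTML independent of any local clone.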

jewatkins avatar Feb 01 '23 22:02 jewatkins

Do you guys know why some performance tests are not being run on Blake? https://sems-cdash-son.sandia.gov/cdash/viewTest.php?onlynotrun&buildid=45031. The error is a failed dependency: https://sems-cdash-son.sandia.gov/cdash/test/3264091.

ikalash avatar Feb 08 '23 19:02 ikalash

It looks like the tests that are creating the populated mesh are crashing. E.g. this

bartgol avatar Feb 08 '23 20:02 bartgol

Seems like an MPI error. It could be a temporary issue of the testbed.

bartgol avatar Feb 08 '23 20:02 bartgol

Hmm, this same error showed up before (Issue #904). I wasn't able to replicate it when it showed up originally and I think it went away the next day. Seems like some sort of system issue.

mcarlson801 avatar Feb 08 '23 20:02 mcarlson801

Hmmm could be a system issue. I will keep an eye on it. Thanks!

ikalash avatar Feb 08 '23 20:02 ikalash

This is a different issue. I will create a new issue for the failing tests that need to be reconfigured.

jewatkins avatar Feb 09 '23 17:02 jewatkins