nalu-wind
OpenFAST segfault in CPU builds
Some tests are failing in CPU builds with segfaults that appear to be from OpenFAST. Presumably these are all the same bug, although we can split up this issue if we find out that's wrong.
Unit tests:
- ActuatorBulkDiskFastTest.NGP_sweptPointsPopulatedVaried
- ActuatorBulkFastTests.NGP_initializeActuatorBulk (note this test also fails in a CUDA build)
Reg tests:
- nrel5MWactuatorDisk
- nrel5MWactuatorLine
- nrel5MWactuatorLineAnisoGauss
- nrel5MWactuatorLineFllc
- nrel5MWadvActLine
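To re-run only the two failing unit tests while debugging, the names above can be joined into a single GoogleTest filter. This is a minimal sketch: it assumes the unit tests are GoogleTest-based (the log output in this thread has that format), and `gtest_filter_arg` is a hypothetical helper name. The unit-test executable and MPI launcher are left out because they depend on the local build.

```python
# Failing unit tests, copied from the list in this issue.
failing_unit_tests = [
    "ActuatorBulkDiskFastTest.NGP_sweptPointsPopulatedVaried",
    "ActuatorBulkFastTests.NGP_initializeActuatorBulk",
]


def gtest_filter_arg(tests):
    """Build a --gtest_filter argument; GoogleTest separates patterns with ':'."""
    return "--gtest_filter=" + ":".join(tests)


# Pass the printed argument to the unit-test executable of your build.
print(gtest_filter_arg(failing_unit_tests))
```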
@psakievich were you looking at this?
This appears to just be an issue with [email protected]. @rafmudaf have you had a chance to look at this yet?
@psakievich No, I haven't seen this. Can you point me to the test logs?
@rafmudaf see here: https://my.cdash.org/viewTest.php?onlyfailed&buildid=2173766
This is the relevant section from one of the failing tests -- unitTest1 (all are similar):
[----------] 3 tests from ActuatorFunctorFastTests
[ RUN ] ActuatorFunctorFastTests.NGP_runAssignVelAndComputeForces
**************************************************************************************************
OpenFAST
Copyright (C) 2022 National Renewable Energy Laboratory
Copyright (C) 2022 Envision Energy USA LTD
This program is licensed under Apache License Version 2.0 and comes with ABSOLUTELY NO WARRANTY.
See the "LICENSE" file distributed with this software for details.
**************************************************************************************************
OpenFAST--128-NOTFOUND
Compile Info:
- Compiler: GCC version 9.3.0
- Architecture: 64 bit
- Precision: double
- OpenMP: No
- Date: Jun 1 2022
- Time: 14:38:44
Execution Info:
- Date: 06/03/2022
- Time: 00:58:17-0600
OpenFAST input file heading:
FAST Certification Test #01: NREL 5.0 MW Baseline Wind Turbine (Onshore)
Running ElastoDyn.
Nodal outputs section of ElastoDyn input file not found or improperly formatted.
Running AeroDyn.
Warning: Turning off Unsteady Aerodynamics because UA parameters are not included in airfoil
(airfoil has likely has constant polars). (node 1, blade 1)
Warning: Turning off Unsteady Aerodynamics because UA parameters are not included in airfoil
(airfoil has likely has constant polars). (node 1, blade 2)
Warning: Turning off Unsteady Aerodynamics because UA parameters are not included in airfoil
(airfoil has likely has constant polars). (node 1, blade 3)
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 12201 RUNNING AT rhodes.hpc.nrel.gov
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
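For anyone scanning similar logs, the "EXIT CODE: 11" above corresponds to signal 11, which is SIGSEGV on Linux. The sketch below shows one way to translate such a code into a signal name. `describe_exit` is a hypothetical helper, not part of nalu-wind or OpenFAST, and it assumes the reported code is a raw signal number (or its negative, as Python's `subprocess` reports it).

```python
import signal


def describe_exit(code):
    """Return the signal name matching an exit code, if there is one."""
    try:
        # abs() also covers the negative return codes that subprocess
        # uses for a child process killed by a signal.
        return signal.Signals(abs(code)).name
    except ValueError:
        return f"exit code {code} (no matching signal)"


print(describe_exit(11))
```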
@rafmudaf do you have further thoughts on this?
I built OpenFAST on my laptop using apple-clang and gcc@11 and did not see the segfaults, so the cause is likely not the input deck as I had suspected. I will try to update the input decks anyway.
See https://github.com/OpenFAST/openfast/pull/1227
After that OpenFAST PR and #1023 were merged, we seem to be down to segfaults only on clang on the SNL dashboard. @psakievich these segfaults don't show up on the NREL dashboard. The main differences between the two are:
- SNL build is release, NREL build is debug
- SNL uses Clang 12.0.1, NREL uses Clang 10.0.0
- SNL uses trilinos@develop, NREL uses trilinos@stable
Okay, I will try to get back to this soon.
@tasmith4 I'm unable to reproduce this locally on ascicgpu22. I'm wondering if the [email protected] build is segfaulting because that openfast build didn't get updated after we added the patch?
I ran the unit tests and some of the regression tests with a debug build and have not hit any segfaults. It seems rather suspicious to me that it is failing so consistently and across every OpenFAST test, while no other compiler on either of the dashboards sees the segfault.