SU2
For certain numbers of MPI processes and certain meshes, mpirun crashes when it cannot find the inlet profile file
Describe the bug
If I start SU2_CFD with SPECIFIED_INLET_PROFILE= YES but the inlet profile file does not exist, an example_profile.dat file is written for you. When I start in parallel with mpirun, it crashes for certain numbers of processes. For me, it crashes with n=4, 6, 7, 9, 10, 12. I have no idea where to look for a solution...
[EDIT] This actually only happens for specific meshes; so far I could only reproduce it on my turbulent 90 degree pipe bend.
[EDIT] A mesh where this happens can be found here (90 degree pipe bend): https://github.com/bigfooted/su2cases/tree/master/validation/sudo_pipe_bend If you set SPECIFIED_INLET_PROFILE= YES, it crashes for me with mpirun and e.g. n=4.
Can you provide a way to reproduce the problem as the bug report template says? Or do we convert this to a discussion?
What is the common denominator when it crashes? Does the marker appear on only one rank, or on more than one? Or is there no pattern related to partitioning? Does the code work when the file exists? I.e., is this related to reading in general or exclusively to writing the example? If it is related to writing, what happens if you comment out the MPI calls (that gather coordinates etc.)?
Thank you @pcarruscag. I have placed a link to a test case. I have updated the title, because it looks like it is not a general problem. By the way, can I visualize the partitions in ParaView?
We have a rank output. I don't remember which group it is in, but a dry run should tell you.
Does the code work when the file exists? I.e., is this related to reading in general or exclusively to writing the example?
Yes, it only happens when writing the example_profile. It never leaves CSolver::LoadInletProfile, although it does reach the end of the routine.
It is possible that while the master rank is writing the template file and throwing the error, another rank starts trying to access the profile data (which was not read) and then segfaults. Try putting a barrier so that ranks don't escape while the template is being written.
if (profile_file.fail()) {
  MergeProfileMarkers();
  WriteMarkerProfileTemplate();
  // barrier here
} else {
  ReadMarkerProfile();
}
OK, thanks, SU2_MPI::Barrier definitely helps in narrowing down where the problem is. In MergeProfileMarkers, we get the number of profiles:
for (iMarker = 0; iMarker < config->GetnMarker_All(); iMarker++) {
  if (config->GetMarker_All_KindBC(iMarker) == markerType) {
    numberOfProfiles++;
Then downstream, we do:
if (rank == MASTER_NODE) {
  ...
  profileCoords.resize(numberOfProfiles);
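When it fails, it is because on the master rank the condition
config->GetMarker_All_KindBC(iMarker) == markerType
is never true, although it is true on another rank. The master's numberOfProfiles therefore does not count that marker, and profileCoords is resized with the master's local count only.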
completed by PR