flang
flang copied to clipboard
521.wrf_r segfaults when built with -flto=full -O2
After upgrading to LLVM 17, the 521.wrf_r benchmark in SPEC CPU2017 is miscompiled at O3 with LTO enabled. Examination of the crash site shows that a large amount of code that used to be inlined into wrf_init
(via alloc_and_configure_domain
) has been deleted, leaving the head_grid
pointer uninitialized.
The root cause of this problem is that upstream LLVM had strengthened the pruning of unreachable code to handle undef
and poison
. Specifically, these three patches are found to be relevant:
- https://github.com/llvm/llvm-project/commit/1fc425380e
- https://github.com/llvm/llvm-project/commit/70aca7b122
- https://github.com/llvm/llvm-project/commit/41895843b5
Classic Flang leaves variables uninitialized by default. The corresponding IR values are therefore considered to contain undef
. In our downstream compiler, we have tested with initializing all variables to zeroes, and that successfully avoids the problem.
628.pop2_s is also affected by this problem.
As discussed at the call, I will follow up in this issue with information about which variables are left uninitialized in the source, and whether the problem occurs at -O2
/-O3
.
The problem does occur at -O2
and -O3
.
The compiler flags I used are:
-Mallocatable=03 -flto=full -O2 -mcpu=native -ffast-math -Mbyteswapio
The linker flags I used (with ld.lld
) are:
-flto=full -O2 -mcpu=native -ffast-math
The problematic undef
values that caused some basic blocks to be pruned correspond to the dummy arguments ipex
and jpex
in the function compute_memory_dims_rsl_lite
(module_dm.F90
, line 1088). This function is called by patch_domain_rsl_lite
, which passes a whole bunch of actual arguments, none of which is initialized before the call.
@pawosm-arm @shivaramaarao Could you check things on your side to see if you can reproduce the problem, or determine why you don't see the problem?
Hi @bryanpkc assuming the problem manifests itself at runtime, I need to ask, what input data set are you using? (test, ref, or something else). Also, with -mcpu=native being used at build time, what are the CPU features (namely, NEON vs SVE)?
@pawosm-arm I have confirmed that the problem can be reproduced without specifying any -mcpu=
option. The run-time segmentation fault happens during benchmark start-up; I am using the test
data set.
Been testing it all night in our CI, with your flags, with O3 and even with Ofast, and it didn't fail. It's time to do it manually, but bear in mind, building spec's wrf with LTO take ages, and I want to try two different builds of the compiler, so it may take days before I'd confirm anything.
Just noticed, our CI always adds -funroll-loops
to the explicitly specified set of flags, but I doubt it matters
...also we don't seem to use (or even build) lld, but it also should make no difference on what LTO does when optimizing the code
Should 628.pop2_s
also fail during benchmark start-up? It takes forever to execute it, so I guess it's not affected in my builds...
@pawosm-arm Can you confirm in the 521.wrf_r
source files that the actual arguments I mentioned are not initialized? I am using an older version of the benchmark (v1.0.5) and perhaps SPEC has changed the source since then?
[AMD Official Use Only - General]
I could reproduce the issue with -O2 and -flto. I need to still check why we don’t see the issue in our downstream compiler. Will update when I have additional information.
Regards, Shivaram
From: Bryan Chan @.> Sent: Friday, November 3, 2023 7:59 PM To: flang-compiler/flang @.> Cc: Rao, Shivarama @.>; Mention @.> Subject: Re: [flang-compiler/flang] 521.wrf_r segfaults when built with -flto=full -O2 (Issue #1429)
Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
@pawosm-armhttps://github.com/pawosm-arm Can you confirm in the 521.wrf_r source files that the actual arguments I mentioned are not initialized? I am using an older version of the benchmark (v1.0.5) and perhaps SPEC has changed the source since then?
— Reply to this email directly, view it on GitHubhttps://github.com/flang-compiler/flang/issues/1429#issuecomment-1792536289, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AN4MBMCOJCP4VSWYFXTVJHDYCT5SNAVCNFSM6AAAAAA6ZJZ3WOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTOOJSGUZTMMRYHE. You are receiving this because you were mentioned.Message ID: @.@.>>
@pawosm-arm Can you confirm in the
521.wrf_r
source files that the actual arguments I mentioned are not initialized? I am using an older version of the benchmark (v1.0.5) and perhaps SPEC has changed the source since then?
I've got two versions here: 1.0.1 and 1.1.5. The module_dm.F90
file is exactly the same in both. These params are not initialized at call site, but they don't have to, they have all INTENT(OUT) in the calleee, so if any of those variables is used below the call site, the compiler has no liberty in removing any code that contributes to the value they're having assigned.
Since I can't reproduce the problem with the most recent release of our compiler, I'll try to build spec2k17's wrf with vanilla classic.
Since I can't reproduce the problem with the most recent release of our compiler, I'll try to build spec2k17's wrf with vanilla classic.
Indeed, wrf fails (with segfault) right after start when built like that (spec2k17 ver. 1.1.5), but pop2 doesn't. Note that stack size was set to unlimited (this is to prevent usual reason for wrf's segfaulting)
As I mentioned above, our downstream compiler has an option to zero-initialize variables automatically, so we have a workaround, which we may be able to upstream. gfortran has a similar set of options. Oracle's Fortran compiler has an -xcheck=init_local
option that initializes variables to trapping values, for debugging purpose. ifort has a similar -check uninit
option.
I had a quick look at the language standard. While it discusses definition and undefinition of variables, it does not specify any default initialization behaviour, and uses of uninitialized variables are undefined behaviour. From what I can see in online discussions, programmers are encouraged to define variables explicitly.
What should be our next step? Any opinion? @pawosm-arm @shivaramaarao
I'm slightly confused. Doesn’t classic-flang already zero-initialize variables (at least globals/module variables)?
I think the global/module variables are allocated in the BSS section and zero-initialized only as a side effect. Local variables that are not explicitly initialized are left undefined.