Compiling with NAG fails / binary is not usable
Description
Compiling fpm with the NAG Fortran compiler currently fails.
- [x] work around compiler error encountered in NAG for
fpm_command_line.f90(https://github.com/fortran-lang/fpm/pull/753) - [x] update TOML Fortran to include
recursiveattribute in serializer (https://github.com/toml-f/toml-f/pull/113) - [x] fix usage of optional argument when not present (https://github.com/fortran-lang/fpm/pull/753)
- [ ] fix error in test suite for namelist IO
- [ ] fix signed integer overflow error in file checksum
- [ ] correctly resolve module duplicates
Test compilation still fails at namelist IO
cli_test.f90 failed.
[ 5%]Compiling...
NAG Fortran Compiler Release 7.1(Hanzomon) Build 7111
Warning: test/cli_test/cli_test.f90, line 196: Unused dummy variable NAME
Warning: test/cli_test/cli_test.f90, line 250: CMD_BUILD explicitly imported into PARSE but not used
Warning: test/cli_test/cli_test.f90, line 250: CMD_CLEAN explicitly imported into PARSE but not used
Warning: test/cli_test/cli_test.f90, line 250: CMD_INSTALL explicitly imported into PARSE but not used
Warning: test/cli_test/cli_test.f90, line 250: CMD_NEW explicitly imported into PARSE but not used
Warning: test/cli_test/cli_test.f90, line 250: CMD_RUN explicitly imported into PARSE but not used
[NAG Fortran Compiler normal termination, 6 warnings]
test/cli_test/cli_test.f90: In function ‘main_IP_parse’:
test/cli_test/cli_test.f90:246:39: error: ‘host’ undeclared (first use in this function); did you mean ‘cosl’?
write(lun,nml=act_cli,delim='quote')
^
cosl
test/cli_test/cli_test.f90:246:39: note: each undeclared identifier is reported only once for each function it appears in
Neither the symbol host nor cosl are present in the current scope or any scope of the file.
Runtime error due to integer overflow when calculating the file hash
fpm run --compiler nagfor -- build
...
Runtime Error: ./src/fpm_strings.f90, line 198: INTEGER(int64) overflow for 16777619 * 36342609442804091
Program terminated by fatal error
./src/fpm_strings.f90, line 198: Error occurred in FPM_STRINGS:FNV_1A_CHAR
./src/fpm_strings.f90, line 212: Called by FPM_STRINGS:FNV_1A_STRING_T
./src/fpm_source_parsing.f90, line 483: Called by FPM_SOURCE_PARSING:PARSE_C_SOURCE
./src/fpm_sources.f90, line 41: Called by FPM_SOURCES:PARSE_SOURCE
./src/fpm_sources.f90, line 99: Called by FPM_SOURCES:ADD_SOURCES_FROM_DIR
./src/fpm.f90, line 131: Called by FPM:BUILD_MODEL
./src/fpm.f90, line 296: Called by FPM:CMD_BUILD
app/main.f90, line 72: Called by MAIN
I think we actually want an overflow and wrap around here since we are using a signed integer to store a byte sequence.
Erroneous duplicated modules found
fpm run --profile release --compiler nagfor -- run
Project is up to date
Warning: Module test_filesystem in test/fpm_test/test_filesystem.f90 is a duplicate
Warning: Module test_installer in test/fpm_test/test_installer.f90 is a duplicate
Warning: Module test_toml in test/fpm_test/test_toml.f90 is a duplicate
Warning: Module testsuite in test/fpm_test/testsuite.f90 is a duplicate
Warning: Module test_backend in test/fpm_test/test_backend.f90 is a duplicate
Warning: Module test_manifest in test/fpm_test/test_manifest.f90 is a duplicate
Warning: Module test_module_dependencies in test/fpm_test/test_module_dependencies.f90 is a duplicate
Warning: Module test_package_dependencies in test/fpm_test/test_package_dependencies.f90 is a duplicate
Warning: Module test_source_parsing in test/fpm_test/test_source_parsing.f90 is a duplicate
Warning: Module test_versioning in test/fpm_test/test_versioning.f90 is a duplicate
<ERROR>*build_model*:Error: One or more duplicate module names found.
STOP: 1
Looks like the NAG compiled binary fails to correctly resolve the module names.
Expected Behaviour
NAG compiled fpm can build fpm.
Version of fpm
0.6.0
Platform and Architecture
Linux
Additional Information
cc @rouson
Related:
- #752
- #753
I think we actually want an overflow and wrap around here since we are using a signed integer to store a byte sequence.
Doesn't wrap around fit into the category of unspecified behavior? A processor which uses some other representation than two's complement might not perform wrap around.
The integer overflow error at runtime seems to be triggered by the debug NAG compiler flag
https://github.com/fortran-lang/fpm/blob/bb449174631be989fae08532de31e94bac5c5ac3/src/fpm_compiler.f90#L179
which also includes the option:
intovf | (check for integer overflow),
The first error looks mysterious. Is the error test/cli_test/cli_test.f90: In function ‘main_IP_parse’: occuring at compile time or when running the test?
A nice summary of the wraparound behavior in practice, and the flags needed to allow it, are given in the stdlib_hash module specs (thanks to @wclodius2):
In order to properly run the hash functions, the compilers must use two's complement integers, and be able to execute them with wraparound semantics and no integer overflow exceptions. Current Fortran 2003+ compilers solely use two's complement integers, and appear to be able to turn off overflow detection, so the modules use signed integer arithmetic. For that reason trapping on signed arithmetic must be disabled. The command line flags to disable overflow detection for compilers implementing submodules are summarized in the table below. Note that FLANG, gfortran (since version 10), ifort, and NAG all default to integer overflow wrapping.
| Compiler | Legal flag | Illegal flag | Default |
|---|---|---|---|
| ARM Fortran | NA? | NA? | overflow wrapping? |
| Cray Fortran | NA? | NA? | overflow wrapping? |
| FLANG/PGI | -fwrapv | -ftrapv | -fwrapv |
| gfortran | -fwrapv | -ftrapv | -fwrapv |
| IBM Fortran | NA? | NA? | overflow wrapping? |
| ifort | NA? | NA? | overflow wrapping |
| NAG Fortran | -C=none | -C=intovf | -C=none |
| NEC Fortran | NA? | NA? | overflow wrapping? |
| NVIDIA Fortran | NA? | NA? | overflow wrapping? |
In order to be independent of the overflow behavior, would one need to implement his own integer multiplication?
Following some of the resources given below,
I've tried replacing multiplication with a sum of left-shifts at src/fpm_strings.f90#L198:
!> Hash a character(*) string of default kind
pure function fnv_1a_char(input, seed) result(hash)
use, intrinsic :: iso_fortran_env, only: int64
character(*), intent(in) :: input
integer(int64), intent(in), optional :: seed
integer(int64) :: hash
integer :: i
integer(int64), parameter :: FNV_OFFSET_32 = 2166136261_int64
integer(int64), parameter :: FNV_PRIME_32 = 16777619_int64
if (present(seed)) then
hash = seed
else
hash = FNV_OFFSET_32
end if
do i=1,len(input)
#if NOOVERFLOW
hash = ieor(hash,iachar(input(i:i),int64))
hash = hash + lshift(hash,1) + lshift(hash,4) + lshift(hash,7) + &
lshift(hash,8) + lshift(hash,24)
#else
hash = ieor(hash,iachar(input(i:i),int64)) * FNV_PRIME_32
#endif
end do
end function fnv_1a_char
I'm not sure how to specify a test input which would trigger an overflow. If I understand this correctly, with gfortran you can add the flags -DNOOVERFLOW -ftrapv and there shouldn't be any overflow?
PS: Is it okay to be using the _32 prime and offset, when the hash output is int64? According to the FNV home page linked above, the 64 bit FNV_prime = 240 + 28 + 0xb3 = 1099511628211.
Edit: I figured out an example with the integer appearing in the failed test (https://godbolt.org/z/hdnvdj16W):
! hash_test.F90
use, intrinsic :: iso_fortran_env, only: int64
integer(int64) :: hash, overflow
integer(int64), parameter :: FNV_PRIME_32 = 16777619_int64
hash = 36342609442804091_int64
! 16777619*a := a + 2*a + 16*a + 128*a + 256*a + 16777216*a
! uses series of "add" and "sal" instructions
hash = hash + lshift(hash,1) + lshift(hash,4) + lshift(hash,7) + &
lshift(hash,8) + lshift(hash,24)
print *, hash
overflow = 36342609442804091_int64
overflow = overflow * FNV_PRIME_32 ! uses "imul" instruction
print *, overflow
end
$ gfortran hash_test.F90 # overflow followed by wraparound
$ ./a.out
3776084773811324065
3776084773811324065
$ gfortran hash_test.F90 -ftrapv # trap overflow
$ ./a.out
3776084773811324065
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
#0 0x1027d74ee
#1 0x1027d66fd
#2 0x7ff80f549dfc
#3 0x7ff80f4fe00f
#4 0x7ff80f5341ff
#5 0x7ff80f47fd24
#6 0x102400e10
#7 0x102400d1c
#8 0x102400dd1
Abort trap: 6
The LSHIFT based "multiplication" (also known as "shift and add") works even when overflow trapping is turned on. It still relies on two's complement integer storage. As a side note, replacing -ftrapv with the flag -fsanitize=signed-integer-overflow gives a nicer error:
$ gfortran hash_test.F90 -fsanitize=signed-integer-overflow
$ ./a.out
3776084773811324065
hash_test.F90:11:34: runtime error: signed integer overflow: 36342609442804091 * 16777619 cannot be represented in type 'integer(kind=8)'
3776084773811324065