fpm icon indicating copy to clipboard operation
fpm copied to clipboard

Compiling with NAG fails / binary is not usable

Open awvwgk opened this issue 3 years ago • 6 comments

Description

Compiling fpm with the NAG Fortran compiler currently fails.

  • [x] work around compiler error encountered in NAG for fpm_command_line.f90 (https://github.com/fortran-lang/fpm/pull/753)
  • [x] update TOML Fortran to include recursive attribute in serializer (https://github.com/toml-f/toml-f/pull/113)
  • [x] fix usage of optional argument when not present (https://github.com/fortran-lang/fpm/pull/753)
  • [ ] fix error in test suite for namelist IO
  • [ ] fix signed integer overflow error in file checksum
  • [ ] correctly resolve module duplicates
Test compilation still fails at namelist IO
cli_test.f90                           failed.
[   5%]Compiling...
NAG Fortran Compiler Release 7.1(Hanzomon) Build 7111
Warning: test/cli_test/cli_test.f90, line 196: Unused dummy variable NAME
Warning: test/cli_test/cli_test.f90, line 250: CMD_BUILD explicitly imported into PARSE but not used
Warning: test/cli_test/cli_test.f90, line 250: CMD_CLEAN explicitly imported into PARSE but not used
Warning: test/cli_test/cli_test.f90, line 250: CMD_INSTALL explicitly imported into PARSE but not used
Warning: test/cli_test/cli_test.f90, line 250: CMD_NEW explicitly imported into PARSE but not used
Warning: test/cli_test/cli_test.f90, line 250: CMD_RUN explicitly imported into PARSE but not used
[NAG Fortran Compiler normal termination, 6 warnings]
test/cli_test/cli_test.f90: In function ‘main_IP_parse’:
test/cli_test/cli_test.f90:246:39: error: ‘host’ undeclared (first use in this function); did you mean ‘cosl’?
 write(lun,nml=act_cli,delim='quote')
                                       ^
                                       cosl
test/cli_test/cli_test.f90:246:39: note: each undeclared identifier is reported only once for each function it appears in

Neither the symbol host nor cosl are present in the current scope or any scope of the file.

Runtime error due to integer overflow when calculating the file hash
fpm run --compiler nagfor -- build
...
Runtime Error: ./src/fpm_strings.f90, line 198: INTEGER(int64) overflow for 16777619 * 36342609442804091
Program terminated by fatal error
./src/fpm_strings.f90, line 198: Error occurred in FPM_STRINGS:FNV_1A_CHAR
./src/fpm_strings.f90, line 212: Called by FPM_STRINGS:FNV_1A_STRING_T
./src/fpm_source_parsing.f90, line 483: Called by FPM_SOURCE_PARSING:PARSE_C_SOURCE
./src/fpm_sources.f90, line 41: Called by FPM_SOURCES:PARSE_SOURCE
./src/fpm_sources.f90, line 99: Called by FPM_SOURCES:ADD_SOURCES_FROM_DIR
./src/fpm.f90, line 131: Called by FPM:BUILD_MODEL
./src/fpm.f90, line 296: Called by FPM:CMD_BUILD
app/main.f90, line 72: Called by MAIN

I think we actually want an overflow and wrap around here since we are using a signed integer to store a byte sequence.

Erroneous duplicated modules found
fpm run --profile release --compiler nagfor -- run
Project is up to date
 Warning: Module test_filesystem in test/fpm_test/test_filesystem.f90 is a duplicate
 Warning: Module test_installer in test/fpm_test/test_installer.f90 is a duplicate
 Warning: Module test_toml in test/fpm_test/test_toml.f90 is a duplicate
 Warning: Module testsuite in test/fpm_test/testsuite.f90 is a duplicate
 Warning: Module test_backend in test/fpm_test/test_backend.f90 is a duplicate
 Warning: Module test_manifest in test/fpm_test/test_manifest.f90 is a duplicate
 Warning: Module test_module_dependencies in test/fpm_test/test_module_dependencies.f90 is a duplicate
 Warning: Module test_package_dependencies in test/fpm_test/test_package_dependencies.f90 is a duplicate
 Warning: Module test_source_parsing in test/fpm_test/test_source_parsing.f90 is a duplicate
 Warning: Module test_versioning in test/fpm_test/test_versioning.f90 is a duplicate
<ERROR>*build_model*:Error: One or more duplicate module names found.
STOP: 1

Looks like the NAG compiled binary fails to correctly resolve the module names.

Expected Behaviour

NAG compiled fpm can build fpm.

Version of fpm

0.6.0

Platform and Architecture

Linux

Additional Information

cc @rouson

Related:

  • #752
  • #753

awvwgk avatar Sep 15 '22 12:09 awvwgk

I think we actually want an overflow and wrap around here since we are using a signed integer to store a byte sequence.

Doesn't wrap around fit into the category of unspecified behavior? A processor which uses some other representation than two's complement might not perform wrap around.

ivan-pi avatar Sep 15 '22 22:09 ivan-pi

The integer overflow error at runtime seems to be triggered by the debug NAG compiler flag

https://github.com/fortran-lang/fpm/blob/bb449174631be989fae08532de31e94bac5c5ac3/src/fpm_compiler.f90#L179

which also includes the option:

intovf | (check for integer overflow),

ivan-pi avatar Sep 15 '22 22:09 ivan-pi

The first error looks mysterious. Is the error test/cli_test/cli_test.f90: In function ‘main_IP_parse’: occuring at compile time or when running the test?

ivan-pi avatar Sep 15 '22 22:09 ivan-pi

A nice summary of the wraparound behavior in practice, and the flags needed to allow it, are given in the stdlib_hash module specs (thanks to @wclodius2):

In order to properly run the hash functions, the compilers must use two's complement integers, and be able to execute them with wraparound semantics and no integer overflow exceptions. Current Fortran 2003+ compilers solely use two's complement integers, and appear to be able to turn off overflow detection, so the modules use signed integer arithmetic. For that reason trapping on signed arithmetic must be disabled. The command line flags to disable overflow detection for compilers implementing submodules are summarized in the table below. Note that FLANG, gfortran (since version 10), ifort, and NAG all default to integer overflow wrapping.

Compiler Legal flag Illegal flag Default
ARM Fortran NA? NA? overflow wrapping?
Cray Fortran NA? NA? overflow wrapping?
FLANG/PGI -fwrapv -ftrapv -fwrapv
gfortran -fwrapv -ftrapv -fwrapv
IBM Fortran NA? NA? overflow wrapping?
ifort NA? NA? overflow wrapping
NAG Fortran -C=none -C=intovf -C=none
NEC Fortran NA? NA? overflow wrapping?
NVIDIA Fortran NA? NA? overflow wrapping?

In order to be independent of the overflow behavior, would one need to implement his own integer multiplication?

ivan-pi avatar Sep 16 '22 07:09 ivan-pi

Following some of the resources given below,

I've tried replacing multiplication with a sum of left-shifts at src/fpm_strings.f90#L198:

!> Hash a character(*) string of default kind
pure function fnv_1a_char(input, seed) result(hash)
    use, intrinsic :: iso_fortran_env, only: int64
    character(*), intent(in) :: input
    integer(int64), intent(in), optional :: seed
    integer(int64) :: hash

    integer :: i
    integer(int64), parameter :: FNV_OFFSET_32 = 2166136261_int64
    integer(int64), parameter :: FNV_PRIME_32 = 16777619_int64

    if (present(seed)) then
        hash = seed
    else
        hash = FNV_OFFSET_32
    end if

    do i=1,len(input)
#if NOOVERFLOW
        hash = ieor(hash,iachar(input(i:i),int64))
        hash = hash + lshift(hash,1) + lshift(hash,4) + lshift(hash,7) + &
               lshift(hash,8) + lshift(hash,24)
#else
        hash = ieor(hash,iachar(input(i:i),int64)) * FNV_PRIME_32
#endif
    end do

end function fnv_1a_char

I'm not sure how to specify a test input which would trigger an overflow. If I understand this correctly, with gfortran you can add the flags -DNOOVERFLOW -ftrapv and there shouldn't be any overflow?

PS: Is it okay to be using the _32 prime and offset, when the hash output is int64? According to the FNV home page linked above, the 64 bit FNV_prime = 240 + 28 + 0xb3 = 1099511628211.

Edit: I figured out an example with the integer appearing in the failed test (https://godbolt.org/z/hdnvdj16W):

! hash_test.F90
use, intrinsic :: iso_fortran_env, only: int64
integer(int64) :: hash, overflow
integer(int64), parameter :: FNV_PRIME_32 = 16777619_int64

hash = 36342609442804091_int64
! 16777619*a := a + 2*a + 16*a + 128*a + 256*a + 16777216*a
! uses series of "add" and "sal" instructions
hash = hash + lshift(hash,1) + lshift(hash,4) + lshift(hash,7) + &
              lshift(hash,8) + lshift(hash,24)
print *, hash

overflow = 36342609442804091_int64
overflow = overflow * FNV_PRIME_32  ! uses "imul" instruction
print *, overflow
end
$ gfortran hash_test.F90   # overflow followed by wraparound
$ ./a.out
  3776084773811324065
  3776084773811324065
$ gfortran hash_test.F90 -ftrapv   # trap overflow
$ ./a.out
  3776084773811324065

Program received signal SIGABRT: Process abort signal.

Backtrace for this error:
#0  0x1027d74ee
#1  0x1027d66fd
#2  0x7ff80f549dfc
#3  0x7ff80f4fe00f
#4  0x7ff80f5341ff
#5  0x7ff80f47fd24
#6  0x102400e10
#7  0x102400d1c
#8  0x102400dd1
Abort trap: 6

The LSHIFT based "multiplication" (also known as "shift and add") works even when overflow trapping is turned on. It still relies on two's complement integer storage. As a side note, replacing -ftrapv with the flag -fsanitize=signed-integer-overflow gives a nicer error:

$ gfortran hash_test.F90 -fsanitize=signed-integer-overflow
$ ./a.out
  3776084773811324065
hash_test.F90:11:34: runtime error: signed integer overflow: 36342609442804091 * 16777619 cannot be represented in type 'integer(kind=8)'
  3776084773811324065

ivan-pi avatar Sep 16 '22 09:09 ivan-pi