Consider making `--realloc-lhs` the default
@difference-scheme reported at https://github.com/lfortran/lfortran/issues/2940#issuecomment-2970101356:
But in order to get the fixed behavior one needs to compile with either the option --realloc-lhs or --std=f23 (which includes the former). I can foresee this need for specifying an extra compiler option to become a major trouble-spot for compiling object-oriented code in the future.
Such code will almost certainly make extensive use of polymorphic assignments (like the one for object adder above) and having to provide an extra compiler option for this, that the user might be unaware of, will become the source of endless future complaints. I believe --realloc-lhs should be made the default behavior in LFortran.
We already make an exception for allocatable strings, so we can also make an exception for polymorphic assignments.
The main motivation for keeping --realloc-lhs off by default is performance for array assignments, and also that automatic LHS reallocation hides bugs (the user might think an array was pre-allocated correctly, but the compiler silently reallocates it). I think we should thus keep --realloc-lhs off by default; however, the following features should just work by default:
- allocatable strings (we already do this)
- allocatable polymorphic types (scalars, and possibly arrays as well)
Another idea is to always do automatic reallocation of LHS for scalars, but not arrays. That would be a simple rule that might cover the cases above.
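To make the polymorphic case concrete, here is a minimal sketch of the kind of assignment that should work without any extra option. The type names (`shape`, `circle`, `square`) and the `describe` helper are hypothetical and are not taken from the original report; the example relies only on standard Fortran 2008 intrinsic assignment to an allocatable polymorphic scalar.

```fortran
program poly_assign
    implicit none

    ! Hypothetical type hierarchy, for illustration only.
    type, abstract :: shape
    end type shape

    type, extends(shape) :: circle
        real :: r = 1.0
    end type circle

    type, extends(shape) :: square
        real :: a = 1.0
    end type square

    class(shape), allocatable :: s

    ! s is unallocated here; the assignment allocates it with dynamic type circle.
    s = circle(3.0)
    call describe(s)

    ! The dynamic type changes, so s has to be deallocated and reallocated as square.
    s = square(2.0)
    call describe(s)

contains

    subroutine describe(obj)
        class(shape), intent(in) :: obj
        select type (obj)
        type is (circle)
            print *, "circle, r =", obj%r
        type is (square)
            print *, "square, a =", obj%a
        end select
    end subroutine describe

end program poly_assign
```

Both assignments depend on automatic (re)allocation of the LHS, which is why requiring a flag for them is so visible in object-oriented code.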
In either case, we need two modes, selectable with --realloc-lhs and --no-realloc-lhs. By default we will start with --no-realloc-lhs, which leaves the door open to switching to --realloc-lhs later. The alternative is to make --realloc-lhs the default, but that closes the door to switching later, since it would break users' code, so we won't start with this.
No matter the default, both modes have to be supported. In --realloc-lhs, every LHS gets reallocated. In --no-realloc-lhs, we will still reallocate strings and scalar polymorphic types, but we will not reallocate arrays. It is still to be decided what to do with other allocatable scalars.
We also have to add checks for out-of-bounds access and for an incorrectly allocated LHS: https://github.com/lfortran/lfortran/issues/7891.
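As a rough illustration of what such a check amounts to, here is the same logic written by hand in user code. This is only a sketch: the `assign_checked` helper and its messages are hypothetical, and it says nothing about how the compiler would implement the check internally.

```fortran
program checked_assign
    implicit none
    integer, allocatable :: x(:)
    integer :: y(3)

    allocate(x(10))
    y = 3
    ! x has 10 elements and y has 3, so this call stops with an error.
    call assign_checked(x, y)

contains

    ! Hypothetical helper: assign rhs to lhs without ever reallocating lhs.
    subroutine assign_checked(lhs, rhs)
        integer, allocatable, intent(inout) :: lhs(:)
        integer, intent(in) :: rhs(:)

        if (.not. allocated(lhs)) error stop "LHS is not allocated"
        if (size(lhs) /= size(rhs)) error stop "LHS and RHS sizes differ"

        ! An array section on the LHS is not an allocatable variable,
        ! so this assignment never triggers reallocation.
        lhs(:) = rhs
    end subroutine assign_checked

end program checked_assign
```

In the --no-realloc-lhs mode, the idea is that the user gets a clear runtime error like this one instead of silent reallocation or memory corruption.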
An example of how GFortran handles the two modes:
$ cat expr2.f90
program expr2
implicit none
integer, allocatable :: x(:)
integer :: y(3)
allocate(x(10))
y = 3
x = y
end program
output:
$ gfortran -fcheck=all -g expr2.f90 && ./a.out
$ gfortran -fcheck=all -g -fno-realloc-lhs expr2.f90 && ./a.out
At line 7 of file expr2.f90
Fortran runtime error: Array bound mismatch for dimension 1 of array 'x' (10/3)
Error termination. Backtrace:
#0 0x100fefcc3 in ???
#1 0x100ff0b07 in ???
#2 0x100ff0ebf in ???
Could not print backtrace: DW_FORM_line_strp out of range in .debug_line at 38
#3 0x100913c9f in ???
#4 0x100913d1b in ???
GFortran does the above for allocatable scalars as well (reallocation is on by default, and there is a bounds check error without LHS reallocation). However, for strings:
program expr2
implicit none
character(:), allocatable :: x
character(3) :: y
y = "abc"
x = y
print *, x
end program
it behaves as:
$ gfortran -fcheck=all -g expr2.f90 && ./a.out
abc
$ gfortran -fcheck=all -g -fno-realloc-lhs expr2.f90 && ./a.out
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x100debcc3 in ???
#1 0x100dead23 in ???
#2 0x19a7fb623 in ???
Could not print backtrace: DW_FORM_line_strp out of range in .debug_line at 38
#3 0x1009efdcf in ???
So it segfaults. This means that the bounds checking is not super strong for strings without automatic reallocation of LHS.
First of all, the compiler should give a nice runtime error, not segfault. However, for strings it makes sense to always reallocate LHS, and treat it as part of the string behavior.
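A small sketch of why reallocation is naturally part of the string behavior: with a deferred-length string, almost every assignment changes the length. The program and variable names here are made up for illustration.

```fortran
program string_realloc
    implicit none
    character(:), allocatable :: s

    ! s is unallocated; the assignment allocates it with length 3.
    s = "abc"
    print *, len(s)

    ! The RHS has length 7, so s must be reallocated to hold it.
    s = s // "defg"
    print *, len(s), s
end program string_realloc
```

Writing this without automatic reallocation would require a manual deallocate and `allocate(character(len=...) :: s)` around every assignment, which is not how deferred-length strings are normally used.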
It seems that allocatable scalars like integer are rarely used, so if they always allocate automatically, it is not a big deal. However, allocatable scalars of polymorphic type are used all the time, and just like strings, it seems you always want them to allocate automatically, as part of the OOP behavior.
Consequently, this simple rule seems like a good design:
- Always do automatic reallocation of LHS for scalars of all types (integer, character, polymorphic type, etc.), but not arrays
- The `--realloc-lhs` option reallocates arrays (of all types)
- The `--no-realloc-lhs` option does not reallocate arrays (of any type); if the LHS is not correctly allocated, in Debug mode (without `--fast`) it will always give a runtime error, and in Release mode the behavior will be undefined (segfault), for performance.
In a sense, the option becomes --realloc-array-lhs (or something like that).
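The proposed rule can be sketched on a single program. The comments describe the behavior proposed in this issue, not what any LFortran release currently does, and the program itself is an illustrative assumption.

```fortran
program proposed_rule
    implicit none
    integer, allocatable :: n        ! allocatable scalar
    integer, allocatable :: a(:)     ! allocatable array

    ! Scalar: under the proposed rule this is allocated automatically
    ! in both modes.
    n = 42

    allocate(a(3))

    ! Shapes agree, so no reallocation is needed: valid in both modes.
    a = [1, 2, 3]

    ! Shapes differ (3 vs 4): with --realloc-lhs the array is reallocated;
    ! with --no-realloc-lhs this would be a runtime error in Debug mode
    ! and undefined behavior in Release mode.
    a = [1, 2, 3, 4]

    print *, n, size(a)
end program proposed_rule
```

Only the last assignment depends on the flag, which is why the option effectively controls array reallocation alone.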