In tutorial, using x = foo or x(:,:) = foo?

Open Beliavsky opened this issue 4 years ago • 7 comments

In the tutorial there are lines of code such as

  print *,A(i,1:m)
  mat(:,:) = 0.0

that could be written

  print *,A(i,:)
  mat = 0.0

I would either show the shorter syntax or show both versions and explain that they are equivalent. I would not want new Fortran programmers to get in the habit of using array sections when they can refer to the whole array. I have read in the past about compilers that do not optimize x(:,:) as well as x. Maybe that has been fixed.
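For fixed-size (non-allocatable) arrays like those in the tutorial, the two spellings give the same result. Here is a minimal sketch of that equivalence; the program name, array sizes, and values are made up for illustration and are not taken from the tutorial:

```fortran
program whole_vs_section
  implicit none
  integer, parameter :: n = 2, m = 3   ! illustrative sizes only
  real :: A(n, m), mat(n, m), mat2(n, m)
  integer :: i

  A = reshape([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape(A))

  ! Both statements set every element of a fixed-size array to zero.
  mat(:,:) = 0.0
  mat2 = 0.0
  if (any(mat /= mat2)) error stop "section and whole-array assignment differ"

  ! Both expressions select all of row i.
  do i = 1, n
    if (any(A(i,1:m) /= A(i,:))) error stop "row sections differ"
    print *, A(i,:)
  end do
end program whole_vs_section
```

The checks with error stop are only there to make the sketch self-verifying; they say nothing about how well a given compiler optimizes either form.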

Beliavsky, Apr 15 '21 20:04

Even simpler:

  print *, A(i,:)
  mat = 0

certik, Apr 15 '21 20:04

There is a difference:

  mat(:,:) = another_matrix

would only work correctly if the other matrix has the same dimensions, whereas:

  mat = another_matrix

could cause an automatic reallocation if mat is an allocatable and another_matrix has a different shape or mat is not allocated.
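A small sketch of that difference, assuming a compiler that applies the Fortran 2003 reallocation-on-assignment rule (gfortran does by default; other compilers may need an option). The shapes and values are made up for illustration:

```fortran
program realloc_lhs_demo
  implicit none
  real, allocatable :: mat(:,:)
  real :: another_matrix(3, 4)   ! illustrative shape

  another_matrix = 1.0
  allocate(mat(2, 2))
  mat = 0.0                      ! scalar on the right: shape stays 2x2

  ! Whole-array form: shapes differ, so mat is reallocated to 3x4.
  mat = another_matrix
  if (any(shape(mat) /= [3, 4])) error stop "expected reallocation to 3x4"

  ! Section form: never reallocates. This is valid only because mat
  ! already has the same shape as the right-hand side.
  mat(:,:) = 2.0 * another_matrix
  print *, shape(mat)
end program realloc_lhs_demo
```

If mat still had shape 2x2 at the section assignment, the program would be non-conforming, and whether that is detected depends on the compiler's runtime checking options.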

Such subtleties would complicate the tutorial, of course, so perhaps this should be discussed in a separate section in the tutorial.

arjenmarkus, Apr 16 '21 07:04

Yes, as Arjen points out, there is an important difference and I recall a discussion where it was mentioned that the slice notation is preferred for two main reasons:

  • It is more explicit/expressive in showing the array operation and ensuring that array dimensions match
  • It avoids a possible automatic reallocation (which can really catch you out!)

It is probably a good idea to describe both notations and describe the differences. I would be interested to hear more about the possible optimisation problems with slice notation if you're able to recall a particular example.

LKedward, Apr 16 '21 08:04

There is a somewhat relevant post of @sblionel on this topic here from 2008: https://stevelionel.com/drfortran/2008/03/31/doctor-it-hurts-when-i-do-this/

awvwgk, Apr 16 '21 08:04

Thanks @awvwgk, it was the vague recollection of such posts that prompted my issue. Below is an excerpt. The fortran-lang tutorial should not have code that would make a compiler writer "cringe". People have mentioned the benefit of x(:)= foo to prevent unintended allocation upon assignment, but in the tutorial examples, the arrays were not ALLOCATABLE.

'A(:) = func(B(:), C(:)) with the (:) alerting the reader that the variable is an array (sort of like the Dave Barry joke that an apostrophe serves to warn you that the letter “s” is coming up in grocery store signs.) In Fortran syntax, (:) indicates an array section that starts at the first element and ends at the last element – the whole array, in other words.

Whenever I see this usage, I cringe, because I know that the compiler has to work extra hard to recognize that the programmer really meant “the whole array” and not a piece of it. In the past, unnecessary use of (:) would often prevent optimizations. Nowadays this is less often the case, thanks to hard work by the compiler developers, but sometimes it still happens.'

Beliavsky, Apr 16 '21 10:04

I personally think the extra syntax (:) should not be used, because it's extra noise to read and worry about, exactly as @sblionel wrote. Unfortunately, I also personally think the automatic re-allocation of the LHS was not a good decision in Fortran, because as @arjenmarkus correctly said, it will re-allocate the LHS unless you prevent it by using (:). For performance code, I don't want any kind of shape-checking code introduced by the compiler, citing Steve:

The downside, though, is that the checking required to support this is a lot of extra code, and applications where it is known that the array is already allocated to the correct shape don’t need this check which would just slow them down.

I agree.

Perhaps equally bad, it will re-allocate the array if I made a mistake. I would rather get a nice compiler error message telling me (in Debug mode) that the shape does not match, or that my LHS was not allocated, while in Release mode I want things to run very fast with no checks.

Even worse, imagine code that has no allocatable arrays, as in this tutorial. I strongly believe the syntax v(:) = 9 should not be used, just use v = 9. It looks much better and more natural.

Except when someone then changes the array to be allocatable (I do that very often when developing). Then the statement A = func(B) becomes problematic: it might slow things down (because the compiler now has to check shapes at runtime), and it will hide a bug. Instead of giving an error right away that the shapes do not match, it will silently re-allocate the LHS and continue...

So for my codes, I never depend on automatic reallocation of the LHS, and as such I could simply use the -fno-realloc-lhs option and check it in CI. That way, if somebody sends a PR that relies on automatic reallocation, the CI will not pass. As long as the CI passes, the code itself will run correctly whether or not you use -fno-realloc-lhs, which is nice. And so A(:) = func(B) is exactly the same as A = func(B), which is also nice, and so the shorter version without (:) should be used.
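One way to write code that never relies on automatic reallocation is to allocate explicitly and check shapes before assigning. The following is only a sketch of that idea; the module and subroutine names (checked_assign, assign_same_shape) are hypothetical and not from any library:

```fortran
module checked_assign
  implicit none
contains
  ! Hypothetical helper: copy src into dst, refusing to proceed if the
  ! assignment would need a (re)allocation of dst.
  subroutine assign_same_shape(dst, src)
    real, allocatable, intent(inout) :: dst(:)
    real, intent(in) :: src(:)
    if (.not. allocated(dst)) error stop "assign_same_shape: dst is not allocated"
    if (size(dst) /= size(src)) error stop "assign_same_shape: shape mismatch"
    ! Shapes conform here, so no reallocation happens with or without
    ! -fno-realloc-lhs, and dst(:) = src would behave identically.
    dst = src
  end subroutine assign_same_shape
end module checked_assign

program demo_checked_assign
  use checked_assign
  implicit none
  real, allocatable :: a(:)
  real :: b(3)

  b = [1.0, 2.0, 3.0]
  allocate(a(size(b)))        ! explicit allocation instead of relying on a = b
  call assign_same_shape(a, b)
  print *, a
end program demo_checked_assign
```

In a real project the explicit checks might be enabled only in debug builds, leaving release builds free of them.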

However, unfortunately because there is code now written by many people out there that does depend on automatic reallocation of LHS, that code will not work in my project, and would need fixing.

I don't have a good solution...

certik, Apr 16 '21 13:04

We (Intel compiler team) were seeing customers increasingly dependent on the reallocation feature. It definitely has its uses, especially in cases where you want to append to the end of an array using syntax such as A = [A,newval]. I don't think the feature was a mistake, and like all new features, compilers can learn how to optimize the check. (I coded the routine that the Intel compiler is currently using for this check, and it is reasonably quick, exiting early if appropriate.)

As noted, you can disable this feature by using A(:) - if you do, please add comments explaining this!
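A minimal sketch of both points, assuming reallocation on assignment is enabled (the Fortran 2003 behaviour); the program name and values are made up for illustration:

```fortran
program append_demo
  implicit none
  integer, allocatable :: A(:)
  integer :: newval

  A = [1, 2, 3]        ! A is unallocated here; assignment allocates it
  newval = 4
  A = [A, newval]      ! reallocates A with one more element
  if (size(A) /= 4) error stop "append did not grow the array"

  ! Section on the left-hand side: deliberately disables reallocation,
  ! so the right-hand side must already have size(A) elements.
  A(:) = [10, 20, 30, 40]
  print *, A
end program append_demo
```

Growing an array one element at a time like this copies the whole array on each append, so it is best kept to small arrays or infrequent appends.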

sblionel, Apr 16 '21 14:04