Discussion:
[Patch, Fortran] PR87597 - fix off-by-one issue with inline matmul
Tobias Burnus
2018-10-12 14:35:01 UTC
Permalink
In the front-end optimization for matmul, we call lbound() for each matrix
argument to obtain the shift to the 0-based loop variables.

If the first argument is a PARAMETER, it appears initially as EXPR_VARIABLE
and has associated ref->u.ar for the AR_FULL and ref->u.ar.as contains the
bounds.

Running gfc_simplify_expr() on the generated lbound() will simplify the
array argument to an EXPR_ARRAY, which always has a lower bound of 1.

The problem starts as soon as the PARAMETER array has bounds which do not
start with 1 but, e.g., with 0 as with the test case: gfortran generates
than code which is off by one (reads the wrong element and beyond array
bounds).

Fixed by explicitly preventing this during gfc_simplify_expr. I hope that's
the only place where matters and that it doesn't cause missed optimizations.

Build and regtested on x86-64-linux.
OK for the trunk and - after a grace time - for GCC 6, 7 and 8?

Tobias
Thomas Koenig
2018-10-12 16:07:39 UTC
Permalink
Hi Tobias,
Post by Tobias Burnus
Build and regtested on x86-64-linux.
OK for the trunk and - after a grace time - for GCC 6, 7 and 8?
Thanks for taking this on so quickly! I don't think that this will
result in any missed optimizations.

Ok.

Thomas
Dominique d'Humières
2018-10-13 11:44:30 UTC
Permalink
Hi Tobias,

UNRESOLVED: gfortran.dg/inline_matmul_24.f90 -O0 scan-tree-dump-times optimized "gamma5[__var_1_do \\\\* 4 \\\\+ __var_2_do]" 1
UNRESOLVED: gfortran.dg/inline_matmul_24.f90 -O1 scan-tree-dump-times optimized "gamma5[__var_1_do \\\\* 4 \\\\+ __var_2_do]" 1
UNRESOLVED: gfortran.dg/inline_matmul_24.f90 -O2 scan-tree-dump-times optimized "gamma5[__var_1_do \\\\* 4 \\\\+ __var_2_do]" 1
UNRESOLVED: gfortran.dg/inline_matmul_24.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions scan-tree-dump-times optimized "gamma5[__var_1_do \\\\* 4 \\\\+ __var_2_do]" 1
UNRESOLVED: gfortran.dg/inline_matmul_24.f90 -O3 -g scan-tree-dump-times optimized "gamma5[__var_1_do \\\\* 4 \\\\+ __var_2_do]" 1
UNRESOLVED: gfortran.dg/inline_matmul_24.f90 -Os scan-tree-dump-times optimized "gamma5[__var_1_do \\\\* 4 \\\\+ __var_2_do]" 1
IMO this comes from

! { dg-options "-ffrontend-optimize -fdump-tree-original" }

! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+ __var_2_do\]" 1 "optimized" } }

Shouldn’t -fdump-tree-original be -fdump-tree-optimized?

TIA

Dominique
Tobias Burnus
2018-10-13 22:43:53 UTC
Permalink
Hi Dominique,
Post by Dominique d'Humières
UNRESOLVED: gfortran.dg/inline_matmul_24.f90 -O0 scan-tree-dump-times optimized "gamma5[__var_1_do \\\\* 4 \\\\+ __var_2_do]" 1
! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+ __var_2_do\]" 1 "optimized" } }
Shouldn’t -fdump-tree-original be -fdump-tree-optimized?
As it is a front-end optimization (which is explicitly enabled),
-fdump-tree-original should work (and avoids issues with further later
optimizations).

What do you get on your system? Seemingly something else than I do. Can
you look for the gamma5 line in your dump?

Tobias
Dominique d'Humières
2018-10-13 23:27:29 UTC
Permalink
Post by Tobias Burnus
Hi Dominique,
Post by Dominique d'Humières
UNRESOLVED: gfortran.dg/inline_matmul_24.f90 -O0 scan-tree-dump-times optimized "gamma5[__var_1_do \\\\* 4 \\\\+ __var_2_do]" 1
! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+ __var_2_do\]" 1 "optimized" } }
Shouldn’t -fdump-tree-original be -fdump-tree-optimized?
As it is a front-end optimization (which is explicitly enabled), -fdump-tree-original should work (and avoids issues with further later optimizations).
What do you get on your system? Seemingly something else than I do. Can you look for the gamma5 line in your dump?
Tobias
Then

! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+ __var_2_do\]" 1 "optimized" } }

should be

! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+ __var_2_do\]" 1 "original" } }

isn’t it?

see https://gcc.gnu.org/ml/gcc-testresults/2018-10/msg01721.html for non darwin log.

Dominique
Tobias Burnus
2018-10-15 19:02:10 UTC
Permalink
Fixed with commit Rev. 265175 as attached.

Cheers

Tobias
Post by Dominique d'Humières
Post by Tobias Burnus
Post by Dominique d'Humières
UNRESOLVED: gfortran.dg/inline_matmul_24.f90 -O0 scan-tree-dump-times optimized "gamma5[__var_1_do \\\\* 4 \\\\+ __var_2_do]" 1
! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+ __var_2_do\]" 1 "optimized" } }
Shouldn’t -fdump-tree-original be -fdump-tree-optimized?
As it is a front-end optimization (which is explicitly enabled), -fdump-tree-original should work (and avoids issues with further later optimizations).
What do you get on your system? Seemingly something else than I do. Can you look for the gamma5 line in your dump?
Tobias
Then
! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+ __var_2_do\]" 1 "optimized" } }
should be
! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+ __var_2_do\]" 1 "original" } }
isn’t it?
see https://gcc.gnu.org/ml/gcc-testresults/2018-10/msg01721.html for non darwin log.
Dominique
Tobias Burnus
2018-10-15 19:04:15 UTC
Permalink
Right commit revision, wrong attached file (original patch, not the
follow-up one).
Now hopefully the correct one.

Tobias
Post by Tobias Burnus
Fixed with commit Rev. 265175 as attached.
Cheers
Tobias
Post by Dominique d'Humières
Post by Tobias Burnus
UNRESOLVED: gfortran.dg/inline_matmul_24.f90   -O0
scan-tree-dump-times optimized "gamma5[__var_1_do \\\\* 4 \\\\+
__var_2_do]" 1
! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+
__var_2_do\]" 1 "optimized" } }
Shouldn’t -fdump-tree-original be -fdump-tree-optimized?
As it is a front-end optimization (which is explicitly enabled),
-fdump-tree-original should work (and avoids issues with further
later optimizations).
What do you get on your system? Seemingly something else than I do.
Can you look for the gamma5 line in your dump?
Tobias
Then
! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+
__var_2_do\]" 1 "optimized" } }
should be
! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+
__var_2_do\]" 1 "original" } }
isn’t it?
see https://gcc.gnu.org/ml/gcc-testresults/2018-10/msg01721.html for non darwin log.
Dominique
Loading...