Martin Liška
2018-10-29 13:49:24 UTC
Hi.
Unlike the C and C++ front-ends, GNU Fortran does not know about
vector implementations of math routines provided by GLIBC. This
prevents vectorization of many loops which is a frequent cause of
performance that is worse than compilers with their own math library
such as ICC.
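For comparison, here is a minimal sketch of the kind of annotation the C front-end can see. It uses GCC's `simd` function attribute with the `notinbranch` argument; `half_square` and `apply_half_square` are hypothetical names made up for illustration and are not glibc functions. In a real math library the annotated declaration would sit in a header and the vector variants in the library itself, so treat this only as an illustration of the attribute, not of how glibc is built.

```c
#include <stddef.h>

/* Hypothetical stand-in for a math routine; not a glibc function.
   The simd("notinbranch") attribute asks GCC to also emit vector
   variants intended for calls that are not under a condition
   inside the vectorized loop. */
__attribute__((simd("notinbranch")))
float half_square(float x)
{
    return 0.5f * x * x;
}

/* Element-wise loop, the C analogue of the Fortran statement y = f(x). */
void apply_half_square(const float *x, float *y, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] = half_square(x[i]);
}
```

Because the attribute is attached to the declaration, the vectorizer can tell that a vector variant exists without having to inline the scalar body. That is the information the Fortran FE currently lacks for the math intrinsics.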
The purpose of the patch is to provide a mechanism that tells the Fortran FE
which intrinsics have SIMD clones in the math library.
I've been cooperating with Paul and we came up with a proof of concept that consists
of 2 parts (patches):
The first patch adds support for including a module via the
command line. The module will be provided by glibc, so that the set of annotated functions
stays in sync with what a given glibc version provides. Paul suggests possibly integrating
that into the load machinery of the IEEE module.
The second patch propagates information about the newly introduced attribute simd_notinbranch into
gfc_intrinsic_sym. That is later used to modify the corresponding __builtin_*.
A definition of the intrinsics module and its usage can look like:
$ cat ~/Programming/testcases/use.f90
module overload
interface
function sin(arg)
!GCC$ attributes simd_notinbranch :: sin
real, intent(in) :: arg
real :: sin
end function sin
end interface
end module
program test_overloaded_intrinsic
real(4) :: x(3200), y(3200), z(3200)
! this should use the SIMD clone
y = sin(x)
print *, y
! this should not
z = cos(x)
print *, z
end
Then using my patches one can see:
$ ./xgcc -B. ~/Programming/testcases/use.f90 -c -Ofast -fdump-tree-optimized=/dev/stdout
;; Function test_overloaded_intrinsic (MAIN__, funcdef_no=0, decl_uid=3815, cgraph_uid=1, symbol_order=0) (executed once)
test_overloaded_intrinsic ()
{
...
vect__3.14_58 = sinf.simdclone.0 (vect__2.13_60);
MEM[symbol: y, index: ivtmp.35_47, offset: 0B] = vect__3.14_58;
...
_6 = __builtin_cosf (_5);
MEM[symbol: z, index: ivtmp.30_46, offset: 0B] = _6;
...
}
That's what I have so far. I would like to ask the Fortran folks for their opinion. I know
the part in gfc_match_gcc_attributes is a bit tricky, but apart from that the rest
should be well formed.
Thoughts?
Thanks,
Martin