Discussion:
[patch, fortran] Inline BLAS calls to support conjg(transpose(a))
Thomas Koenig
2018-09-18 17:46:57 UTC
Hello world,

this patch generates direct calls to *GEMM when -fexternal-blas is
specified. This makes it possible to handle conjugated and transposed
arguments such as conjg(transpose(a)), which is quite a common use case.
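
For illustration, here is a minimal sketch of the kind of expression this
is about. The program name, array names and sizes are made up for the
example and are not taken from the patch. The ZGEMM call in the comment
shows roughly what a direct BLAS call for this expression looks like; it
is not a dump of what the compiler actually emits.

```fortran
! Sketch only: names and sizes are illustrative, not from the patch.
! Builds stand-alone with:   gfortran sketch.f90
! To route matmul to BLAS:   gfortran -fexternal-blas -fblas-matmul-limit=1 sketch.f90 -lblas
program conjg_transpose_sketch
  implicit none
  integer, parameter :: dp = kind (1.0d0)
  integer, parameter :: m = 3, n = 4, k = 5
  complex(dp), allocatable :: a(:,:), b(:,:), c(:,:), ref(:,:)
  real(dp), allocatable :: re(:,:), im(:,:)
  integer :: i, j, l

  allocate (a(k,m), b(k,n), c(m,n), ref(m,n))

  ! Fill a and b with random complex values.
  allocate (re(k,m), im(k,m))
  call random_number (re)
  call random_number (im)
  a = cmplx (re, im, kind=dp)
  deallocate (re, im)
  allocate (re(k,n), im(k,n))
  call random_number (re)
  call random_number (im)
  b = cmplx (re, im, kind=dp)

  ! The expression in question: c = A^H * B.
  c = matmul (conjg (transpose (a)), b)

  ! In BLAS terms this corresponds to a single call such as
  !   call zgemm ('C', 'N', m, n, k, (1.0_dp,0.0_dp), a, k, b, k, &
  !               (0.0_dp,0.0_dp), c, m)
  ! where TRANSA = 'C' requests the conjugate transpose of a.

  ! Reference result from explicit loops.
  ref = (0.0_dp, 0.0_dp)
  do j = 1, n
    do i = 1, m
      do l = 1, k
        ref(i,j) = ref(i,j) + conjg (a(l,i)) * b(l,j)
      end do
    end do
  end do

  if (maxval (abs (c - ref)) > 1.0e-12_dp) stop 1
  print *, "ok"
end program conjg_transpose_sketch
```

The point of TRANSA = 'C' is that no temporary holding the conjugated,
transposed copy of a is needed before the multiplication.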

While looking at the code, I found that the inline limit checks were not
handled correctly for cases other than A2B2. This is also fixed.

In order to check all cases at runtime, I simply copied the reference
BLAS routines to the test cases, which is why they are *.f instead of
*.f90.

Regarding the bounds checking: I added three new test cases. Checking
every case would be a bit too much; the code is clear enough that I
think the other cases should be OK.

OK for trunk?

Regards

Thomas

2018-09-18 Thomas Koenig <***@gcc.gnu.org>

	PR fortran/29550
	* gfortran.h (gfc_expr): Add external_blas flag.
	* frontend-passes.c (matrix_case): Add case A2TB2T.
	(optimize_namespace): Handle flag_external_blas by
	calling call_external_blas.
	(get_array_inq_function): Add argument okind. If
	it is nonzero, use it as the kind of argument
	to be used.
	(inline_limit_check): Remove m_case argument, add
	limit argument instead. Remove assert about m_case.
	Set the limit for inlining from the limit argument.
	(matmul_lhs_realloc): Handle case A2TB2T.
	(inline_matmul_assign): Handle inline limit for other cases with
	two rank-two matrices. Remove no-op calls to inline_limit_check.
	(call_external_blas): New function.
	* trans-intrinsic.c (gfc_conv_intrinsic_funcall): Do not add
	argument to external BLAS if external_blas is already set.

2018-09-18 Thomas Koenig <***@gcc.gnu.org>

	PR fortran/29550
	* gfortran.dg/inline_matmul_13.f90: Adjust count for
	_gfortran_matmul.
	* gfortran.dg/inline_matmul_16.f90: Likewise.
	* gfortran.dg/promotion_2.f90: Add -fblas-matmul-limit=1. Scan
	for dgemm instead of dgemm_. Add call to random_number to make
	standard conforming.
	* gfortran.dg/matmul_blas_1.f90: New test.
	* gfortran.dg/matmul_bounds_14.f: New test.
	* gfortran.dg/matmul_bounds_15.f: New test.
	* gfortran.dg/matmul_bounds_16.f: New test.

Paul Richard Thomas
2018-09-18 18:20:08 UTC
Hi Thomas,

This is fine, except for one niggle. Rather than having the BLAS source
in each testcase, why don't you put all the functions in one BLAS file
and use the dg-additional-sources mechanism?
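
A rough sketch of what such a test header could look like follows. The
file name blas_gemm_routines.f is a placeholder assumed for the example,
and the option string is only a plausible combination of the flags
mentioned in this thread, not a copy of any committed test.

```fortran
! { dg-do run }
! { dg-options "-ffrontend-optimize -fexternal-blas -fblas-matmul-limit=1" }
! { dg-additional-sources blas_gemm_routines.f }
! The file named above is hypothetical: one shared file holding the
! reference *GEMM routines, instead of a copy in every test case.
program main
  implicit none
  real :: a(2,3), b(2,2), c(3,2)
  integer :: i, j

  call random_number (a)
  call random_number (b)

  ! transpose(a) is 3x2 and b is 2x2, so c is 3x2.
  c = matmul (transpose (a), b)

  ! Check each element against a plain dot product.
  do j = 1, 2
    do i = 1, 3
      if (abs (c(i,j) - dot_product (a(:,i), b(:,j))) > 1.0e-5) stop 1
    end do
  end do
end program main
```

With this layout the individual tests can stay free-form *.f90 files,
since the fixed-form BLAS source lives only in the shared file.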

Cheers

Paul
--
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein
Thomas Koenig
2018-09-18 20:00:46 UTC
Hi Paul,
Post by Paul Richard Thomas
This is fine, except for one niggle. Rather than having the BLAS source
in each testcase, why don't you put all the functions in one BLAS file
and use the dg-additional-sources mechanism?
Good idea, I have added this. Committed as r264411.

Regards

Thomas
Dominique d'Humières
2018-09-18 20:06:48 UTC
Post by Thomas Koenig
Committed as r264411.
Nope!-(

Dominique
Thomas Koenig
2018-09-18 20:19:21 UTC
Hi Dominique,
Post by Dominique d'Humières
Post by Thomas Koenig
Committed as r264411.
Nope!-(
Well, I made r264412.

Admittedly, committing only the ChangeLogs would have had a limited effect :-)

Thanks for the heads-up!

Thomas
Bernhard Reutner-Fischer
2018-09-19 18:11:38 UTC
On Tue, 18 Sep 2018 at 22:19, Thomas Koenig <***@netcologne.de> wrote:

s/mamtul/matmul/

thanks,
