Kyrill Tkachov
2018-07-17 12:35:15 UTC
Hi all,
This is my first Fortran patch, so apologies if I'm missing something.
The current expansion of the min and max intrinsics explicitly expands
the comparisons between each argument to calculate the global min/max.
Some targets, like aarch64, have instructions that can calculate the min/max
of two real (floating-point) numbers with the proper NaN-handling semantics
(if both inputs are NaN, return NaN; if exactly one is NaN, return the other).
These are the semantics provided by the __builtin_fmin/max family of functions,
which expand to these instructions.
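To make the NaN handling concrete, here is a small C sketch using the standard fmin/fmax from <math.h>, which have the same semantics as the builtins. The function name check_fmin_fmax_nan_semantics is made up for illustration and is not part of the patch.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper, not part of the patch: asserts the NaN handling
   of the C99 fmin/fmax functions described above.  */
static void
check_fmin_fmax_nan_semantics (void)
{
  /* Exactly one NaN operand: the other operand is returned.  */
  assert (fmin (NAN, 1.0) == 1.0);
  assert (fmin (1.0, NAN) == 1.0);
  assert (fmax (NAN, -3.0) == -3.0);

  /* Both operands NaN: the result is NaN.  */
  assert (isnan (fmin (NAN, NAN)));
  assert (isnan (fmax (NAN, NAN)));

  /* No NaN involved: ordinary min/max.  */
  assert (fmin (2.0, 5.0) == 2.0);
  assert (fmax (2.0, 5.0) == 5.0);
}
```

Calling the helper from a test program should complete without tripping any assertion on a C99-conforming libm.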
This patch makes the frontend emit __builtin_fmin/max directly to compare each
pair of numbers when the numbers are floating-point, and use MIN_EXPR/MAX_EXPR otherwise
(integral types and -ffast-math), which should hopefully be easier to recognise and
optimise in the midend. The previous open-coded expansion is still used when no
appropriate __builtin_fmin/max is available. For example, on the
x86_64-unknown-linux-gnu configuration that I tested there was no 128-bit
__builtin_fminl available.
With this patch I'm seeing more than 7000 FMINNM/FMAXNM instructions being generated at -O3
on aarch64 for 521.wrf from SPEC2017 fprate, where none were generated before
(we were generating explicit comparisons and NaN checks). This gave a 2.4% improvement
in performance on a Cortex-A72.
Bootstrapped and tested on aarch64-none-linux-gnu and x86_64-unknown-linux-gnu.
Ok for trunk?
Thanks,
Kyrill
2018-07-17 Kyrylo Tkachov <***@arm.com>
* f95-lang.c (gfc_init_builtin_functions): Define __builtin_fmin,
__builtin_fminf, __builtin_fminl, __builtin_fmax, __builtin_fmaxf,
__builtin_fmaxl.
* trans-intrinsic.c: Include builtins.h.
(gfc_conv_intrinsic_minmax): Emit __builtin_fmin/max or MIN/MAX_EXPR
functions to calculate the min/max.
2018-07-17 Kyrylo Tkachov <***@arm.com>
* gfortran.dg/max_fmaxf.f90: New test.
* gfortran.dg/min_fminf.f90: Likewise.
* gfortran.dg/minmax_integer.f90: Likewise.
* gfortran.dg/max_fmaxl_aarch64.f90: Likewise.
* gfortran.dg/min_fminl_aarch64.f90: Likewise.