Thomas Schwinge
2018-11-07 19:13:29 UTC
Hi Chung-Lin!
<https://gcc.gnu.org/PR87924> "OpenACC wait clauses without
async-arguments". (I couldn't put you in CC because "***@gcc.gnu.org
did not match anything"?)
this email?)
No test cases included. I'm working on a few, will post/commit later.
similar to Fortran's OpenACC async clause handling without explicit
async-argument:
--- gcc/fortran/openmp.c
+++ gcc/fortran/openmp.c
@@ -1885,7 +1885,19 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
break;
}
else if (m == MATCH_NO)
- needs_space = true;
+ {
+ gfc_expr *expr
+ = gfc_get_constant_expr (BT_INTEGER,
+ gfc_default_integer_kind,
+ &gfc_current_locus);
+ mpz_set_si (expr->value.integer, GOMP_ASYNC_NOVAL);
+ gfc_expr_list **expr_list = &c->wait_list;
+ while (*expr_list)
+ expr_list = &(*expr_list)->next;
+ *expr_list = gfc_get_expr_list ();
+ (*expr_list)->expr = expr;
+ needs_space = true;
+ }
continue;
}
if ((mask & OMP_CLAUSE_WORKER)
Now, why do we need the following changes, in this rather "convoluted"
async-argument (that is, map a single "wait" clause to "num_waits == 1,
*ap == GOMP_ASYNC_NOVAL"), and then handle that case in "goacc_wait",
avoiding all these interface changes and special casing in different
functions?
Or am I not understanding correctly what the purpose of this is?
My understanding is that before, GCC never generates "negative
async-arguments" (now used for "GOMP_ASYNC_NOVAL"), but only non-negative
ones (real "async-arguments"), which we continue to handle, as before.
Isn't that sufficient for the ABI compatibility that we promise, which is
(unless I'm confused now?) that old (existing) executables continue to
run correctly when dynamically linking against a new libgomp. Or do we
also have to care about the case that an executable built with a new
version of GCC has to work when dynamically linked against an old
libgomp?
Grüße
Thomas
Hi, this patch properly handles OpenACC 'wait' clauses without arguments, making it an equivalent of "wait all".
Thanks!(current trunk basically discards and ignores such argument-less wait
clauses)
Bugs should be filed, for later reference. Now done:clauses)
<https://gcc.gnu.org/PR87924> "OpenACC wait clauses without
async-arguments". (I couldn't put you in CC because "***@gcc.gnu.org
did not match anything"?)
This adds additional handling in
the pack/unpack of the wait argument across the compiler/libgomp interface, but is done in a matter that
doesn't affect binary compatibility.
Hmm. See below. (Jakub, could you please review the last paragraph ofthe pack/unpack of the wait argument across the compiler/libgomp interface, but is done in a matter that
doesn't affect binary compatibility.
this email?)
This patch was part of the OpenACC async re-work that was done on the gomp4 branch (later merged to OG7/OG8), see [1].
I'm separating this part out and submitting it first because it's logically independent.
[1] https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01842.html
Thanks for splitting it out!I'm separating this part out and submitting it first because it's logically independent.
[1] https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01842.html
Re-tested with offloading to ensure no regressions, is this okay for trunk?
A few comments.No test cases included. I'm working on a few, will post/commit later.
gcc/c/
* c-parser.c (c_parser_oacc_clause_wait): Add representation of wait
clause without argument as 'wait (GOMP_ASYNC_NOVAL)', adjust comments.
gcc/cp/
* parser.c (cp_parser_oacc_clause_wait): Add representation of wait
clause without argument as 'wait (GOMP_ASYNC_NOVAL)', adjust comments.
gcc/fortran/
* trans-openmp.c (gfc_trans_omp_clauses_1): Add representation of wait
clause without argument as 'wait (GOMP_ASYNC_NOVAL)'.
gcc/
* omp-low.c (expand_omp_target): Add middle-end support for handling
OMP_CLAUSE_WAIT clause with a GOMP_ASYNC_NOVAL(-1) as the argument.
include/
* gomp-constants.h (GOMP_LAUNCH_OP_MASK): Define.
(GOMP_LAUNCH_PACK): Add bitwise-and of GOMP_LAUNCH_OP_MASK.
(GOMP_LAUNCH_OP): Likewise.
libgomp/
* oacc-parallel.c (GOACC_parallel_keyed): Interpret launch op as
signed 16-bit field, adjust num_waits handling.
(GOACC_enter_exit_data): Adjust num_waits handling.
(GOACC_update): Adjust num_waits handling.
--- gcc/c/c-parser.c (revision 263981)
+++ gcc/c/c-parser.c (working copy)
@@ -12719,7 +12719,7 @@ c_parser_oacc_clause_tile (c_parser *parser, tree
}
- wait ( int-expr-list ) */
+ wait [( int-expr-list )] */
static tree
c_parser_oacc_clause_wait (c_parser *parser, tree list)
@@ -12728,7 +12728,15 @@ c_parser_oacc_clause_wait (c_parser *parser, tree
if (c_parser_peek_token (parser)->type == CPP_OPEN_PAREN)
list = c_parser_oacc_wait_list (parser, clause_loc, list);
+ else
+ {
+ tree c = build_omp_clause (clause_loc, OMP_CLAUSE_WAIT);
+ OMP_CLAUSE_DECL (c) = build_int_cst (integer_type_node, GOMP_ASYNC_NOVAL);
+ OMP_CLAUSE_CHAIN (c) = list;
+ list = c;
+ }
+
return list;
}
ACK.* c-parser.c (c_parser_oacc_clause_wait): Add representation of wait
clause without argument as 'wait (GOMP_ASYNC_NOVAL)', adjust comments.
gcc/cp/
* parser.c (cp_parser_oacc_clause_wait): Add representation of wait
clause without argument as 'wait (GOMP_ASYNC_NOVAL)', adjust comments.
gcc/fortran/
* trans-openmp.c (gfc_trans_omp_clauses_1): Add representation of wait
clause without argument as 'wait (GOMP_ASYNC_NOVAL)'.
gcc/
* omp-low.c (expand_omp_target): Add middle-end support for handling
OMP_CLAUSE_WAIT clause with a GOMP_ASYNC_NOVAL(-1) as the argument.
include/
* gomp-constants.h (GOMP_LAUNCH_OP_MASK): Define.
(GOMP_LAUNCH_PACK): Add bitwise-and of GOMP_LAUNCH_OP_MASK.
(GOMP_LAUNCH_OP): Likewise.
libgomp/
* oacc-parallel.c (GOACC_parallel_keyed): Interpret launch op as
signed 16-bit field, adjust num_waits handling.
(GOACC_enter_exit_data): Adjust num_waits handling.
(GOACC_update): Adjust num_waits handling.
--- gcc/c/c-parser.c (revision 263981)
+++ gcc/c/c-parser.c (working copy)
@@ -12719,7 +12719,7 @@ c_parser_oacc_clause_tile (c_parser *parser, tree
}
- wait ( int-expr-list ) */
+ wait [( int-expr-list )] */
static tree
c_parser_oacc_clause_wait (c_parser *parser, tree list)
@@ -12728,7 +12728,15 @@ c_parser_oacc_clause_wait (c_parser *parser, tree
if (c_parser_peek_token (parser)->type == CPP_OPEN_PAREN)
list = c_parser_oacc_wait_list (parser, clause_loc, list);
+ else
+ {
+ tree c = build_omp_clause (clause_loc, OMP_CLAUSE_WAIT);
+ OMP_CLAUSE_DECL (c) = build_int_cst (integer_type_node, GOMP_ASYNC_NOVAL);
+ OMP_CLAUSE_CHAIN (c) = list;
+ list = c;
+ }
+
return list;
}
--- gcc/cp/parser.c (revision 263981)
+++ gcc/cp/parser.c (working copy)
@@ -32137,7 +32137,7 @@ cp_parser_oacc_wait_list (cp_parser *parser, locat
}
- wait ( int-expr-list ) */
+ wait [( int-expr-list )] */
static tree
cp_parser_oacc_clause_wait (cp_parser *parser, tree list)
@@ -32144,10 +32144,16 @@ cp_parser_oacc_clause_wait (cp_parser *parser, tre
{
location_t location = cp_lexer_peek_token (parser->lexer)->location;
- if (cp_lexer_peek_token (parser->lexer)->type != CPP_OPEN_PAREN)
- return list;
+ if (cp_lexer_peek_token (parser->lexer)->type == CPP_OPEN_PAREN)
+ list = cp_parser_oacc_wait_list (parser, location, list);
+ else
+ {
+ tree c = build_omp_clause (location, OMP_CLAUSE_WAIT);
- list = cp_parser_oacc_wait_list (parser, location, list);
+ OMP_CLAUSE_DECL (c) = build_int_cst (integer_type_node, GOMP_ASYNC_NOVAL);
+ OMP_CLAUSE_CHAIN (c) = list;
+ list = c;
+ }
return list;
}
ACK.+++ gcc/cp/parser.c (working copy)
@@ -32137,7 +32137,7 @@ cp_parser_oacc_wait_list (cp_parser *parser, locat
}
- wait ( int-expr-list ) */
+ wait [( int-expr-list )] */
static tree
cp_parser_oacc_clause_wait (cp_parser *parser, tree list)
@@ -32144,10 +32144,16 @@ cp_parser_oacc_clause_wait (cp_parser *parser, tre
{
location_t location = cp_lexer_peek_token (parser->lexer)->location;
- if (cp_lexer_peek_token (parser->lexer)->type != CPP_OPEN_PAREN)
- return list;
+ if (cp_lexer_peek_token (parser->lexer)->type == CPP_OPEN_PAREN)
+ list = cp_parser_oacc_wait_list (parser, location, list);
+ else
+ {
+ tree c = build_omp_clause (location, OMP_CLAUSE_WAIT);
- list = cp_parser_oacc_wait_list (parser, location, list);
+ OMP_CLAUSE_DECL (c) = build_int_cst (integer_type_node, GOMP_ASYNC_NOVAL);
+ OMP_CLAUSE_CHAIN (c) = list;
+ list = c;
+ }
return list;
}
--- gcc/fortran/trans-openmp.c (revision 263981)
+++ gcc/fortran/trans-openmp.c (working copy)
@@ -2922,6 +2922,13 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp
omp_clauses = c;
}
}
+ else if (clauses->wait)
+ {
+ c = build_omp_clause (where.lb->location, OMP_CLAUSE_WAIT);
+ OMP_CLAUSE_DECL (c) = build_int_cst (integer_type_node, GOMP_ASYNC_NOVAL);
+ OMP_CLAUSE_CHAIN (c) = omp_clauses;
+ omp_clauses = c;
+ }
if (clauses->num_gangs_expr)
{
tree num_gangs_var
NACK. Instead let's do the following, similar to C, C++, and also+++ gcc/fortran/trans-openmp.c (working copy)
@@ -2922,6 +2922,13 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp
omp_clauses = c;
}
}
+ else if (clauses->wait)
+ {
+ c = build_omp_clause (where.lb->location, OMP_CLAUSE_WAIT);
+ OMP_CLAUSE_DECL (c) = build_int_cst (integer_type_node, GOMP_ASYNC_NOVAL);
+ OMP_CLAUSE_CHAIN (c) = omp_clauses;
+ omp_clauses = c;
+ }
if (clauses->num_gangs_expr)
{
tree num_gangs_var
similar to Fortran's OpenACC async clause handling without explicit
async-argument:
--- gcc/fortran/openmp.c
+++ gcc/fortran/openmp.c
@@ -1885,7 +1885,19 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
break;
}
else if (m == MATCH_NO)
- needs_space = true;
+ {
+ gfc_expr *expr
+ = gfc_get_constant_expr (BT_INTEGER,
+ gfc_default_integer_kind,
+ &gfc_current_locus);
+ mpz_set_si (expr->value.integer, GOMP_ASYNC_NOVAL);
+ gfc_expr_list **expr_list = &c->wait_list;
+ while (*expr_list)
+ expr_list = &(*expr_list)->next;
+ *expr_list = gfc_get_expr_list ();
+ (*expr_list)->expr = expr;
+ needs_space = true;
+ }
continue;
}
if ((mask & OMP_CLAUSE_WORKER)
Now, why do we need the following changes, in this rather "convoluted"
--- gcc/omp-expand.c (revision 263981)
+++ gcc/omp-expand.c (working copy)
@@ -7381,16 +7381,32 @@ expand_omp_target (struct omp_region *region)
/* ... push a placeholder. */
args.safe_push (integer_zero_node);
+ bool noval_seen = false;
+ tree noval = build_int_cst (integer_type_node, GOMP_ASYNC_NOVAL);
+
for (; c; c = OMP_CLAUSE_CHAIN (c))
if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_WAIT)
{
+ tree wait_expr = OMP_CLAUSE_WAIT_EXPR (c);
+
+ if (TREE_CODE (wait_expr) == INTEGER_CST
+ && tree_int_cst_compare (wait_expr, noval) == 0)
+ {
+ noval_seen = true;
+ continue;
+ }
+
args.safe_push (fold_convert_loc (OMP_CLAUSE_LOCATION (c),
- integer_type_node,
- OMP_CLAUSE_WAIT_EXPR (c)));
+ integer_type_node, wait_expr));
num_waits++;
}
- if (!tagging || num_waits)
+ if (noval_seen && num_waits == 0)
+ args[t_wait_idx] =
+ (tagging
+ ? oacc_launch_pack (GOMP_LAUNCH_WAIT, NULL_TREE, GOMP_ASYNC_NOVAL)
+ : noval);
+ else if (!tagging || num_waits)
{
tree len;
--- include/gomp-constants.h (revision 263981)
+++ include/gomp-constants.h (working copy)
@@ -221,13 +221,14 @@ enum gomp_map_kind
#define GOMP_LAUNCH_CODE_SHIFT 28
#define GOMP_LAUNCH_DEVICE_SHIFT 16
#define GOMP_LAUNCH_OP_SHIFT 0
+#define GOMP_LAUNCH_OP_MASK 0xffff
#define GOMP_LAUNCH_PACK(CODE,DEVICE,OP) \
(((CODE) << GOMP_LAUNCH_CODE_SHIFT) \
| ((DEVICE) << GOMP_LAUNCH_DEVICE_SHIFT) \
- | ((OP) << GOMP_LAUNCH_OP_SHIFT))
+ | (((OP) & GOMP_LAUNCH_OP_MASK) << GOMP_LAUNCH_OP_SHIFT))
#define GOMP_LAUNCH_CODE(X) (((X) >> GOMP_LAUNCH_CODE_SHIFT) & 0xf)
#define GOMP_LAUNCH_DEVICE(X) (((X) >> GOMP_LAUNCH_DEVICE_SHIFT) & 0xfff)
-#define GOMP_LAUNCH_OP(X) (((X) >> GOMP_LAUNCH_OP_SHIFT) & 0xffff)
+#define GOMP_LAUNCH_OP(X) (((X) >> GOMP_LAUNCH_OP_SHIFT) & GOMP_LAUNCH_OP_MASK)
#define GOMP_LAUNCH_OP_MAX 0xffff
/* Bitmask to apply in order to find out the intended device of a target
--- libgomp/oacc-parallel.c (revision 263981)
+++ libgomp/oacc-parallel.c (working copy)
@@ -194,10 +194,14 @@ GOACC_parallel_keyed (int device, void (*fn) (void
{
- unsigned num_waits = GOMP_LAUNCH_OP (tag);
+ /* Be careful to cast the op field as a signed 16-bit, and
+ sign-extend to full integer. */
+ int num_waits = ((signed short) GOMP_LAUNCH_OP (tag));
- if (num_waits)
+ if (num_waits > 0)
goacc_wait (async, num_waits, &ap);
+ else if (num_waits == acc_async_noval)
+ acc_wait_all_async (async);
break;
}
@@ -351,7 +355,7 @@ GOACC_enter_exit_data (int device, size_t mapnum,
|| host_fallback)
return;
- if (num_waits)
+ if (num_waits > 0)
{
va_list ap;
@@ -359,6 +363,8 @@ GOACC_enter_exit_data (int device, size_t mapnum,
goacc_wait (async, num_waits, &ap);
va_end (ap);
}
+ else if (num_waits == acc_async_noval)
+ acc_wait_all_async (async);
/* Determine whether "finalize" semantics apply to all mappings of this
OpenACC directive. */
@@ -542,7 +548,7 @@ GOACC_update (int device, size_t mapnum,
|| host_fallback)
return;
- if (num_waits)
+ if (num_waits > 0)
{
va_list ap;
@@ -550,6 +556,8 @@ GOACC_update (int device, size_t mapnum,
goacc_wait (async, num_waits, &ap);
va_end (ap);
}
+ else if (num_waits == acc_async_noval)
+ acc_wait_all_async (async);
acc_dev->openacc.async_set_async_func (async);
Why can't we just pass "GOMP_ASYNC_NOVAL" through like any other+++ gcc/omp-expand.c (working copy)
@@ -7381,16 +7381,32 @@ expand_omp_target (struct omp_region *region)
/* ... push a placeholder. */
args.safe_push (integer_zero_node);
+ bool noval_seen = false;
+ tree noval = build_int_cst (integer_type_node, GOMP_ASYNC_NOVAL);
+
for (; c; c = OMP_CLAUSE_CHAIN (c))
if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_WAIT)
{
+ tree wait_expr = OMP_CLAUSE_WAIT_EXPR (c);
+
+ if (TREE_CODE (wait_expr) == INTEGER_CST
+ && tree_int_cst_compare (wait_expr, noval) == 0)
+ {
+ noval_seen = true;
+ continue;
+ }
+
args.safe_push (fold_convert_loc (OMP_CLAUSE_LOCATION (c),
- integer_type_node,
- OMP_CLAUSE_WAIT_EXPR (c)));
+ integer_type_node, wait_expr));
num_waits++;
}
- if (!tagging || num_waits)
+ if (noval_seen && num_waits == 0)
+ args[t_wait_idx] =
+ (tagging
+ ? oacc_launch_pack (GOMP_LAUNCH_WAIT, NULL_TREE, GOMP_ASYNC_NOVAL)
+ : noval);
+ else if (!tagging || num_waits)
{
tree len;
--- include/gomp-constants.h (revision 263981)
+++ include/gomp-constants.h (working copy)
@@ -221,13 +221,14 @@ enum gomp_map_kind
#define GOMP_LAUNCH_CODE_SHIFT 28
#define GOMP_LAUNCH_DEVICE_SHIFT 16
#define GOMP_LAUNCH_OP_SHIFT 0
+#define GOMP_LAUNCH_OP_MASK 0xffff
#define GOMP_LAUNCH_PACK(CODE,DEVICE,OP) \
(((CODE) << GOMP_LAUNCH_CODE_SHIFT) \
| ((DEVICE) << GOMP_LAUNCH_DEVICE_SHIFT) \
- | ((OP) << GOMP_LAUNCH_OP_SHIFT))
+ | (((OP) & GOMP_LAUNCH_OP_MASK) << GOMP_LAUNCH_OP_SHIFT))
#define GOMP_LAUNCH_CODE(X) (((X) >> GOMP_LAUNCH_CODE_SHIFT) & 0xf)
#define GOMP_LAUNCH_DEVICE(X) (((X) >> GOMP_LAUNCH_DEVICE_SHIFT) & 0xfff)
-#define GOMP_LAUNCH_OP(X) (((X) >> GOMP_LAUNCH_OP_SHIFT) & 0xffff)
+#define GOMP_LAUNCH_OP(X) (((X) >> GOMP_LAUNCH_OP_SHIFT) & GOMP_LAUNCH_OP_MASK)
#define GOMP_LAUNCH_OP_MAX 0xffff
/* Bitmask to apply in order to find out the intended device of a target
--- libgomp/oacc-parallel.c (revision 263981)
+++ libgomp/oacc-parallel.c (working copy)
@@ -194,10 +194,14 @@ GOACC_parallel_keyed (int device, void (*fn) (void
{
- unsigned num_waits = GOMP_LAUNCH_OP (tag);
+ /* Be careful to cast the op field as a signed 16-bit, and
+ sign-extend to full integer. */
+ int num_waits = ((signed short) GOMP_LAUNCH_OP (tag));
- if (num_waits)
+ if (num_waits > 0)
goacc_wait (async, num_waits, &ap);
+ else if (num_waits == acc_async_noval)
+ acc_wait_all_async (async);
break;
}
@@ -351,7 +355,7 @@ GOACC_enter_exit_data (int device, size_t mapnum,
|| host_fallback)
return;
- if (num_waits)
+ if (num_waits > 0)
{
va_list ap;
@@ -359,6 +363,8 @@ GOACC_enter_exit_data (int device, size_t mapnum,
goacc_wait (async, num_waits, &ap);
va_end (ap);
}
+ else if (num_waits == acc_async_noval)
+ acc_wait_all_async (async);
/* Determine whether "finalize" semantics apply to all mappings of this
OpenACC directive. */
@@ -542,7 +548,7 @@ GOACC_update (int device, size_t mapnum,
|| host_fallback)
return;
- if (num_waits)
+ if (num_waits > 0)
{
va_list ap;
@@ -550,6 +556,8 @@ GOACC_update (int device, size_t mapnum,
goacc_wait (async, num_waits, &ap);
va_end (ap);
}
+ else if (num_waits == acc_async_noval)
+ acc_wait_all_async (async);
acc_dev->openacc.async_set_async_func (async);
async-argument (that is, map a single "wait" clause to "num_waits == 1,
*ap == GOMP_ASYNC_NOVAL"), and then handle that case in "goacc_wait",
avoiding all these interface changes and special casing in different
functions?
Or am I not understanding correctly what the purpose of this is?
My understanding is that before, GCC never generates "negative
async-arguments" (now used for "GOMP_ASYNC_NOVAL"), but only non-negative
ones (real "async-arguments"), which we continue to handle, as before.
Isn't that sufficient for the ABI compatibility that we promise, which is
(unless I'm confused now?) that old (existing) executables continue to
run correctly when dynamically linking against a new libgomp. Or do we
also have to care about the case that an executable built with a new
version of GCC has to work when dynamically linked against an old
libgomp?
Grüße
Thomas