Discussion:
[PATCH] Create internally nul terminated string literals in Fortran FE
Bernd Edlinger
2018-08-01 11:32:43 UTC
Hi,

this patch changes the Fortran FE to create NUL-terminated STRING_CST
objects. This is a cleanup in preparation for a more thorough check
of the STRING_CST objects in the middle-end.


Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.
Bernd Edlinger
2018-08-08 16:24:14 UTC
Hi,

I'd like to ping this patch: https://gcc.gnu.org/ml/fortran/2018-08/msg00000.html

I attach a new version, which contains only a minor white-space change from
the previous version, in the function header of gfc_build_hollerith_string_const
to contain "static tree" on one line instead of two.

Thanks
Bernd.
Bernd Edlinger
2018-08-24 20:06:47 UTC
Hi!


This is an alternative approach to handle overlength strings in the Fortran FE.

The difference from the previous version is that an overlength
STRING_CST never has a TREE_STRING_LENGTH longer than its TYPE_DOMAIN.
Those STRING_CSTs are thus no longer zero-terminated.

The requirement that all string constants be internally zero-terminated
is dropped.


Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.
Janne Blomqvist
2018-09-03 19:25:24 UTC
Post by Bernd Edlinger
Hi!
This is an alternative approach to handle overlength strings in the Fortran FE.
Hi,

can you explain a little more what the problem that this patch tries to
solve is? What is an "overlength" string?
--
Janne Blomqvist
Bernd Edlinger
2018-09-04 07:05:23 UTC
Post by Janne Blomqvist
Post by Bernd Edlinger
Hi!
This is an alternative approach to handle overlength strings in the
Fortran FE.
Hi,
can you explain a little more what the problem that this patch tries to
solve is? What is an "overlength" string?
In the middle-end STRING_CST objects have a TYPE_DOMAIN
which specifies how much memory the string constant uses,
and what kind of characters the string constant consists of,
and a TREE_STRING_LENGTH which specifies how many
bytes the string value contains.

Everything is fine, if both sizes agree, or the memory size
is larger than the string length, in which case the string is simply
padded with zero bytes to the full length.
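
The unproblematic case above can be sketched in C. This is only an illustration: struct str_const and emit_padded are hypothetical names, not GCC's tree representation, and mem_size stands in for the memory size implied by the TYPE_DOMAIN.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a STRING_CST; not GCC's real tree layout. */
struct str_const {
  const char *value;  /* bytes of the string value */
  size_t length;      /* plays the role of TREE_STRING_LENGTH */
  size_t mem_size;    /* memory size implied by the type (assumption) */
};

/* Write the in-memory image when the value fits:
   copy the value, then pad with zero bytes up to mem_size. */
static void emit_padded(const struct str_const *s, unsigned char *out)
{
  assert(s->length <= s->mem_size);  /* only the "fine" case is handled */
  memcpy(out, s->value, s->length);
  memset(out + s->length, 0, s->mem_size - s->length);
}
```

With a 3-byte value and a 5-byte memory size, the last two bytes of the image are zero.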

But things get unnecessarily complicated if the memory size
is smaller than the string length.

In this situation we have two different use cases of STRING_CST
which have contradicting rules:

For string literals and flexible arrays the memory size is ignored
and the TREE_STRING_LENGTH is used to specify both the
string length and the memory size. Fortran does not use those.

For a STRING_CST used in a CONSTRUCTOR of a string object,
the TREE_STRING_LENGTH is ignored, and only the part of the
string value that fits into the memory size is used. The situation
is similar to excess-precision floating-point values.
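
A rough C model of the two contradicting rules, under the assumption that a string constant is reduced to a value, a length and a memory size. All names here are hypothetical and chosen for illustration; none of them are GCC functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a STRING_CST; not GCC's real tree layout. */
struct str_const {
  const char *value;
  size_t length;    /* plays the role of TREE_STRING_LENGTH */
  size_t mem_size;  /* memory size implied by the type (assumption) */
};

/* Rule 1 (string literals, flexible arrays): for an overlength
   constant the memory size is ignored and the length decides
   how many bytes the object occupies. */
static size_t bytes_as_literal(const struct str_const *s)
{
  assert(s != NULL);
  return s->length;
}

/* Rule 2 (inside a CONSTRUCTOR): the length is ignored and the
   object occupies exactly the memory size. */
static size_t bytes_in_constructor(const struct str_const *s)
{
  assert(s != NULL);
  return s->mem_size;
}

/* Under rule 2 only the leading part of the value is used. */
static size_t value_bytes_used_in_constructor(const struct str_const *s)
{
  assert(s != NULL);
  return s->length < s->mem_size ? s->length : s->mem_size;
}
```

For an overlength constant the two rules give different answers for the same object, which is the source of the trouble described here.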

Now it happens that the middle-end sees a STRING_CST with
overlength and wants to know if the string constant is properly
zero-terminated, and it is impossible to tell, since any nul byte
at the end of the string value might be part of the ignored excess
precision, but this depends on where the string constant actually
came from.
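
To make the ambiguity concrete, here is a small C sketch with one termination check per interpretation. The struct and both predicates are hypothetical; they only model the reasoning above and do not correspond to any middle-end function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a STRING_CST; not GCC's real tree layout. */
struct str_const {
  const char *value;
  size_t length;    /* plays the role of TREE_STRING_LENGTH */
  size_t mem_size;  /* memory size implied by the type (assumption) */
};

/* Literal view: all `length` bytes count, so a trailing nul
   byte in the value terminates the string. */
static bool terminated_as_literal(const struct str_const *s)
{
  assert(s != NULL);
  return s->length > 0 && s->value[s->length - 1] == '\0';
}

/* CONSTRUCTOR view: bytes beyond mem_size are ignored, so only
   a nul byte inside the used part counts. */
static bool terminated_in_constructor(const struct str_const *s)
{
  assert(s != NULL);
  size_t used = s->length < s->mem_size ? s->length : s->mem_size;
  return memchr(s->value, '\0', used) != NULL;
}
```

For a value of 'a', 'b', nul with length 3 and memory size 2, the first predicate reports a terminated string and the second does not. Without knowing where the constant came from, neither answer can be preferred.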

Therefore I started an effort to sanitize STRING_CST objects via
an assertion in varasm.c, where most string constants
eventually pass through. It triggered in two Fortran test cases,
and in a few other languages of course.

This is what this patch tries to fix.

Bernd.
Janne Blomqvist
2018-09-05 18:16:03 UTC
I guess, I'm slightly confused why this mismatch happens in the first place
(does the Fortran frontend do something dumb wrt string declarations, or?),
but, Ok for trunk.
--
Janne Blomqvist
Bernd Edlinger
2018-09-06 11:29:17 UTC
Post by Janne Blomqvist
I guess, I'm slightly confused why this mismatch happens in the first place (does the Fortran frontend do something dumb wrt string declarations, or?), but, Ok for trunk.
This happens only in the test case that is mentioned in the comment.
If I remember correctly, the string constant is 3 characters long, and so is the
type info on the STRING_CST itself, but the type of the object has only 2 bytes
of space for the string. Therefore the patch makes the string shorter and uses the original type from
the declaration.
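
The shortening step can be pictured with the following C sketch. It is an assumption-laden model, not the patch itself: struct str_const and clamp_to_declared_size are hypothetical names, and mem_size stands for the space given by the declared type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a STRING_CST; not GCC's real tree layout. */
struct str_const {
  const char *value;
  size_t length;    /* plays the role of TREE_STRING_LENGTH */
  size_t mem_size;  /* space given by the declared type (assumption) */
};

/* Return a copy whose length never exceeds the declared space.
   The result may no longer end in a nul byte. */
static struct str_const clamp_to_declared_size(struct str_const s)
{
  assert(s.value != NULL);
  if (s.length > s.mem_size)
    s.length = s.mem_size;
  return s;
}
```

Applied to a 3-character constant whose object has room for 2 bytes, the result has length 2 and its last used byte is an ordinary character, not a terminator.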

I am going to apply this together with the rest of the STRING_CST semantic patches,
once those are
