digitalmars.D.bugs - [Issue 10750] New: Strict aliasing semantics

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10750

           Summary: Strict aliasing semantics
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: code klickverbot.at


--- Comment #0 from David Nadlinger <code klickverbot.at> 2013-08-03 04:06:41
PDT ---
From a discussion on dm.D
(http://forum.dlang.org/post/kt026a$256e$1 digitalmars.com):

–––
On Saturday, 27 July 2013 at 08:59:54 UTC, Walter Bright wrote:
> On 7/27/2013 1:57 AM, David Nadlinger wrote:
> > On Saturday, 27 July 2013 at 06:58:04 UTC, Walter Bright wrote:
> > > Although it isn't in the spec, D should be "strict aliasing".
> > > This is because:
> > >
> > > 1. it enables better code generation
> > >
> > > 2. there are ways, such as unions, to get the other aliasing
> > > that doesn't break strict aliasing
> >
> > We need to carefully formalize this then, and quickly. The problem GCC,
> > Clang and others are facing is that (as you are probably aware) 2. isn't
> > guaranteed to work for type-casting pointers either by the specs, but
> > people want to be able to do this nonetheless. Thus, they both accept
> > pointer aliasing through union types, trying to optimize as much as
> > possible while avoiding breaking people's expectations and existing code.
> >
> > This is a very unfortunate situation for both compiler developers and
> > users; just search for something like "gcc strict aliasing" on
> > StackOverflow for examples. There is already quite a lot of D code out
> > there that violates the C-style strict aliasing rules.
>
> I agree. Want to do an enhancement request on bugzilla for it?
–––

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 03 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10750


bearophile_hugs eml.cc changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bearophile_hugs eml.cc


--- Comment #1 from bearophile_hugs eml.cc 2013-08-03 04:37:51 PDT ---
(In reply to comment #0)

> (http://forum.dlang.org/post/kt026a$256e$1 digitalmars.com):

My comments were:

Is it good to add to Phobos a small template (named something like
"PointerCast") that uses a union internally to perform pointer type
conversions?

Is the compiler then going to warn the programmer when the pointer type
aliasing rule is violated? I mean when the D code uses cast() between different
pointer types (besides constness).

An alternative design is to even deprecate such casts (and later turn them into
errors, where the error message suggests using PointerCast).
Aug 03 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10750


Johannes Pfau <johannespfau gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |johannespfau gmail.com


--- Comment #2 from Johannes Pfau <johannespfau gmail.com> 2013-11-02 13:39:44
PDT ---
Such a PointerCast is not safe in all cases when compiling with GDC as even
unions are not an exception to strict aliasing rules:
http://stackoverflow.com/questions/2906365/gcc-strict-aliasing-and-casting-through-a-union

I'm not sure if it's possible to change this in the GDC frontend.

Nov 02 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10750



--- Comment #3 from Johannes Pfau <johannespfau gmail.com> 2013-11-03 02:13:39
PST ---
@bearophile:
To further expand on this:
http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/Optimize-Options.html
says:
"type-punning is allowed, provided the memory is accessed through the union
type. [...] access by taking the address, casting the resulting pointer and
dereferencing the result has undefined behavior, even if the cast uses a union
type, e.g.: "
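
A minimal D sketch of the distinction the quoted passage draws. This is not
the example from the GCC manual; the names IntOrFloat, bitsToFloat and
bitsToFloatViaCast are hypothetical, and whether GDC treats the second form
exactly as GCC treats the equivalent C code is an assumption.

```d
// Hypothetical illustration; none of these names exist in Phobos.
union IntOrFloat
{
    int i;
    float f;
}

// Type punning where the memory is accessed through the union type itself.
float bitsToFloat(int bits)
{
    IntOrFloat u;
    u.i = bits;   // write one member ...
    return u.f;   // ... read the other, still through the union object
}

// The pattern the quoted passage warns about: take the address, cast the
// pointer, dereference. Under strict aliasing an optimizer may assume that
// an int object and a float lvalue never refer to the same memory.
float bitsToFloatViaCast(int bits)
{
    return *cast(float*)&bits;
}

void main()
{
    // 0x3F80_0000 is the IEEE 754 single-precision bit pattern of 1.0
    assert(bitsToFloat(0x3F80_0000) == 1.0f);
}
```

Only the first function is exercised in main, because the second is exactly
the kind of access whose result may depend on the optimizer.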

@David:
What would a safe cast with strict pointer aliasing look like?

First some background information on how aliasing is implemented in gcc
(alias.c): Every type is assigned an alias set. The alias set is only a unique
id + a flattened list of the uids of all 'member types'. For example, this
struct:
----------------------
struct B
{
    char member;
}
struct A
{
    int member1;
    float member2;
    B member3;
}
----------------------

will generate this alias set:
uid=1, children={2(int),3(float),4(char)}

Then for code like this:
----------------------
A instance;
instance.member1 = 0;
A copy = instance;
----------------------

The compiler now inspects the line instance.member1 = 0; and assigns alias set
2(int) to it. Line 3 has alias set 1(A). When gcc schedules instructions, it
checks whether set 2 conflicts with set 1 by testing: (set1 == set2 || set1 in
set2.children || set2 in set1.children). If they don't conflict, gcc may
reorder the instructions.
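
The conflict test described above can be sketched in D as follows. This is a
simplified model for illustration only: AliasSet and conflicts are
hypothetical names, and GCC's real data structures in alias.c are more
involved.

```d
import std.algorithm.searching : canFind;

// Simplified, hypothetical model of an alias set: a unique id plus the
// flattened ids of all member types.
struct AliasSet
{
    int uid;
    int[] children;
}

// Two accesses conflict if their sets are identical or if one set appears
// among the children of the other.
bool conflicts(AliasSet a, AliasSet b)
{
    return a.uid == b.uid
        || b.children.canFind(a.uid)
        || a.children.canFind(b.uid);
}

void main()
{
    auto setA     = AliasSet(1, [2, 3, 4]); // struct A: int, float, char
    auto setInt   = AliasSet(2);            // no children
    auto setFloat = AliasSet(3);            // no children

    // An int store and a copy of the whole struct conflict: order is kept.
    assert(conflicts(setInt, setA));

    // An int access and a float access do not conflict: they may be reordered.
    assert(!conflicts(setInt, setFloat));
}
```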


This explains the problems with type punning:
----------------------
int a = 3;                    //alias set 0(int), children = {}
int b = a;                    //alias set 0(int), children = {}
*(cast(float*)&a) = 3.0f;     //alias set 1(float), children = {}
----------------------
As you can see, these types don't conflict, so gcc may reorder lines 2 and 3.
Access through unions solves this problem, as the alias set for a union
would include both {float, int} as children.

But as far as I understand, these strict aliasing rules make it impossible to
safely cast from one pointer type to another. Only _access_ through unions will
work.

As an example:

----------------------
T* safeCast(T, U)(U* input)
{
    union wrap
    {
        U inp;
        T outp;
    }

    return &(cast(wrap*)input).outp;
}

void withFloat(float* f)
{
    *f = 0.1f;
}

int b;
void withInt(int* i)
{
    b = *i;
}

void main()
{
    int x = 0;
    auto asFloat = safeCast!float(&x);
    withFloat(asFloat);
    withInt(&x);
}
----------------------

Now, with optimizations (inlining), this effectively becomes:
------------------------------------
union wrap
{
    int inp;
    float outp;
}

int b;
void main()
{
    int x = 0;                            //alias set: int
    auto asFloat = &(cast(wrap*)&x).outp; //alias set: wrap (but noop)
    *asFloat = 0.1f;                      //alias set: float
    b = x;                                //alias set: int
}
------------------------------------
I know from unfortunate experience that gcc may even completely discard the
"auto asFloat" line. But even if it didn't, "*asFloat = 0.1f;" and "b = x;" can
be reordered according to strict aliasing rules. If "auto asFloat" is
discarded, even "int x = 0;" and "*asFloat = 0.1f;" may be reordered.


So to summarize this: I don't know how you could make a safe cast from T* to U*
assuming strict aliasing rules. Unions are only safe if all access goes through
unions, but that is not possible when dealing with 3rd party functions. (Assume
you can't change withFloat, withInt).

We had problems with this in GDC right now on ARM (std.algorithm.find uses
cast(ubyte[])string which internally translates to invalid pointer aliasing)
and as a result we'll now have to disable strict aliasing in the GCC backend.
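
For readers unfamiliar with the construct, here is a minimal sketch of the
kind of array cast mentioned above. The helper name asBytes is hypothetical,
and this is not the actual code from std.algorithm.find.

```d
// Hypothetical helper showing a reinterpreting array cast: the result views
// the same memory as the string, with a different element type.
immutable(ubyte)[] asBytes(string s)
{
    return cast(immutable(ubyte)[]) s;
}

void main()
{
    string s = "abc";
    auto bytes = asBytes(s);

    assert(bytes.length == 3);
    assert(bytes[0] == 'a');

    // No copy is made: both slices start at the same address.
    assert(cast(const(void)*) bytes.ptr is cast(const(void)*) s.ptr);
}
```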

I think type based aliasing, even if it may provide some optimization benefits,
is in general a horrible idea.

Nov 03 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10750


Iain Buclaw <ibuclaw ubuntu.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ibuclaw ubuntu.com


--- Comment #4 from Iain Buclaw <ibuclaw ubuntu.com> 2013-11-03 03:17:17 PST ---
(In reply to comment #3)
> We had problems with this in GDC right now on ARM (std.algorithm.find uses
> cast(ubyte[])string which internally translates to invalid pointer aliasing)
> and as a result we'll now have to disable strict aliasing in the GCC backend.

Which is a shame, because dynamic arrays are perhaps the one type in D that
should instead benefit from strict aliasing rules...
Nov 03 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10750



--- Comment #5 from Iain Buclaw <ibuclaw ubuntu.com> 2013-11-03 03:43:06 PST ---
(In reply to comment #4)
> (In reply to comment #3)
> > We had problems with this in GDC right now on ARM (std.algorithm.find uses
> > cast(ubyte[])string which internally translates to invalid pointer aliasing)
> > and as a result we'll now have to disable strict aliasing in the GCC backend.
>
> Which is a shame, because dynamic arrays are perhaps the one type in D that
> should instead benefit from strict aliasing rules...

Alternatively, we can just define better aliasing rules that better suit D,
i.e.:

- Permit type-punning when accessing through a union.
- Determine aliasing rules of dynamic arrays from the elem type, instead of
  treating it as aliasing the overall structure.

This might actually be the better solution for us - shall I send you a patch?
:o)
Nov 03 2013