D.gnu - Found one (the?) ARM bug: (Pointer) aliasing issues

Johannes Pfau (78/78) Nov 02 2013 I think I finally found the root cause for one of the remaining ARM

Iain Buclaw (12/90) Nov 02 2013 I guess D does not have any aliasing rules. Question: does this occur w...

Johannes Pfau (17/128) Nov 03 2013 Yes, -fno-strict-aliasing also fixes this problem. I'm starting a fresh
Johannes Pfau (4/6) Nov 03 2013 That should work. Alias set 0 is a special alias set which conflicts

Iain Buclaw (58/64) Nov 07 2013 This is taken from hunks in the 2.064 merge I'm testing:

Johannes Pfau (43/60) Nov 07 2013 Some probably stupid questions about this patch:

Iain Buclaw (59/118) Nov 08 2013 That would be seen as two distinct alias sets that would break strict

Johannes Pfau (26/102) Nov 08 2013 Now I get what you mean. AFAIK the backend already strips type

Iain Buclaw (23/125) Nov 09 2013 I believe the difference is char* vs immutable(char)* - which a) is two
Iain Buclaw (5/5) Nov 10 2013 By the way - I've gotten round to upgrading my trimslice to precise - *n...
Iain Buclaw (3/5) Dec 05 2013 We now just assume everything aliases everything else:
Iain Buclaw (3/9) Dec 05 2013 Fixed that unexpected sign-extend conversion bug:
Iain Buclaw (4/15) Dec 16 2013 Actually fixed that duplicate definition problem last week. Just got

Iain Buclaw (5/8) Nov 08 2013 s/_d_arraycopy/_d_arraycast/

Johannes Pfau <nospam example.com> writes:

I think I finally found the root cause for one of the remaining ARM
bugs (the codegen bug which only appears on -O2 or higher).

What happened in the test case was that the gcc backend illegally moved
a read to a memory location before the write. It turns out that GCC
assumed that the read and write locations were in different alias sets
and therefore could not possibly reference the same memory location. One
alias set was for 'ubyte[]' and one for 'char[]'.

The code that triggers this issue is in std.algorithm.find:
--------
R1 find(...)(R1 haystack, R2 needle) if (/* are strings*/)
{
    return cast(char[]) .find!(ubyte[], ubyte[])
        (cast(ubyte[]) haystack, cast(ubyte[])needle);
}
--------
(Real code:
https://github.com/D-Programming-GDC/GDC/blob/master/libphobos/src/std/algorithm.d#L3555
)

The generic emitted by gdc:
--------
return <retval> = *(struct  *) &find (*(struct  *) &haystack, *(struct
*) &needle);
--------

Or in the raw form:
 44 = ubyte[]
 11 = string
--------
 11     record_type      name:  17      size:  18      algn: 32      
                         tag : struct   flds:  19   
 31     pointer_type     size:  15      algn: 32       ptd :  11    
 40     pointer_type     size:  15      algn: 32       ptd :  44   
 41     call_expr        type:  44      fn  :  45      0   :  46     
                         1   :  47
 44     record_type      name:  49      size:  18      algn:  
                         tag : struct   flds:  50    
 46     indirect_ref     type:  44      op 0:  53     
 47     indirect_ref     type:  44      op 0:  54    
 53     nop_expr         type:  40      op 0:  58     
 54     nop_expr         type:  40      op 0:  59    
 58     addr_expr        type:  31      op 0:  30     
 59     addr_expr        type:  31      op 0:  61 
--------

Here it's easy to see that we essentially generate this code:
*(cast(ubyte[]*)(&haystack))
and this is AFAIK a violation of the aliasing rules.

This problem is not observed at -O1 as the function call generates a
new stackframe and we therefore have two distinct memory locations. But
with -O2 inlining removes this copy and we now have variables
referencing the same memory location with type ubyte[] and char[].

I can not provide a reduced testcase as the smallest changes in
compiler codegen (gcc version, gdc commit) or the test case can hide
this issue. It's always reproducible with test15.d in the test suite.
(In older gdc versions the test case does not segfault but it produces
wrong output at -O2. This can be a very subtle difference)

But here's a working code snippet which illustrates the generic
generated by GDC:

-----------------
void main()
{
	char[] in1 = "Test".dup;
        char[] in2 = "Test2".dup;

        char[] result = cast(char[])find(cast(ubyte[])in1,
            cast(ubyte[])in2);
}

ubyte[] find(ubyte[] a, ubyte[] b)
{
    return a;
}
-----------------


So the important question here: Is this a bug in GDC codegen or is the
code in std.algorithm invalid? According to
http://dlang.org/expression.html
"The cast is done as a type paint" so this could indeed be interpreted
as a user mistake. But OTOH that page also talks about a runtime check
of the array .lengths which is clearly missing here.

I'm also wondering if that runtime check can actually fix this
aliasing issue or if it can come up again if the runtime check itself
is inlined?

Nov 02 2013

Iain Buclaw <ibuclaw ubuntu.com> writes:

On 2 November 2013 20:07, Johannes Pfau <nospam example.com> wrote:

 I think I finally found the root cause for one of the remaining ARM
 bugs (the codegen bug which only appears on -O2 or higher).

 What happened in the test case was that the gcc backend illegally moved
 a read to a memory location before the write. It turns out that GCC
 assumed that the read and write locations were in different alias sets
 and therefore could not possibly reference the same memory location. One
 alias set was for 'ubyte[]' and one for 'char[]'.

 The code that triggers this issue is in std.algorithm.find:
 --------
 R1 find(...)(R1 haystack, R2 needle) if (/* are strings*/)
 {
     return cast(char[]) .find!(ubyte[], ubyte[])
         (cast(ubyte[]) haystack, cast(ubyte[])needle);
 }
 --------
 (Real code:

 https://github.com/D-Programming-GDC/GDC/blob/master/libphobos/src/std/algorithm.d#L3555
 )

 The generic emitted by gdc:
 --------
 return <retval> = *(struct  *) &find (*(struct  *) &haystack, *(struct
 *) &needle);
 --------

 Or in the raw form:
  44 = ubyte[]
  11 = string
 --------
  11     record_type      name:  17      size:  18      algn: 32
                          tag : struct   flds:  19
  31     pointer_type     size:  15      algn: 32       ptd :  11
  40     pointer_type     size:  15      algn: 32       ptd :  44
  41     call_expr        type:  44      fn  :  45      0   :  46
                          1   :  47
  44     record_type      name:  49      size:  18      algn:
                          tag : struct   flds:  50
  46     indirect_ref     type:  44      op 0:  53
  47     indirect_ref     type:  44      op 0:  54
  53     nop_expr         type:  40      op 0:  58
  54     nop_expr         type:  40      op 0:  59
  58     addr_expr        type:  31      op 0:  30
  59     addr_expr        type:  31      op 0:  61
 --------

 Here it's easy to see that we essentially generate this code:
 *(cast(ubyte[]*)(&haystack))
 and this is AFAIK a violation of the aliasing rules.

 This problem is not observed at -O1 as the function call generates a
 new stackframe and we therefore have two distinct memory locations. But
 with -O2 inlining removes this copy and we now have variables
 referencing the same memory location with type ubyte[] and char[].

 I can not provide a reduced testcase as the smallest changes in
 compiler codegen (gcc version, gdc commit) or the test case can hide
 this issue. It's always reproducible with test15.d in the test suite.
 (In older gdc versions the test case does not segfault but it produces
 wrong output at -O2. This can be a very subtle difference)

 But here's a working code snippet which illustrates the generic
 generated by GDC:

 -----------------
 void main()
 {
         char[] in1 = "Test".dup;
         char[] in2 = "Test2".dup;

         char[] result = cast(char[])find(cast(ubyte[])in1,
             cast(ubyte[])in2);
 }

 ubyte[] find(ubyte[] a, ubyte[] b)
 {
     return a;
 }
 -----------------


 So the important question here: Is this a bug in GDC codegen or is the
 code in std.algorithm invalid? According to
 http://dlang.org/expression.html
 "The cast is done as a type paint" so this could indeed be interpreted
 as a user mistake. But OTOH that page also talks about a runtime check
 of the array .lengths which is clearly missing here.

 I'm also wondering if that runtime check can actually fix this
 aliasing issue or if it can come up again if the runtime check itself
 is inlined?


I guess D does not have any aliasing rules.  Question: does this occur when
you pass -fno-strict-aliasing?

By default, GDC defines -fstrict-aliasing (which is also enabled by default
in -O2 in GCC), which is nice to have, but if it's proving to be
non-trivial, we can always disallow the optimisation being done in the
middle-end.  See the LANG_HOOKS_GET_ALIAS_SET langhook - last time I
checked, returning 0 disables aliasing rules from taking effect.

Regards.
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';

Nov 02 2013

Johannes Pfau <nospam example.com> writes:

Am Sun, 3 Nov 2013 02:10:20 +0000
schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 On 2 November 2013 20:07, Johannes Pfau <nospam example.com> wrote:
 
 I think I finally found the root cause for one of the remaining ARM
 bugs (the codegen bug which only appears on -O2 or higher).

 What happened in the test case was that the gcc backend illegally
 moved a read to a memory location before the write. It turns out
 that GCC assumed that the read and write locations were in
 different alias sets and therefore could not possibly reference the
 same memory location. One alias set was for 'ubyte[]' and one for
 'char[]'.

 The code that triggers this issue is in std.algorithm.find:
 --------
 R1 find(...)(R1 haystack, R2 needle) if (/* are strings*/)
 {
     return cast(char[]) .find!(ubyte[], ubyte[])
         (cast(ubyte[]) haystack, cast(ubyte[])needle);
 }
 --------
 (Real code:

 https://github.com/D-Programming-GDC/GDC/blob/master/libphobos/src/std/algorithm.d#L3555
 )

 The generic emitted by gdc:
 --------
 return <retval> = *(struct  *) &find (*(struct  *) &haystack,
 *(struct *) &needle);
 --------

 Or in the raw form:
  44 = ubyte[]
  11 = string
 --------
  11     record_type      name:  17      size:  18      algn: 32
                          tag : struct   flds:  19
  31     pointer_type     size:  15      algn: 32       ptd :  11
  40     pointer_type     size:  15      algn: 32       ptd :  44
  41     call_expr        type:  44      fn  :  45      0   :  46
                          1   :  47
  44     record_type      name:  49      size:  18      algn:
                          tag : struct   flds:  50
  46     indirect_ref     type:  44      op 0:  53
  47     indirect_ref     type:  44      op 0:  54
  53     nop_expr         type:  40      op 0:  58
  54     nop_expr         type:  40      op 0:  59
  58     addr_expr        type:  31      op 0:  30
  59     addr_expr        type:  31      op 0:  61
 --------

 Here it's easy to see that we essentially generate this code:
 *(cast(ubyte[]*)(&haystack))
 and this is AFAIK a violation of the aliasing rules.

 This problem is not observed at -O1 as the function call generates a
 new stackframe and we therefore have two distinct memory locations.
 But with -O2 inlining removes this copy and we now have variables
 referencing the same memory location with type ubyte[] and char[].

 I can not provide a reduced testcase as the smallest changes in
 compiler codegen (gcc version, gdc commit) or the test case can hide
 this issue. It's always reproducible with test15.d in the test
 suite. (In older gdc versions the test case does not segfault but
 it produces wrong output at -O2. This can be a very subtle
 difference)

 But here's a working code snippet which illustrates the generic
 generated by GDC:

 -----------------
 void main()
 {
         char[] in1 = "Test".dup;
         char[] in2 = "Test2".dup;

         char[] result = cast(char[])find(cast(ubyte[])in1,
             cast(ubyte[])in2);
 }

 ubyte[] find(ubyte[] a, ubyte[] b)
 {
     return a;
 }
 -----------------


 So the important question here: Is this a bug in GDC codegen or is
 the code in std.algorithm invalid? According to
 http://dlang.org/expression.html
 "The cast is done as a type paint" so this could indeed be
 interpreted as a user mistake. But OTOH that page also talks about
 a runtime check of the array .lengths which is clearly missing here.

 I'm also wondering if that runtime check can actually fix this
 aliasing issue or if it can come up again if the runtime check
 itself is inlined?

 
 
 I guess D does not have any aliasing rules.  Question: does this
 occur when you pass -fno-strict-aliasing?
 
 By default, GDC defines -fstrict-aliasing (which is also enabled by
 default in -O2 in GCC), which is nice to have, but if it's proving to
 be non-trivial, we can always disallow the optimisation being done in
 the middle-end.  See the LANG_HOOKS_GET_ALIAS_SET langhook - last
 time I checked, returning 0 disables aliasing rules from taking
 effect.
 
 Regards.

Yes, -fno-strict-aliasing also fixes this problem. I'm starting a fresh
build on ARM right now to see if this fixes the remaining problems in
the test suite.

I think we should really disable strict aliasing. While it may provide
some nice optimization it's almost impossible for the user to do type
punning / casting with strict aliasing rules. I looked into the gcc
aliasing implementation (alias.c) and the only safe way is (as the
documentation says) to use unions. But then all access has to go
through the union. The often proposed helper functions which just cast
a value through the union are therefore illegal and will fail under
certain circumstances (e.g. if they're inlined). As optimization
(inlining, dead code / dead store elimination and target architecture)
also heavily changes the effects of pointer aliasing it's almost
impossible for a developer to write a safe type cast.

(See also http://d.puremagic.com/issues/show_bug.cgi?id=10750)

Nov 03 2013

Johannes Pfau <nospam example.com> writes:

Am Sun, 3 Nov 2013 02:10:20 +0000
schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 last time I
 checked, returning 0 disables aliasing rules from taking effect.

That should work. Alias set 0 is a special alias set which conflicts
with everything. I'll check if it works as expected.

Nov 03 2013

Iain Buclaw <ibuclaw ubuntu.com> writes:

On 3 November 2013 10:20, Johannes Pfau <nospam example.com> wrote:

 Am Sun, 3 Nov 2013 02:10:20 +0000
 schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 last time I
 checked, returning 0 disables aliasing rules from taking effect.

 That should work. Alias set 0 is a special alias set which conflicts
 with everything. I'll check if it works as expected.


This is taken from hunks in the 2.064 merge I'm testing:

Pastebin link here:  http://pastebin.com/jxQQL68N


diff --git a/gcc/d/d-lang.cc b/gcc/d/d-lang.cc
index 8cba369..a19a59b 100644
--- a/gcc/d/d-lang.cc
+++ b/gcc/d/d-lang.cc
   -73,6 +73,7    const attribute_spec d_attribute_table[] =
 #undef LANG_HOOKS_COMMON_ATTRIBUTE_TABLE
 #undef LANG_HOOKS_ATTRIBUTE_TABLE
 #undef LANG_HOOKS_FORMAT_ATTRIBUTE_TABLE
+#undef LANG_HOOKS_GET_ALIAS_SET
 #undef LANG_HOOKS_TYPES_COMPATIBLE_P
 #undef LANG_HOOKS_BUILTIN_FUNCTION
 #undef LANG_HOOKS_BUILTIN_FUNCTION_EXT_SCOPE
   -95,6 +96,7    const attribute_spec d_attribute_table[] =
 #define LANG_HOOKS_COMMON_ATTRIBUTE_TABLE       d_builtins_attribute_table
 #define LANG_HOOKS_ATTRIBUTE_TABLE              d_attribute_table
 #define LANG_HOOKS_FORMAT_ATTRIBUTE_TABLE      d_format_attribute_table
+#define LANG_HOOKS_GET_ALIAS_SET               d_get_alias_set
 #define LANG_HOOKS_TYPES_COMPATIBLE_P          d_types_compatible_p
 #define LANG_HOOKS_BUILTIN_FUNCTION            d_builtin_function
 #define LANG_HOOKS_BUILTIN_FUNCTION_EXT_SCOPE  d_builtin_function
   -1505,6 +1523,33    d_getdecls (void)
 }


+// Get the alias set corresponding to a type or expression.
+// Return -1 if we don't do anything special.
+
+static alias_set_type
+d_get_alias_set (tree t)
+{
+  if (!TYPE_P (t))
+    return get_alias_set (TREE_TYPE (t));
+
+  Type *dtype = build_dtype (t);
+
+  // If the type is a dynamic array, use the alias set of the basetype.
+  if (dtype && dtype->ty == Tarray)
+    return get_alias_set (dtype->nextOf()->toCtype());
+
+  // Permit type-punning when accessing a union, provided the access
+  // is directly through the union.
+  for (tree u = t; handled_component_p (u); u = TREE_OPERAND (u, 0))
+    {
+      if (TREE_CODE (u) == COMPONENT_REF
+         && TREE_CODE (TREE_TYPE (TREE_OPERAND (u, 0))) == UNION_TYPE)
+       return 0;
+    }
+
+  return -1;
+}
+
 static int
 d_types_compatible_p (tree t1, tree t2)
 {




-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';

Nov 07 2013

Johannes Pfau <nospam example.com> writes:

Am Thu, 7 Nov 2013 16:14:59 +0000
schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 On 3 November 2013 10:20, Johannes Pfau <nospam example.com> wrote:
 
 Am Sun, 3 Nov 2013 02:10:20 +0000
 schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 last time I
 checked, returning 0 disables aliasing rules from taking effect.

 That should work. Alias set 0 is a special alias set which conflicts
 with everything. I'll check if it works as expected.

 
 
 This is taken from hunks in the 2.064 merge I'm testing:
 
 Pastebin link here:  http://pastebin.com/jxQQL68N
 

Some probably stupid questions about this patch:

// If the type is a dynamic array, use the alias set of the basetype.

What exactly does happen in that case? The Tarray type is the
two-field type consisting of length and ptr, right? Currently
TypeDArray->toCtype constructs a two_field_type with size_t and
typeof(Element)*. So according to the C aliasing rules, the TypeDArray
alias set does already conflict with size_t and Element*. It does not
conflict with Element. But I don't know why it should conflict with
Element if we're talking about the slice type here. It would
allow code like this to work: "char[] a; char* b = (cast(char*)&a)" but
I don't see why this should work, it's illegal anyway?

Also, AFAICS it does not help with the problem in std.algorithm:
char[] a;
//cast(ubyte[])a generates:
*cast(ubyte[]*)&a;

Do you think this cast should be illegal in D?
I think if we want to support strict aliasing for the code above we'll
have to do what gcc does for pointers:

Put all array slices - regardless of element type - into the same alias
set and make size_t and void* subsets of this alias set.


// Permit type-punning when accessing a union

Isn't that already guaranteed by GCC? See:

Unions have all their member types added as subsets. So as long as the
reference is through the union GCC knows the union type and it'll
conflict with all member types.



But even if we make those changes to aliasing rules, we'll have to fix
many places in phobos. For example:
https://github.com/D-Programming-Language/phobos/blob/master/std/math.d#L1965
real value;
ushort* vu = cast(ushort*)&value;
AFAICS this will always be invalid with strict aliasing.

https://github.com/D-Programming-Language/phobos/blob/master/std/uuid.d#L468
casts ubyte[16]* to size_t* also illegal, AFAICS.

Are there any statistics about the performance improvements with strict
aliasing? I'm not really sold on the idea of strict aliasing, right now
it looks to me as if it's mainly a way to introduce subtle, hard to
debug and often latent bugs (As whether you really see a problem
depends on optimization)

http://stackoverflow.com/questions/1225741/performance-impact-of-fno-strict-aliasing

Nov 07 2013

Iain Buclaw <ibuclaw ubuntu.com> writes:

On 7 November 2013 18:05, Johannes Pfau <nospam example.com> wrote:

 Am Thu, 7 Nov 2013 16:14:59 +0000
 schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 On 3 November 2013 10:20, Johannes Pfau <nospam example.com> wrote:

 Am Sun, 3 Nov 2013 02:10:20 +0000
 schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 last time I
 checked, returning 0 disables aliasing rules from taking effect.

 That should work. Alias set 0 is a special alias set which conflicts
 with everything. I'll check if it works as expected.


 This is taken from hunks in the 2.064 merge I'm testing:

 Pastebin link here:  http://pastebin.com/jxQQL68N

 Some probably stupid questions about this patch:

 // If the type is a dynamic array, use the alias set of the basetype.

 What exactly does happen in that case? The Tarray type is the
 two-field type consisting of length and ptr, right? Currently
 TypeDArray->toCtype constructs a two_field_type with size_t and
 typeof(Element)*. So according to the C aliasing rules, the TypeDArray
 alias set does already conflict with size_t and Element*. It does not
 conflict with Element. But I don't know why it should conflict with
 Element if we're talking about the slice type here. It would
 allow code like this to work: "char[] a; char* b = (cast(char*)&a)" but
 I don't see why this should work, it's illegal anyway?

That would be seen as two distinct alias sets that would break strict
aliasing in that example.

Though, will have to implement -Wstrict-aliasing in the front-end to get
any feel for what could potentially be utterly wrong.  But the idea is that
for dynamic arrays, telling gcc to not rely on structural equality to
determine whether or not two dynamic arrays are part of the same alias set.

eg:
byte[] a, long[] b = *(cast (long[]*)&a) should be seen as being different
alias sets, and so *will* be about breaking strict aliasing.

In contrast, string[] a, char[] b = *cast(string[]*)&a) should be seen as
being part of the same alias set, and so the compiler must know that the
two types (which are distinct structures to the backend) could potentially
be referencing the same slice of memory, as to not cause any problems.

For people trying to work around the cast system for dynamic arrays, IMO
they should be punished for it, and told to do it in the correct way that
invokes _d_arraycopy, or do their unsafe work through unions.



 Also, AFAICS it does not help with the problem in std.algorithm:
 char[] a;
 //cast(ubyte[])a generates:
 *cast(ubyte[]*)&a;

 Do you think this cast should be illegal in D?
 I think if we want to support strict aliasing for the code above we'll
 have to do what gcc does for pointers:

 Put all array slices - regardless of element type - into the same alias
 set and make size_t and void* subsets of this alias set.


 // Permit type-punning when accessing a union

 Isn't that already guaranteed by GCC? See:

 Unions have all their member types added as subsets. So as long as the
 reference is through the union GCC knows the union type and it'll
 conflict with all member types.

There is no harm enforcing it in the front-end as well, even if it is just
there to speed up the process of returning what the backend will no doubt
return too.  There's also the (extremely) unlikely event that the guarantee
by GCC might be removed in a later version.


 But even if we make those changes to aliasing rules, we'll have to fix
 many places in phobos. For example:

 https://github.com/D-Programming-Language/phobos/blob/master/std/math.d#L1965
 real value;
 ushort* vu = cast(ushort*)&value;
 AFAICS this will always be invalid with strict aliasing.

Yep, as it should be.  std.math is a danger point for type punning between
pointers and reals, ensuring that type-punning/casting does not get DCE'd,
etc...  This needs to be fixed.


https://github.com/D-Programming-Language/phobos/blob/master/std/uuid.d#L468
 casts ubyte[16]* to size_t* also illegal, AFAICS.

 Are there any statistics about the performance improvements with strict
 aliasing? I'm not really sold on the idea of strict aliasing, right now
 it looks to me as if it's mainly a way to introduce subtle, hard to
 debug and often latent bugs (As whether you really see a problem
 depends on optimization)


 http://stackoverflow.com/questions/1225741/performance-impact-of-fno-strict-aliasing

Not that I'm aware of (other than Ada boasting it's use).  But I'd like to
push the opinion of - although it isn't in the spec, D should be strict
aliasing.  And people should be aware of the problems breaking strict
aliasing (see:
http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html)

But first... plan of attack:

- We first disable strict aliasing entirely (lang_hook.get_alias_set => 0)
- Implement -Wstrict-aliasing
- Start turning on strict aliasing for types guaranteed not to be
referencing the same memory location as other types (eg: TypeBasic,
TypeDelegate, TypeSArray, TypeVector).
- Identify gdc problems with implicit code generation that could break
strict aliasing (these are our bugs).
- Identify frontend/library problems that could break strict aliasing
(these are the dmd/phobos developer's bugs).
- Turn on strict aliasing for the remaining types.  For those that cause
problems, we can define a TYPE_LANG_FLAG macro to allow us to tell the
backend if the type can alias any other types.


I still stand by what I say on aliasing rules of D:
- Permit type-punning when accessing through a union
- Dynamic arrays of the same basetype (regardless of qualifiers) may alias
each other/occupy the same slice of memory.

Other possible considerations:

- Most D code pretty much assumes that any object may be accessed via a
void[] or void*.

- C standard allows aliasing between signed and unsigned variants.  It is
therefore likely not unreasonable to do the same for convenience.

- Infact, for the consideration of std.math.  It we could go one step
further and simply build up an alias set list based on the type size over
type distinction.  In this model double/long/byte[8]/short[4]/int[2] would
all be considered as types that could be referencing each other.

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';

Nov 08 2013

Johannes Pfau <nospam example.com> writes:

Am Fri, 8 Nov 2013 13:12:23 +0000
schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 On 7 November 2013 18:05, Johannes Pfau <nospam example.com> wrote:
 
 Am Thu, 7 Nov 2013 16:14:59 +0000
 schrieb Iain Buclaw <ibuclaw ubuntu.com>:
 type here. It would allow code like this to work: "char[] a; char*
 b = (cast(char*)&a)" but I don't see why this should work, it's
 illegal anyway?

 That would be seen as two distinct alias sets that would break strict
 aliasing in that example.
 
 Though, will have to implement -Wstrict-aliasing in the front-end to
 get any feel for what could potentially be utterly wrong.  But the
 idea is that for dynamic arrays, telling gcc to not rely on
 structural equality to determine whether or not two dynamic arrays
 are part of the same alias set.
 
 eg:
 byte[] a, long[] b = *(cast (long[]*)&a) should be seen as being
 different alias sets, and so *will* be about breaking strict aliasing.
 
 In contrast, string[] a, char[] b = *cast(string[]*)&a) should be
 seen as being part of the same alias set, and so the compiler must
 know that the two types (which are distinct structures to the
 backend) could potentially be referencing the same slice of memory,
 as to not cause any problems.

Now I get what you mean. AFAIK the backend already strips type
qualifiers when building alias sets, so char[] and string[] should
already have the same alias set. But if you want to explicitly
enforce this in GDC this makes sense.

 
 For people trying to work around the cast system for dynamic arrays,
 IMO they should be punished for it, and told to do it in the correct
 way that invokes _d_arraycopy, or do their unsafe work through unions.
 

OK. Although then we have to be very careful when implementing
_d_arraycast (but maybe we can just check if we're in _d_arraycast and
disable aliasing? Is this possible?).

 // Permit type-punning when accessing a union

 Isn't that already guaranteed by GCC? See:

 [...]

 There is no harm enforcing it in the front-end as well, [...]
 

OK


 
 Not that I'm aware of (other than Ada boasting it's use).  But I'd
 like to push the opinion of - although it isn't in the spec, D should
 be strict aliasing.  And people should be aware of the problems
 breaking strict aliasing (see:
 http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html)

I only found this: 
http://docs.lib.purdue.edu/ecetr/123/

They list performance improvements in the range of 0-6% depending on
architecture. It's testing GCC from 2004 so some things may have
changed, but 0-6% is not bad.

 But first... plan of attack:
 
 - We first disable strict aliasing entirely (lang_hook.get_alias_set
 => 0)
 - Implement -Wstrict-aliasing
 - Start turning on strict aliasing for types guaranteed not to be
 referencing the same memory location as other types (eg: TypeBasic,
 TypeDelegate, TypeSArray, TypeVector).
 - Identify gdc problems with implicit code generation that could break
 strict aliasing (these are our bugs).
 - Identify frontend/library problems that could break strict aliasing
 (these are the dmd/phobos developer's bugs).
 - Turn on strict aliasing for the remaining types.  For those that
 cause problems, we can define a TYPE_LANG_FLAG macro to allow us to
 tell the backend if the type can alias any other types.

Sounds good. If we can convince Walter to do this step-by-step
(especially -Wstrict-aliasing needs to be in all compilers) this should
work. But if GDC started enforcing strict aliasing and DMD didn't people
probably wouldn't fix their code.

Maybe the ARM port will be finished when we start working on this list
so we have one more architecture to verify. ~5 phobos unit tests
remaining ;-) (test suite already passes)

 
 Other possible considerations:
 
 - Most D code pretty much assumes that any object may be accessed via
 a void[] or void*.

Probably ubyte*, ubyte[] as well.

 
 - Infact, for the consideration of std.math.  It we could go one step
 further and simply build up an alias set list based on the type size
 over type distinction.  In this model
 double/long/byte[8]/short[4]/int[2] would all be considered as types
 that could be referencing each other.
 

I don't think that really helps much. Think of real with 16bytes. This
may also negatively impact possible optimizations (I'd guess many types
are word-sized. Another place where statistics would be nice)

Nov 08 2013

Iain Buclaw <ibuclaw ubuntu.com> writes:

On Nov 8, 2013 7:20 PM, "Johannes Pfau" <nospam example.com> wrote:
 Am Fri, 8 Nov 2013 13:12:23 +0000
 schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 On 7 November 2013 18:05, Johannes Pfau <nospam example.com> wrote:

 Am Thu, 7 Nov 2013 16:14:59 +0000
 schrieb Iain Buclaw <ibuclaw ubuntu.com>:
 type here. It would allow code like this to work: "char[] a; char*
 b = (cast(char*)&a)" but I don't see why this should work, it's
 illegal anyway?

 That would be seen as two distinct alias sets that would break strict
 aliasing in that example.

 Though, will have to implement -Wstrict-aliasing in the front-end to
 get any feel for what could potentially be utterly wrong.  But the
 idea is that for dynamic arrays, telling gcc to not rely on
 structural equality to determine whether or not two dynamic arrays
 are part of the same alias set.

 eg:
 byte[] a, long[] b = *(cast (long[]*)&a) should be seen as being
 different alias sets, and so *will* be about breaking strict aliasing.

 In contrast, string[] a, char[] b = *cast(string[]*)&a) should be
 seen as being part of the same alias set, and so the compiler must
 know that the two types (which are distinct structures to the
 backend) could potentially be referencing the same slice of memory,
 as to not cause any problems.

 Now I get what you mean. AFAIK the backend already strips type
 qualifiers when building alias sets, so char[] and string[] should
 already have the same alias set. But if you want to explicitly
 enforce this in GDC this makes sense.

I believe the difference is char* vs immutable(char)* - which a) is two
distinct struct types in the back end and b) the strip mod stuff would see
both as 'nothing to do' as the pointer has no qualifiers directly attached
to it.

Now I stop and think about dynamic arrays, I believe there's some work I
need to do on improving debugging of them (both in gdb and when debugging
gcc) - for instance when printing the type of an &array, is it still shown
as an unnamed 'struct*' ? :)

 For people trying to work around the cast system for dynamic arrays,
 IMO they should be punished for it, and told to do it in the correct
 way that invokes _d_arraycopy, or do their unsafe work through unions.

 OK. Although then we have to be very careful when implementing
 _d_arraycast (but maybe we can just check if we're in _d_arraycast and
 disable aliasing? Is this possible?).

Should be ok if we give special treatment for void[]. I can't remember the
exact reason why there is a dereferenced cast in that function, might just
be a pointer copy... (when I think array cast,  I only see array.length =
(length * F.sizeof) / T.sizeof;

 // Permit type-punning when accessing a union

 Isn't that already guaranteed by GCC? See:

 [...]

 There is no harm enforcing it in the front-end as well, [...]

 OK


 Not that I'm aware of (other than Ada boasting it's use).  But I'd
 like to push the opinion of - although it isn't in the spec, D should
 be strict aliasing.  And people should be aware of the problems
 breaking strict aliasing (see:


http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html
)
 I only found this:
 http://docs.lib.purdue.edu/ecetr/123/

 They list performance improvements in the range of 0-6% depending on
 architecture. It's testing GCC from 2004 so some things may have
 changed, but 0-6% is not bad.

 But first... plan of attack:

 - We first disable strict aliasing entirely (lang_hook.get_alias_set
 => 0)
 - Implement -Wstrict-aliasing
 - Start turning on strict aliasing for types guaranteed not to be
 referencing the same memory location as other types (eg: TypeBasic,
 TypeDelegate, TypeSArray, TypeVector).
 - Identify gdc problems with implicit code generation that could break
 strict aliasing (these are our bugs).
 - Identify frontend/library problems that could break strict aliasing
 (these are the dmd/phobos developer's bugs).
 - Turn on strict aliasing for the remaining types.  For those that
 cause problems, we can define a TYPE_LANG_FLAG macro to allow us to
 tell the backend if the type can alias any other types.

 Sounds good. If we can convince Walter to do this step-by-step
 (especially -Wstrict-aliasing needs to be in all compilers) this should
 work. But if GDC started enforcing strict aliasing and DMD didn't people
 probably wouldn't fix their code.

 Maybe the ARM port will be finished when we start working on this list
 so we have one more architecture to verify. ~5 phobos unit tests
 remaining ;-) (test suite already passes)

 Other possible considerations:

 - Most D code pretty much assumes that any object may be accessed via
 a void[] or void*.

 Probably ubyte*, ubyte[] as well.

 - Infact, for the consideration of std.math.  It we could go one step
 further and simply build up an alias set list based on the type size
 over type distinction.  In this model
 double/long/byte[8]/short[4]/int[2] would all be considered as types
 that could be referencing each other.

 I don't think that really helps much. Think of real with 16bytes. This
 may also negatively impact possible optimizations (I'd guess many types
 are word-sized. Another place where statistics would be nice)

reals are special in that they are flexible in size. std.math is able to
cope with this, but assumes that the target uses IEEE format. See my
implementation of floor() or lrint() for instance which uses different code
paths for each support float.

Regards
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';

Nov 09 2013

Iain Buclaw <ibuclaw ubuntu.com> writes:

By the way - I've gotten round to upgrading my trimslice to precise - *now*
I should be able to start doing regular gdc builds on it. :o)

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';

Nov 10 2013

Iain Buclaw <ibuclaw gdcproject.org> writes:

On 10 November 2013 19:08, Iain Buclaw <ibuclaw ubuntu.com> wrote:
 By the way - I've gotten round to upgrading my trimslice to precise - *now*
 I should be able to start doing regular gdc builds on it. :o)


We now just assume everything aliases everything else:
https://github.com/D-Programming-GDC/GDC/commit/926baf64a57895142f8cabf3cee97232b13777d3

Dec 05 2013

Iain Buclaw <ibuclaw gdcproject.org> writes:

On 5 December 2013 16:48, Iain Buclaw <ibuclaw gdcproject.org> wrote:
 On 10 November 2013 19:08, Iain Buclaw <ibuclaw ubuntu.com> wrote:
 By the way - I've gotten round to upgrading my trimslice to precise - *now*
 I should be able to start doing regular gdc builds on it. :o)


 We now just assume everything aliases everything else:
 https://github.com/D-Programming-GDC/GDC/commit/926baf64a57895142f8cabf3cee97232b13777d3


Fixed that unexpected sign-extend conversion bug:
https://github.com/D-Programming-GDC/GDC/commit/a27f600facf7ae87abad0c5618d712b32e2e3882

Dec 05 2013

Iain Buclaw <ibuclaw gdcproject.org> writes:

On 5 December 2013 17:46, Iain Buclaw <ibuclaw gdcproject.org> wrote:
 On 5 December 2013 16:48, Iain Buclaw <ibuclaw gdcproject.org> wrote:
 On 10 November 2013 19:08, Iain Buclaw <ibuclaw ubuntu.com> wrote:
 By the way - I've gotten round to upgrading my trimslice to precise - *now*
 I should be able to start doing regular gdc builds on it. :o)


 We now just assume everything aliases everything else:
 https://github.com/D-Programming-GDC/GDC/commit/926baf64a57895142f8cabf3cee97232b13777d3


 Fixed that unexpected sign-extend conversion bug:
 https://github.com/D-Programming-GDC/GDC/commit/a27f600facf7ae87abad0c5618d712b32e2e3882

Actually fixed that duplicate definition problem last week. Just got
round to pushing it:
https://github.com/D-Programming-GDC/GDC/commit/15d7b5ce334e851893d39efc148699dab4967cab

Dec 16 2013

Iain Buclaw <ibuclaw ubuntu.com> writes:

On 8 November 2013 13:12, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 For people trying to work around the cast system for dynamic arrays, IMO
 they should be punished for it, and told to do it in the correct way that
 invokes _d_arraycopy, or do their unsafe work through unions.

s/_d_arraycopy/_d_arraycast/


-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';

Nov 08 2013

D Programming

C/C++ Programming

Other

D.gnu - Found one (the?) ARM bug: (Pointer) aliasing issues