digitalmars.D - Strict aliasing in D

reply Denis Shelomovskij <verylonglogin.reg gmail.com> writes:
It was originally posted to D.learn but there was no reply. So:

Is there a strict aliasing rule in D?

I just saw https://bitbucket.org/goshawk/gdc/changeset/b44331053062
Jan 29 2012
parent reply "Daniel Murphy" <yebblies nospamgmail.com> writes:
"Denis Shelomovskij" <verylonglogin.reg gmail.com> wrote in message 
news:jg3f21$1jqa$1 digitalmars.com...
 It was originally posted to D.learn but there was no reply. So:

 Is there a strict aliasing rule in D?

 I just saw https://bitbucket.org/goshawk/gdc/changeset/b44331053062

Strict aliasing is required when doing array operations, e.g.

    int[] a, b, c;
    a[] = b[] + c[];

The arrays used must not overlap. I'm pretty sure that's what that commit was about.
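To make the non-overlap requirement for array operations concrete, here is a minimal sketch. The helper name `addElementwise` is made up for illustration; it writes into a freshly allocated array so the destination cannot overlap either operand.

```d
import std.stdio : writeln;

// Hypothetical helper: element-wise sum into a new array, so the
// destination slice cannot overlap b or c.
int[] addElementwise(const(int)[] b, const(int)[] c)
{
    assert(b.length == c.length);
    auto a = new int[](b.length);
    a[] = b[] + c[];   // vector operation: operands must not overlap a
    return a;
}

void main()
{
    writeln(addElementwise([1, 2, 3], [10, 20, 30])); // [11, 22, 33]
}
```

By contrast, something like `b[1 .. 3] = b[0 .. 2] + c[0 .. 2]` would have an overlapping source and destination, which is the case the rule excludes.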
Jan 29 2012
next sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Sunday, 29 January 2012 at 14:05:25 UTC, Daniel Murphy wrote:
 "Denis Shelomovskij" <verylonglogin.reg gmail.com> wrote in 
 message news:jg3f21$1jqa$1 digitalmars.com...
 It was originally posted to D.learn but there was no reply. So:

 Is there a strict aliasing rule in D?

 I just saw 
 https://bitbucket.org/goshawk/gdc/changeset/b44331053062

 Strict aliasing is required when doing array operations, e.g. int[] a, b, c; a[] = b[] + c[]; The arrays used must not overlap. I'm pretty sure that's what that commit was about.

That's not strict aliasing, that's just a language rule about aliasing for vector ops.

Strict aliasing is the assumption that no two pointers of different types point to the same location (char is an exception). For example, with -fstrict-aliasing in gcc, the following could generate unexpected code:

    int* i = new int;
    float* f = (float*)i;
    *i = 0;
    *f = 1.0f;
    printf("%d\n", *i); // could still write 0

Because of strict aliasing, the compiler may assume that the assignment *f = 1.0f can't affect the int, so it could just print out 0 instead of whatever 1.0f is as an int. To get around it, you would have to either use a union (which is still implementation defined, but works around aliasing), or make i volatile.

As for D, I can't see anything in the standard that prevents two pointers of different types from pointing to the same location, but I suspect it is an assumption that is being made.
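For comparison, the union workaround mentioned above could be written in D roughly as follows. This is only a sketch: the names `FloatBits` and `bitsOf` are made up, and whether D compilers treat union access as exempt from aliasing assumptions is exactly what the spec does not say.

```d
import std.stdio : writefln;

// Hypothetical union overlaying a float and a uint of the same size.
union FloatBits
{
    float f;
    uint  u;
}

// Returns the bit pattern of a float by writing one member and
// reading the other, instead of casting between pointer types.
uint bitsOf(float value)
{
    FloatBits fb;
    fb.f = value;
    return fb.u;
}

void main()
{
    writefln("0x%08X", bitsOf(1.0f)); // 0x3F800000 with IEEE 754 floats
}
```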
Jan 29 2012
next sibling parent "Daniel Murphy" <yebblies nospamgmail.com> writes:
"Peter Alexander" <peter.alexander.au gmail.com> wrote in message 
news:ggzqksxaiccnkvztmsql dfeed.kimsufi.thecybershadow.net...
 That's not strict aliasing, that's just a language rule about aliasing for 
 vector ops.

Jan 29 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/26/2013 12:45 PM, monarch_dodra wrote:
 On Sunday, 29 January 2012 at 16:25:33 UTC, Peter Alexander wrote:
 As for D, I can't see anything in the standard that prevents two pointers of
 different types from pointing to the same location, but I suspect it is an
 assumption that is being made.

Resurrecting this old thread, maybe we'll get a better answer this time. I too am interested in knowing how D deals with pointer aliasing. I'd like a bit more of an "official" or "factual" answer.

Although it isn't in the spec, D should be "strict aliasing". This is because:

1. it enables better code generation

2. there are ways, such as unions, to get the other aliasing that doesn't break strict aliasing
Jul 26 2013
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/27/2013 1:08 AM, monarch_dodra wrote:
 1. Does strict aliasing apply to slices?

I don't know what you mean.
 2. C++ uses 'char' as a 'neutral' type that can alias to anything. What about D?
 Does char fill that role? Does ubyte?

I'll go with deadalnix's answer.
Jul 27 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/27/2013 1:57 AM, David Nadlinger wrote:
 On Saturday, 27 July 2013 at 06:58:04 UTC, Walter Bright wrote:
 Although it isn't in the spec, D should be "strict aliasing". This is because:

 1. it enables better code generation

 2. there are ways, such as unions, to get the other aliasing that doesn't
 break strict aliasing

We need to carefully formalize this then, and quickly. The problem GCC, Clang and others are facing is that (as you are probably aware) point 2 isn't guaranteed by the specs to work for type-casting pointers either, but people want to be able to do this nonetheless. Thus, they both accept pointer aliasing through union types, trying to optimize as much as possible while avoiding breaking people's expectations and existing code. This is a very unfortunate situation for both compiler developers and users; just search for something like "gcc strict aliasing" on StackOverflow for examples. There is already quite a lot of D code out there that violates the C-style strict aliasing rules.

I agree. Want to do an enhancement request on bugzilla for it?
Jul 27 2013
parent Denis Shelomovskij <verylonglogin.reg gmail.com> writes:
27.07.2013 12:59, Walter Bright wrote:
 On 7/27/2013 1:57 AM, David Nadlinger wrote:
 On Saturday, 27 July 2013 at 06:58:04 UTC, Walter Bright wrote:
 Although it isn't in the spec, D should be "strict aliasing". This is
 because:

 1. it enables better code generation

 2. there are ways, such as unions, to get the other aliasing that
 doesn't
 break strict aliasing

We need to carefully formalize this then, and quickly. The problem GCC, Clang and others are facing is that (as you are probably aware) point 2 isn't guaranteed by the specs to work for type-casting pointers either, but people want to be able to do this nonetheless. Thus, they both accept pointer aliasing through union types, trying to optimize as much as possible while avoiding breaking people's expectations and existing code. This is a very unfortunate situation for both compiler developers and users; just search for something like "gcc strict aliasing" on StackOverflow for examples. There is already quite a lot of D code out there that violates the C-style strict aliasing rules.

I agree. Want to do an enhancement request on bugzilla for it?

So, has the enhancement request been filed?

-- 
Денис В. Шеломовский
Denis V. Shelomovskij
Jul 28 2013
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Sunday, 29 January 2012 at 16:25:33 UTC, Peter Alexander wrote:
 As for D, I can't see anything in the standard that prevents 
 two pointers of different types from pointing to the same 
 location, but I suspect it is an assumption that is being made.

Resurrecting this old thread, maybe we'll get a better answer this time. I too am interested in knowing how D deals with pointer aliasing. I'd like a bit more of an "official" or "factual" answer.
Jul 26 2013
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Saturday, 27 July 2013 at 06:58:04 UTC, Walter Bright wrote:
 On 7/26/2013 12:45 PM, monarch_dodra wrote:
 On Sunday, 29 January 2012 at 16:25:33 UTC, Peter Alexander 
 wrote:
 As for D, I can't see anything in the standard that prevents 
 two pointers of
 different types from pointing to the same location, but I 
 suspect it is an
 assumption that is being made.

Resurrecting this old thread, maybe we'll get a better answer this time. I too am interested in knowing how D deals with pointer aliasing. I'd like a bit more of an "official" or "factual" answer.

Although it isn't in the spec, D should be "strict aliasing". This is because: 1. it enables better code generation 2. there are ways, such as unions, to get the other aliasing that doesn't break strict aliasing

Thank you for the answer. I expected D to do strict aliasing for the reasons you mentioned. This does raise two follow-up questions, though:

1. Does strict aliasing apply to slices?

2. C++ uses 'char' as a 'neutral' type that can alias to anything. What about D? Does char fill that role? Does ubyte?
Jul 27 2013
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 27 July 2013 at 08:08:01 UTC, monarch_dodra wrote:
 Thank you for the answer. I expected D to do strict aliasing 
 for the reasons you mentioned. This does raise two 
 follow-up questions, though:

 1. Does strict aliasing apply to slices?
 2. C++ uses 'char' as a 'neutral' type that can alias to 
 anything. What about D? Does char fill that role? Does ubyte?

We have void* and void[], I think they should have that role.
Jul 27 2013
prev sibling next sibling parent "David Nadlinger" <code klickverbot.at> writes:
On Saturday, 27 July 2013 at 06:58:04 UTC, Walter Bright wrote:
 Although it isn't in the spec, D should be "strict aliasing". 
 This is because:

 1. it enables better code generation

 2. there are ways, such as unions, to get the other aliasing 
 that doesn't break strict aliasing

We need to carefully formalize this then, and quickly. The problem GCC, Clang and others are facing is that (as you are probably aware) point 2 isn't guaranteed by the specs to work for type-casting pointers either, but people want to be able to do this nonetheless. Thus, they both accept pointer aliasing through union types, trying to optimize as much as possible while avoiding breaking people's expectations and existing code.

This is a very unfortunate situation for both compiler developers and users; just search for something like "gcc strict aliasing" on StackOverflow for examples. There is already quite a lot of D code out there that violates the C-style strict aliasing rules.

David
Jul 27 2013
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Saturday, 27 July 2013 at 08:35:36 UTC, deadalnix wrote:
 On Saturday, 27 July 2013 at 08:08:01 UTC, monarch_dodra wrote:
 Thank you for the answer. I expected D to do strict aliasing 
 for the reasons you mentioned. This does raise two 
 follow-up questions, though:

 1. Does strict aliasing apply to slices?
 2. C++ uses 'char' as a 'neutral' type that can alias to 
 anything. What about D? Does char fill that role? Does ubyte?

We have void* and void[], I think they should have that role.

You can't read or write to void though (can you?), which is one of the main points in doing an un-strict alias: Raw binary storage.
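A small sketch of the point about void: the data can travel as void[], but it has to be cast to a concrete element type before anything is read. The helper name `rawBytes` is made up, and ubyte is used here only as a plausible choice, since which type (if any) is allowed to alias everything is the open question in this thread.

```d
import std.stdio : writefln;

// Hypothetical helper: views the storage of a uint as raw bytes,
// going through void[] on the way.
ubyte[] rawBytes(ref uint value)
{
    void[] untyped = (&value)[0 .. 1]; // uint[] converts implicitly to void[]
    // Elements of untyped cannot be read or assigned as they are;
    // a cast to a concrete element type is needed first.
    return cast(ubyte[]) untyped;
}

void main()
{
    uint value = 0x11223344;
    // Byte order in the output depends on the platform's endianness.
    writefln("%(%02X %)", rawBytes(value));
}
```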
Jul 27 2013
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Saturday, 27 July 2013 at 08:58:22 UTC, Walter Bright wrote:
 On 7/27/2013 1:08 AM, monarch_dodra wrote:
 1. Does strict aliasing apply to slices?

I don't know what you mean.

    double d;
    uint* p = cast(uint*)&d; //unsafe aliasing

vs

    double[] d = new double[](1);
    uint[] p = cast(uint[])d; //unsafe aliasing ?
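To show what the slice cast does mechanically, here is a minimal sketch (the function name `asUints` is made up). It only demonstrates that the cast reinterprets the same memory; it says nothing about whether an optimizer may assume the two slices do not alias.

```d
// Hypothetical helper: reinterprets a double[] as a uint[].
uint[] asUints(double[] d)
{
    return cast(uint[]) d; // same memory, length rescaled by element size
}

void main()
{
    double[] d = new double[](1);
    uint[] p = asUints(d);

    assert(p.length == 2);                          // 8 bytes / 4 bytes per uint
    assert(cast(void*) p.ptr is cast(void*) d.ptr); // same storage as d
}
```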
Jul 27 2013
prev sibling next sibling parent "ponce" <spam spambox.com> writes:
On Saturday, 27 July 2013 at 06:58:04 UTC, Walter Bright wrote:
 2. there are ways, such as unions, to get the other aliasing 
 that doesn't break strict aliasing

It would be great to have something like GCC's solution: warn when pointer casts may violate the strict aliasing rule, and provide a flag to disable it.
Jul 27 2013
prev sibling next sibling parent "ponce" <spam spambox.com> writes:
On Saturday, 27 July 2013 at 09:05:32 UTC, ponce wrote:
 It would be great to have something like GCC's solution: warn 
 when pointer casts may violate the strict aliasing rule, and 
 provide a flag to disable it.

BTW, C++ compilers usually have an effective way to disambiguate pointer aliasing so that loop code generation is better. restrict may help, strict aliasing may help, but in my experience a direct annotation such as "I know what I'm doing, this loop does not alias" or "no loop dependency" is often more effective. This obviously requires trusting the optimizing programmer a bit :)
Jul 27 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 Although it isn't in the spec, D should be "strict aliasing". 
 This is because:

 1. it enables better code generation

 2. there are ways, such as unions, to get the other aliasing 
 that doesn't break strict aliasing

Would it be good to add to Phobos a small template (named "PointerCast" or something similar) that uses a union internally to perform pointer type conversions?

Is the compiler then going to warn the programmer when the pointer type aliasing rule is violated? I mean when the D code uses cast() between different pointer types (besides constness).

An alternative design is to deprecate such casts (and later turn them into errors, with an error message that suggests using PointerCast).

Bye,
bearophile
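A rough sketch of what such a template might look like. Everything here is hypothetical: `pointerCast` and `PointerPun` are made-up names, nothing like this is claimed to exist in Phobos, and converting the pointer through a union does not by itself settle whether dereferencing the result is allowed under a strict aliasing rule.

```d
// Hypothetical union overlaying two pointer types.
union PointerPun(From, To)
{
    From* from;
    To*   to;
}

// Hypothetical helper: converts From* to To* via the union
// instead of a direct cast between pointer types.
To* pointerCast(To, From)(From* p)
{
    PointerPun!(From, To) pun;
    pun.from = p;
    return pun.to;
}

void main()
{
    double d = 1.0;
    uint* p = pointerCast!uint(&d);          // From is inferred as double
    assert(cast(void*) p is cast(void*) &d); // same address, different type
}
```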
Jul 27 2013
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 27 July 2013 at 09:03:57 UTC, monarch_dodra wrote:
 On Saturday, 27 July 2013 at 08:58:22 UTC, Walter Bright wrote:
 On 7/27/2013 1:08 AM, monarch_dodra wrote:
 1. Does strict aliasing apply to slices?

I don't know what you mean.

 double d; uint* p = cast(uint*)&d; //unsafe aliasing vs double[] d = new double[](1); uint[] p = cast(uint[])d; //unsafe aliasing ?

That is the same thing in the end.
Jul 28 2013
prev sibling next sibling parent "David Nadlinger" <code klickverbot.at> writes:
On Monday, 29 July 2013 at 05:05:54 UTC, Denis Shelomovskij wrote:
 So is enhancement request filed?

Now it is: http://d.puremagic.com/issues/show_bug.cgi?id=10750

Sorry, I'm not following the lists closely right now due to university work.

David
Aug 03 2013
prev sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Saturday, 3 August 2013 at 11:39:44 UTC, David Nadlinger wrote:
 On Monday, 29 July 2013 at 05:05:54 UTC, Denis Shelomovskij 
 wrote:
 So is enhancement request filed?

Now it is: http://d.puremagic.com/issues/show_bug.cgi?id=10750 Sorry, I'm not following the lists closely right now due to university work. David

C++ (and C, afaik) solved the problem by saying "char" can alias anything. I'm not sure it's a good idea to do the same in D, since a char actually represents something very specific, and comes with baggage. We could have (u)byte do this, but at the same time, I see no reason why *they* should have to pay for lax aliasing.

A "simple" solution I see: introduce the "raw" basic data type. It's basically a ubyte for all intents and purposes, but can alias to anything. I think this is a good idea, as the usage of the word "raw" is immediately very explicit and self-documenting about what is going on:

    auto rawSlice = (cast(raw*)(&arbitraryData))[0 .. arbitraryData.sizeof];

The *cost* here, of course, is the introduction of a new *type*. The cost is very high, but at the same time, it deals with the problem (I believe) in the most elegant fashion possible.

Thoughts?
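To try the syntax out today, one could approximate the idea with an alias. This is only a stand-in: `raw` below is just another name for ubyte, so it carries none of the proposed aliasing guarantees, and `Sample` and `rawView` are made-up names for the example.

```d
alias raw = ubyte; // stand-in only; the proposed built-in type does not exist

// Hypothetical payload type for the example.
struct Sample
{
    int   id;
    float weight;
}

// Hypothetical helper: views any value's storage as a slice of raw.
raw[] rawView(T)(ref T data)
{
    return (cast(raw*)(&data))[0 .. T.sizeof];
}

void main()
{
    Sample arbitraryData = Sample(7, 2.5f);
    auto rawSlice = rawView(arbitraryData);
    assert(rawSlice.length == Sample.sizeof);
}
```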
Aug 03 2013