digitalmars.D - Language features and reinterpret casts

reply bearophile <bearophileHUGS lycos.com> writes:
On Reddit I have read some people complaining that D2 is too complex a
language because it has too many features. The feature "count" does increase a
language's complexity, but most of the complexity comes from other sources, like
unwanted interactions between features, messy feature semantics, unspecified
behaviour in corner cases, special cases, special cases of special cases, etc.

But tidy features that solve a well-defined class of problems usually decrease
the amount of work for the programmer.

Recently Bradley Mitchell, in the D.learn newsgroup, tried to implement the Quake
fast inverse square root algorithm in D, and found that D lacks the C++
reinterpret cast:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=21901

With the current dmd you can solve this problem by using a union to
reinterpret the bits. But:
- The C Standard says that assigning to one member of a union and then accessing a
different member is undefined behaviour.
- GCC is a practical compiler, so it has the -fno-strict-aliasing switch, which
allows that trick. See "-Wstrict-aliasing" and:
http://stackoverflow.com/questions/2906365/gcc-strict-aliasing-and-casting-through-a-union
- Walter said that, regarding such union semantics, D acts as C does. So that union
trick may break with future D compilers.
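For reference, the union trick mentioned above might look roughly like this in D. This is only a sketch: the names FloatBits and bitsOf are made up for illustration, and it assumes 32-bit IEEE 754 floats stored with the same byte order as integers.

```d
// Overlay a float and a uint so the same 32 bits can be read either way.
// FloatBits and bitsOf are illustrative names, not part of Phobos.
union FloatBits
{
    float f;
    uint  u;
}

uint bitsOf(float x)
{
    FloatBits fb;
    fb.f = x;     // write one member...
    return fb.u;  // ...then read the other one
}

unittest
{
    // 0x3F80_0000 is the IEEE 754 single-precision encoding of 1.0
    assert(bitsOf(1.0f) == 0x3F80_0000);
}
```

Whether reading fb.u after writing fb.f is guaranteed to work is exactly the open question here.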

The need to perform a reinterpret cast is uncommon (I don't need it in Python),
but it does arise in a systems language.

So this whole situation looks silly to me. A modern language like D shouldn't force
programmers to use undefined tricks to do something uncommon but legitimate. Adding a
feature here increases language complexity a little, but it removes trouble and
makes coding simpler (no need to remember or invent tricks), safer (you are
sure the compiler will do what you want), more explicit (because the person who
reads the code doesn't need to reverse-engineer the trick), and more efficient
(because having to use -fno-strict-aliasing may reduce performance a
little).

So unless you have a different solution, I think D would be better off with a
reinterpret_cast(). An alternative solution is to put a standard
reinterpretCast!(U)(x) into Phobos2, and then let the people who implement future
D compilers define its semantics correctly for each compiler. That way user code
that uses this template is portable, safe, efficient, and explicit in its purpose.
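A minimal sketch of what such a template could look like follows. Only the name reinterpretCast comes from the proposal above; the body is an assumption that copies the bytes with memcpy instead of going through a union or a pointer cast, and the size constraint is an added guess at what a library version would want.

```d
import core.stdc.string : memcpy;

// Hypothetical helper: not an existing Phobos function.
// Copies the raw bytes of x into a value of type U of the same size.
U reinterpretCast(U, T)(T x)
    if (T.sizeof == U.sizeof)
{
    U result = void;                // filled completely by memcpy below
    memcpy(&result, &x, U.sizeof);
    return result;
}

unittest
{
    // 0x3F80_0000 is the IEEE 754 single-precision encoding of 1.0
    assert(reinterpretCast!uint(1.0f) == 0x3F80_0000);
    assert(reinterpretCast!float(0x3F80_0000u) == 1.0f);
}
```

A compiler implementer would be free to replace the body with whatever is correct and fastest for that compiler, which is the point of keeping it behind one name.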

Bye,
bearophile
Sep 20 2010
next sibling parent reply klickverbot <see klickverbot.at> writes:
On 9/20/10 11:52 PM, bearophile wrote:
 Recently Bradley Mitchell in D.learn newsgroup has tried to implement the
Quake fast inverse square root algorithm in D, and has found D lack the C++
reinterpret cast:
 http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=21901
Are there any cases where (*cast(int*)&someFloat) does not fit the bill?
Sep 20 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
klickverbot:
 Are there any cases where (*cast(int*)&someFloat) does not fit the bill?
I am not a C lawyer, but I think that too is undefined in C (and maybe D too). Bye, bearophile
Sep 20 2010
next sibling parent reply "Simen kjaeraas" <simen.kjaras gmail.com> writes:
bearophile <bearophileHUGS lycos.com> wrote:

 klickverbot:
 Are there any cases where (*cast(int*)&someFloat) does not fit the bill?
I am not a C lawyer, but I think that too is undefined in C (and maybe D too).
From your own link (http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=109033): "I don't see any way to make conversions between pointers and ints implementation defined, and make dereferencing a pointer coming from some int anything but undefined behavior." -- Simen
Sep 21 2010
parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 21/09/2010 09:23, Simen kjaeraas wrote:
 bearophile <bearophileHUGS lycos.com> wrote:

 klickverbot:
 Are there any cases where (*cast(int*)&someFloat) does not fit the bill?
I am not a C lawyer, but I think that too is undefined in C (and maybe D too).
From your own link (http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=109033): "I don't see any way to make conversions between pointers and ints implementation defined, and make dereferencing a pointer coming from some int anything but undefined behavior."
That's not the same thing. Walter was referring to casting an int to a pointer, but the above is casting a _float pointer_ to an _int pointer_. -- Bruno Medeiros - Software Engineer
Nov 01 2010
prev sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 21/09/2010 00:27, bearophile wrote:
 klickverbot:
 Are there any cases where (*cast(int*)&someFloat) does not fit the bill?
I am not a C lawyer, but I think that too is undefined in C (and maybe D too). Bye, bearophile
In general, it is definitely undefined behavior in C, but that's because an int in C can be bigger than a float, so you could be reading memory out of bounds with that. If sizeof(int) == sizeof(float), then it's legal in C. Similarly, I think it is legal in D. -- Bruno Medeiros - Software Engineer
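The pointer-cast form with the size condition spelled out could be sketched like this. The function name bitsViaPointer is made up for illustration, and the check is trivially true in D, where int and float are both fixed at 32 bits; it is written out only to make the assumption visible. The sketch shows the construct under discussion and is not a statement that its behaviour is defined.

```d
// Reads the bits of a float through an int pointer.
// bitsViaPointer is an illustrative name, not a library function.
int bitsViaPointer(float someFloat)
{
    // Both types must have the same size, otherwise the read would
    // cover bytes that do not belong to someFloat.
    static assert(int.sizeof == float.sizeof);
    return *cast(int*)&someFloat;
}

unittest
{
    // 0x3F80_0000 is the IEEE 754 single-precision encoding of 1.0
    assert(bitsViaPointer(1.0f) == 0x3F80_0000);
}
```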
Nov 01 2010
prev sibling parent reply Jesse Phillips <jessekphillips+D gmail.com> writes:
bearophile Wrote:

 - C Standard says that assigning to one member of a union and then accessing a
different member is undefined behaviour.
 - GCC is a practical compiler, so it has the -fno-strict-aliasing switch, that
allows to use that trick. See "-Wstrict-aliasing" and:
 http://stackoverflow.com/questions/2906365/gcc-strict-aliasing-and-casting-through-a-union
 - Walter said that regarding such union semantics D acts as C. So that union
trick will break with other future D compilers.
I believe that defining behavior in an area where C leaves it undefined is completely in line with operating as C does. But where does Walter say unions act as in C in this case? Why not define the behavior and leave it at that?
Sep 20 2010
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, September 20, 2010 18:21:53 Jesse Phillips wrote:
 bearophile Wrote:
 - C Standard says that assigning to one member of a union and then
 accessing a different member is undefined behaviour.
 - GCC is a practical compiler, so it has the -fno-strict-aliasing switch, that
 allows to use that trick. See "-Wstrict-aliasing" and:
 http://stackoverflow.com/questions/2906365/gcc-strict-aliasing-and-casting-through-a-union
 - Walter said that regarding such union semantics D
 acts as C. So that union trick will break with other future D compilers.
I believe that defined behavior within an area where C is specified to be undefined is completely in line with operating as C does. But where does Walter say Unions act as C in this case? Why not, define the behavior and leave it.
Well, since C doesn't define the behavior, D can do anything - including using behavior that it defines and considers entirely consistent. So, as long as it's D code, I see no reason why D can't use the same constructs as C code but have things which are undefined in C be defined in D. The only problem is if you port code from D to C, but that's not exactly something that we're generally worried about. Overall though, I get the impression that Walter's goal is to get as close to having no undefined behavior in D as he reasonably can. Whether the behavior is defined in another language or not isn't particularly relevant. - Jonathan M Davis
Sep 20 2010
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Jesse Phillips:

 But where does Walter say Unions act as C in this case?
It was an answer in a thread of mine where I asked to remove some of D's undefined behaviours inherited from C. This is the thread start:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=108978
 Why not, define the behavior and leave it.
This was the original purpose of my thread, but Walter explained to me that this is not possible for unions:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=109033
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=109058
Bye,
bearophile
Sep 20 2010
next sibling parent reply BCS <none anon.com> writes:
Hello bearophile,

 Jesse Phillips:
 
 Why not, define the behavior and leave it.
 
This was the original purpose of my thread, but Walter has explained me that this is not possible for the unions:
I don't see how those apply to unions. -- ... <IXOYE><
Sep 20 2010
parent reply Don <nospam nospam.com> writes:
BCS wrote:
 Hello bearophile,
 
 Jesse Phillips:

 Why not, define the behavior and leave it.
This was the original purpose of my thread, but Walter has explained me that this is not possible for the unions:
I don't see how those apply to unions.
It's going to be implementation defined behaviour (will depend on endianness) but not undefined behaviour. BTW std.math heavily relies on reinterpret casting float->int.
Sep 20 2010
next sibling parent reply BCS <none anon.com> writes:
Hello Don,

 BCS wrote:
 
 Hello bearophile,
 
 Jesse Phillips:
 
 Why not, define the behavior and leave it.
 
This was the original purpose of my thread, but Walter has explained me that this is not possible for the unions:
I don't see how those apply to unions.
It's going to be implementation defined behaviour (will depend on endianness)
Does that matter if the members are the same size?
 but not undefined behaviour.
 BTW std.math heavily relies on reinterpret casting float->int.
-- ... <IXOYE><
Sep 20 2010
parent reply Don <nospam nospam.com> writes:
BCS wrote:
 Hello Don,
 
 BCS wrote:

 Hello bearophile,

 Jesse Phillips:

 Why not, define the behavior and leave it.
This was the original purpose of my thread, but Walter has explained me that this is not possible for the unions:
I don't see how those apply to unions.
It's going to be implementation defined behaviour (will depend on endianness)
Does that matter if the members are the same size?
Definitely -- you'll grab part of the mantissa instead of the exponent. It should, however, only be hardware- and OS-dependent, rather than compiler-dependent. (It can be OS-dependent because of different padding rules.)
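A small sketch of how byte order decides which part of the float you get. It overlays the float with a byte array, not a same-sized int, since that makes the effect easiest to see. The names FloatBytes and firstByte are made up for illustration, and IEEE 754 single precision is assumed.

```d
// Overlay a float with its four raw bytes.
// FloatBytes and firstByte are illustrative names.
union FloatBytes
{
    float    f;
    ubyte[4] b;
}

ubyte firstByte(float x)
{
    FloatBytes fb;
    fb.f = x;
    return fb.b[0];  // the byte at the lowest address
}

unittest
{
    // 1.0f is 0x3F80_0000 in IEEE 754 single precision.
    version (LittleEndian)
        assert(firstByte(1.0f) == 0x00); // lowest mantissa byte
    else version (BigEndian)
        assert(firstByte(1.0f) == 0x3F); // sign bit and high exponent bits
}
```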
 
 but not undefined behaviour.
 BTW std.math heavily relies on reinterpret casting float->int.
Sep 21 2010
parent reply BCS <none anon.com> writes:
Hello Don,

 BCS wrote:
 
 Hello Don,

 It's going to be implementation defined behaviour (will depend on
 endianness)
 
Does that matter if the members are the same size?
Definitely -- you'll grab the part of the mantissa, instead of the exponent.
How would that work? Can a system store ints big-endian and floats little-endian? -- ... <IXOYE><
Sep 21 2010
parent reply Don <nospam nospam.com> writes:
BCS wrote:
 Hello Don,
 
 BCS wrote:

 Hello Don,

 It's going to be implementation defined behaviour (will depend on
 endianness)
Does that matter if the members are the same size?
Definitely -- you'll grab the part of the mantissa, instead of the exponent.
How would that work? Can a system store ints big-endian and floats little-endian?
Not sure. My limited understanding is that on PowerPC, Altivec can use big-endian floats even when the PPC is set to little-endian mode. I could be wrong, though. In any case, for 80-bit reals, there is no int which is the same size.
Sep 21 2010
parent BCS <none anon.com> writes:
Hello Don,

 BCS wrote:
 
 Hello Don,
 
 How would that work? Can a system store ints big-endian and floats
 little-endian?
 
Not sure. My limited understanding is that with on PowerPC, Altivec can use big-endian floats, even when the PPC is set to little-endian mode. Could be wrong, though. In any case, for 80-bit reals, there is no int which is the same size.
So how tightly can we define it without losing too much? If we define a union where all the members are the same size to be overlaid exactly, and require the consistent use of big- or little-endian, would that make anything horribly inefficient? Even if that rules out PPC vector ops, I don't think that would be a big issue. -- ... <IXOYE><
Sep 21 2010
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Don:
 It's going to be implementation defined behaviour (will depend on 
 endianness) but not undefined behaviour.
 BTW std.math heavily relies on reinterpret casting float->int.
I suggest adding something like a std.traits.ReinterpretCast template to Phobos and using only it in Phobos modules and user code. This helps readability and will also help future, more aggressively optimizing D compilers. Bye, bearophile
Sep 21 2010
prev sibling parent Jesse Phillips <jessekphillips+D gmail.com> writes:
I remember that thread and even replied to it. I didn't see anything where
Walter said unions behave as in C, nor anything where Walter said he
would not define behavior that C leaves undefined. All he said was that, for the
cases presented, there was no way to define the behavior, or he didn't know what
the defined behavior would be.

I don't know whether it would be reasonable to define the behavior of unions for
this, but since it currently works in D I assume it can be defined to work that
way.

bearophile Wrote:

 Jesse Phillips:
 
 But where does Walter say Unions act as C in this case?
It was an answer in a thread of mine where I have asked to remove some D undefined behaviours derived from C. This is the thread start: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=108978
 Why not, define the behavior and leave it.
This was the original purpose of my thread, but Walter has explained me that this is not possible for the unions: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=109033 http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=109058 Bye, bearophile
Sep 21 2010