digitalmars.D.bugs - [Issue 7328] New: Allow casting between ubyte[4] and int


d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=7328

           Summary: Allow casting between ubyte[4] and int
           Product: D
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: jmdavisProg gmx.com



PST ---
It would be very nice to be able to cast between arrays of ubyte and integral
types as long as they're the same size. So, ubyte[4] -> int, ubyte[2] -> short,
etc. Maybe even ubyteArr[0 .. 4] -> int as long as the indices are known at
compile time.

As it stands, the only two ways that I can think of doing this are to use a
union, e.g.

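import std.traits : isIntegral, Unqual;

// Overlays an integral value with a byte array of the same size.
union Integer(T)
    if(isIntegral!T)
{
    Unqual!T value;
    ubyte[T.sizeof] array;
}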

or to do some nasty casting, e.g.

ubyte[4] a = (cast(ubyte*)[0x28A].ptr)[0 .. 4];
int b = (cast(int*)a.ptr)[0];

It would be much easier to manipulate buffers (which are generally arrays of
ubytes) if casting were allowed between static arrays (and preferably even
dynamic arrays if the indices are known at compile time) and integral values,
as long as the lengths match, of course.

Worst case, something can be added to std.bitmanip to do this, but I'm a bit
surprised that casts such as cast(ubyte[4])7 aren't allowed by the
compiler.
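
For illustration, here is a minimal sketch of the two workarounds described above, written out so they can be compared side by side. The helper names bytesViaUnion and bytesViaPointer are hypothetical and exist only for this example; the byte order of the result depends on the host machine, so the sketch only checks that both routes agree and that the value round-trips.

```d
import std.traits : isIntegral, Unqual;

// Overlays an integral value with a byte array of the same size.
union Integer(T)
    if (isIntegral!T)
{
    Unqual!T value;
    ubyte[T.sizeof] array;
}

// Hypothetical helper: reinterpret an int as bytes through the union.
ubyte[4] bytesViaUnion(int v)
{
    Integer!int u;
    u.value = v;
    return u.array;
}

// Hypothetical helper: reinterpret an int as bytes through a pointer cast.
ubyte[4] bytesViaPointer(int v)
{
    ubyte[4] result = (cast(ubyte*)&v)[0 .. 4];
    return result;
}

void main()
{
    // Both routes read the same memory layout, so they must agree.
    assert(bytesViaUnion(0x28A) == bytesViaPointer(0x28A));

    // Going back to int through the union restores the original value.
    Integer!int u;
    u.array = bytesViaPointer(0x28A);
    assert(u.value == 0x28A);
}
```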

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------

Jan 20 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=7328


Alex Rønne Petersen <xtzgzorex gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xtzgzorex gmail.com



11:32:33 PST ---
I like the idea, but I'm a bit worried about endianness pitfalls with such a
feature...


Jan 20 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=7328




PST ---
A good point. I don't know whether that's enough to make it a bad idea or not.
If you're worried about endianness though, the functions in std.bitmanip (e.g.
bigEndianToNative and nativeToBigEndian) already take care of it for you, since
they put non-native values in static ubyte arrays of the appropriate size (the
conversion is dealt with internally in a union). So, maybe that in and of itself
effectively solves the problem.
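
As a small sketch of the std.bitmanip route mentioned above: nativeToBigEndian returns a static ubyte array with a fixed byte order, and bigEndianToNative converts it back, so the result does not depend on the host's endianness. The sample value is arbitrary.

```d
import std.bitmanip : bigEndianToNative, nativeToBigEndian;

void main()
{
    int value = 0x0102_0304;

    // Most significant byte first, regardless of the host byte order.
    ubyte[4] wire = nativeToBigEndian(value);
    assert(wire == [0x01, 0x02, 0x03, 0x04]);

    // Convert the big-endian bytes back to a native int.
    int decoded = bigEndianToNative!int(wire);
    assert(decoded == value);
}
```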


Jan 20 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=7328


Peter Alexander <peter.alexander.au gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |peter.alexander.au gmail.com



13:10:29 PST ---
Why does this need to be part of the language? It is trivially implemented as a
function.
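
A rough sketch of what such a library function could look like, assuming nothing beyond std.traits. The names toBytes and fromBytes are hypothetical and are not part of Phobos; the bytes come out in the host's native order.

```d
import std.traits : isIntegral;

// Hypothetical helper: copy the native-order bytes of an integral value.
ubyte[T.sizeof] toBytes(T)(T value)
    if (isIntegral!T)
{
    ubyte[T.sizeof] result = (cast(ubyte*)&value)[0 .. T.sizeof];
    return result;
}

// Hypothetical helper: rebuild an integral value from native-order bytes.
// Copying into a local avoids reading through a possibly misaligned pointer.
T fromBytes(T)(ubyte[T.sizeof] bytes)
    if (isIntegral!T)
{
    T result;
    (cast(ubyte*)&result)[0 .. T.sizeof] = bytes[];
    return result;
}

void main()
{
    auto bytes = toBytes(0x28A);
    static assert(is(typeof(bytes) == ubyte[4]));
    assert(fromBytes!int(bytes) == 0x28A);

    // A length mismatch is rejected at compile time.
    static assert(!__traits(compiles, fromBytes!short(bytes)));
}
```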


Jan 20 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=7328


timon.gehr gmx.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |timon.gehr gmx.ch



So is most of the language.
It needs to be in the language because it is already there, sort of:

import std.stdio;
struct S{int x;}
void main(){
    writeln(cast(ubyte[4])S(28298298)); // ok
    // writeln(cast(ubyte[4])28298298); // ng
}

I have always considered this an inconsistency. The implementation is a trivial
rewrite.


Jan 20 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=7328




PST ---
It doesn't _have_ to be, but as Timon says, it's odd that it isn't, and his
example shows that the current situation is inconsistent. I was surprised
when the cast didn't work. It seems obvious to me that it would. Maybe the
endianness issue is why it doesn't.


Jan 20 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=7328




The issue is the same for structs and any programmer who performs the cast is
aware of it. (otherwise they wouldn't use a cast ;))


Jan 20 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=7328




15:16:21 PST ---

> The issue is the same for structs and any programmer who performs the cast is
> aware of it. (otherwise they wouldn't use a cast ;))

I am sure there are many programmers that are *not* aware of endianness, even
if they know that everything is made up of bytes and may use that cast.

I was unaware of the consistency though. Personally I consider the ability to
cast a struct to a ubyte[n] an error in the language design also. Consider:

ubyte[8] a = cast(ubyte[8]) iota(0, 8);
writeln(a);

You get [0, 0, 0, 0, 8, 0, 0, 0]

I think this is something that an inexperienced D programmer could write
expecting to get [0, 1, 2, 3, 4, 5, 6, 7] back.

Furthermore, you cannot rely on cast(ubyte[N]) to return a reinterpreted struct
because the struct may define opCast for ubyte[N] (imagine a container struct
Array(T, size_t N) that has opCast for T[N] -- casting to ubyte[Array(T,
N).sizeof] will reinterpret in most cases, except when T=ubyte and N=the
sizeof, good luck finding that bug in your generic serialisation code).

Reinterpreting memory should require nasty pointer casts. It's not common (or
safe) enough to have convenient syntax, in my opinion.


Jan 20 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=7328






>> The issue is the same for structs and any programmer who performs the cast is
>> aware of it. (otherwise they wouldn't use a cast ;))
>
> I am sure there are many programmers that are *not* aware of endianness, even
> if they know that everything is made up of bytes and may use that cast.

If they are not aware of all relevant issues, they may not use a type cast.
Knowing what one does is a precondition for using a type cast.

> I was unaware of the consistency though. Personally I consider the ability to
> cast a struct to a ubyte[n] an error in the language design also.

If you didn't notice it existed, is it important enough to be called an
'error'?

> Consider:
>
> ubyte[8] a = cast(ubyte[8]) iota(0, 8);
> writeln(a);
>
> You get [0, 0, 0, 0, 8, 0, 0, 0]
>
> I think this is something that an inexperienced D programmer could write
> expecting to get [0, 1, 2, 3, 4, 5, 6, 7] back.

This is a constructed example. If at all, they'll cast to ubyte[] (and that
fails). But inexperienced D programmers don't use type casts. It is the first
thing they learn about type casts. If they do, it is their own fault.

> Furthermore, you cannot rely on cast(ubyte[N]) to return a reinterpreted struct
> because the struct may define opCast for ubyte[N] (imagine a container struct
> Array(T, size_t N) that has opCast for T[N] -- casting to ubyte[Array(T,
> N).sizeof] will reinterpret in most cases, except when T=ubyte and N=the
> sizeof, good luck finding that bug in your generic serialisation code).

Wrong. In most cases it will be a compiler error, because the compiler does not
fall back to reinterpreting if the struct defines an opCast. opCast is an
all-or-nothing thing.
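
A minimal sketch of the all-or-nothing behaviour described above, assuming a compiler that rewrites every cast of such a struct to a call of its opCast template. The struct Tagged and its marker bytes are hypothetical and serve only to make the user-defined conversion visible.

```d
// Hypothetical struct whose opCast handles ubyte[4] and nothing else.
struct Tagged
{
    int x;

    // Returns fixed marker bytes so the user-defined cast is easy to spot.
    ubyte[4] opCast(T : ubyte[4])() const
    {
        return [1, 2, 3, 4];
    }
}

void main()
{
    // The cast goes through opCast, not through a reinterpretation of x.
    auto bytes = cast(ubyte[4]) Tagged(0);
    assert(bytes == [1, 2, 3, 4]);

    // No opCast instantiation matches long, and there is no fallback,
    // so this cast is expected not to compile.
    static assert(!__traits(compiles, cast(long) Tagged(0)));
}
```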

> Reinterpreting memory should require nasty pointer casts. It's not common (or
> safe) enough to have convenient syntax, in my opinion.

It's only the pointer casts that are unsafe. cast(ubyte[4])1234 is perfectly
safe. It will even catch size mismatches! (<insert 'good luck finding that
bug' comment here>)

By the way, it is possible to cast between two arbitrary structs of identical
size ;).

I would not mind if the feature was removed for structs. I'd just like to
restore consistency.


Jan 20 2012
