digitalmars.D - struct to byte[]

reply Luís Marques <luismarques+spam gmail.com> writes:
Hello,

Converting a structure to ubyte[] or similar is something that I have 
been doing frequently while converting C to D code.

Can we have a "cast(type[])" working for structures, please?

I suppose that type should include at least ubyte, byte and char, if not 
all the basic data types.

Is there a reason for this explicit cast not to work automatically? (I 
know that templates can alleviate this)

E.g.

struct Foo
{
     int a;
     float b;
}

converts to ubyte[8] via "cast(ubyte[])".


--
Luís
Dec 12 2006
next sibling parent reply Kirk McDonald <kirklin.mcdonald gmail.com> writes:
Luís Marques wrote:
 Hello,
 
 Converting a structure to ubyte[] or similar is something that I have 
 been doing frequently while converting C to D code.
 
 Can we have a "cast(type[])" working for structures, please?
 
 I suppose that type should include at least ubyte, byte and char, if not 
 all the basic data types.
 
 Is there a reason for this explicit cast not to work automatically? (I 
 know that templates can alleviate this)
 
 E.g.
 
 struct Foo
 {
     int a;
     float b;
 }
 
 converts to ubyte[8] via "cast(ubyte[])".
 
 
 -- 
 Luís

Foo f;
ubyte[] u = (cast(ubyte*)&f)[0 .. Foo.sizeof];

-- 
Kirk McDonald
Pyd: Wrapping Python with D
http://pyd.dsource.org
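
A minimal sketch of the idiom above, for readers who want to try it. The helper name bytesOf is hypothetical and not part of Phobos; the parameter is written with `ref`, the current D spelling, so that the slice refers to the caller's variable rather than a copy.

```d
struct Foo
{
    int a;
    float b;
}

// View the bytes of a Foo in place: cast its address to ubyte* and
// slice the pointer so the result carries a length as well.
ubyte[] bytesOf(ref Foo f)
{
    return (cast(ubyte*) &f)[0 .. Foo.sizeof];
}

unittest
{
    Foo f;
    f.a = 1;
    ubyte[] u = bytesOf(f);
    assert(u.length == 8);                        // int (4) + float (4)
    assert(cast(void*) u.ptr == cast(void*) &f);  // same memory, no copy
    u[] = 0;                                      // writing through the view...
    assert(f.a == 0);                             // ...changes the struct
}
```

The slice is only valid for as long as the struct it points into is alive.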
Dec 12 2006
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Kirk McDonald wrote:
 Luís Marques wrote:
 Can we have a "cast(type[])" working for structures, please?


 E.g.

 struct Foo
 {
     int a;
     float b;
 }

 converts to ubyte[8] via "cast(ubyte[])".

 Foo f;
 ubyte[] u = (cast(ubyte*)&f)[0 .. Foo.sizeof];

Or:

Foo f;
ubyte[] u = cast(ubyte[])((&f)[0 .. 1]);

This works more cleanly with types of size > 1 as well, but IIRC it basically asserts[1] (Foo.sizeof % T.sizeof == 0). So don't use this code if you're not sure it will fit exactly into a T[].

[1]: Statically? Don't recall at the moment...
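
A small sketch of the array-cast variant, assuming the address is sliced first, as in (&f)[0 .. 1], since indexing binds tighter than &. The helper names asBytes and asWords are hypothetical. The point is that the cast rescales the length to the new element size.

```d
struct Foo
{
    int a;
    float b;
}

// Reinterpret a one-element Foo[] as bytes; the cast rescales the length.
ubyte[] asBytes(ref Foo f)
{
    Foo[] one = (&f)[0 .. 1];
    return cast(ubyte[]) one;
}

// Same idea with a wider element type: 8 bytes become 2 uints.
uint[] asWords(ref Foo f)
{
    Foo[] one = (&f)[0 .. 1];
    return cast(uint[]) one;
}

unittest
{
    Foo f;
    assert(asBytes(f).length == Foo.sizeof);
    assert(asWords(f).length == Foo.sizeof / uint.sizeof);
}
```

Here Foo.sizeof is a multiple of uint.sizeof, so both casts fit exactly; a target element type whose size does not divide Foo.sizeof is the case the caveat above is about.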
Dec 12 2006
parent reply Luís Marques <luismarques+spam gmail.com> writes:
Frits van Bommel wrote:
 Kirk McDonald wrote:
 Foo f;
 ubyte[] u = (cast(ubyte*)&f)[0 .. Foo.sizeof];

 Or:

 Foo f;
 ubyte[] u = cast(ubyte[])((&f)[0 .. 1]);

 This works more cleanly with types of size > 1 as well, but IIRC it basically asserts[1] (Foo.sizeof % T.sizeof == 0). So don't use this code if you're not sure it will fit exactly into a T[].

 [1]: Statically? Don't recall at the moment...

I have been using Kirk's version. Frits' variant is interesting (thanks!). Still, is there a reason not to allow a direct cast to ubyte[]? I don't see much of a problem and I suppose it wouldn't be much work (no?). It seems so much cleaner and readable to just cast it directly.
Dec 12 2006
next sibling parent Gregor Richards <Richards codu.org> writes:
Luís Marques wrote:
 Frits van Bommel wrote:
 
 Kirk McDonald wrote:

 Foo f;
 ubyte[] u = (cast(ubyte*)&f)[0 .. Foo.sizeof];

 Or:

 Foo f;
 ubyte[] u = cast(ubyte[])((&f)[0 .. 1]);

 This works more cleanly with types of size > 1 as well, but IIRC it basically asserts[1] (Foo.sizeof % T.sizeof == 0). So don't use this code if you're not sure it will fit exactly into a T[].

 [1]: Statically? Don't recall at the moment...

 I have been using Kirk's version. Frits' variant is interesting (thanks!). Still, is there a reason not to allow a direct cast to ubyte[]? I don't see much of a problem and I suppose it wouldn't be much work (no?). It seems so much cleaner and more readable to just cast it directly.

I agree. There are plenty of ways to cast it, but they're fairly arcane. This is useful enough, and so long as it's explicit I don't think it's at all confusing.

Perhaps as an alternative (idea stolen from #d), some_struct.bytearray as an implicit property.

 - Gregor Richards
Dec 12 2006
prev sibling parent reply John Demme <me teqdruid.com> writes:
Luís Marques wrote:

 Frits van Bommel wrote:
 Kirk McDonald wrote:
 Foo f;
 ubyte[] u = (cast(ubyte*)&f)[0 .. Foo.sizeof];

 Or:

 Foo f;
 ubyte[] u = cast(ubyte[])((&f)[0 .. 1]);

 This works more cleanly with types of size > 1 as well, but IIRC it basically asserts[1] (Foo.sizeof % T.sizeof == 0). So don't use this code if you're not sure it will fit exactly into a T[].

 [1]: Statically? Don't recall at the moment...

 I have been using Kirk's version. Frits' variant is interesting (thanks!). Still, is there a reason not to allow a direct cast to ubyte[]? I don't see much of a problem and I suppose it wouldn't be much work (no?). It seems so much cleaner and more readable to just cast it directly.

The reason Foo cannot be cast to ubyte[] is that they are different sizes. ubyte[] is a pointer and a length, whereas Foo is anything. It makes more sense to be able to cast from a Foo* to a ubyte*, thus the & in Kirk's code. This, however, will only fill the pointer part of the ubyte[]; it still can't know the length. The [0 .. Foo.sizeof] bit gives the dynamic array the length.

If you're looking for an operator to convert an arbitrary piece of data to an array of bytes, a cast is not what you're looking for. I agree this would be useful, though... A template in phobos would be nice.

-- 
~John Demme
me teqdruid.com
http://www.teqdruid.com/
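
To make the "pointer and a length" point concrete, here is a short sketch. The helper makeSlice is hypothetical and exists only to spell out the two halves of the slice.

```d
struct Foo
{
    int a;
    float b;
}

// A dynamic array is a (length, pointer) pair, so it occupies two machine words.
static assert((ubyte[]).sizeof == 2 * size_t.sizeof);

// Build a slice from its two halves: slicing a raw pointer supplies the length.
ubyte[] makeSlice(ubyte* ptr, size_t length)
{
    return ptr[0 .. length];
}

unittest
{
    Foo f;
    ubyte* p = cast(ubyte*) &f;   // the pointer half, from the struct's address
    ubyte[] u = makeSlice(p, Foo.sizeof);
    assert(u.ptr is p);
    assert(u.length == Foo.sizeof);
}
```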
Dec 12 2006
parent Luís Marques <luismarques+spam gmail.com> writes:
John Demme wrote:
 The reason Foo cannot be cast to ubyte[] is that they are different
 sizes.  ubyte[] is a pointer and a length, whereas Foo is anything.  It
 makes more sense to be able to cast from a Foo* to a ubyte*, thus
 the & in Kirk's code.  This, however, will only fill the pointer part of
 the ubyte[]; it still can't know the length.  The [0 .. Foo.sizeof] bit gives
 the dynamic array the length.
 
 If you're looking for an operator to convert an arbitrary piece of data to an
 array of bytes, a cast is not what you're looking for.  I agree this would
 be useful, though... A template in phobos would be nice.

Foo, being a struct, has an address and a .sizeof, giving us a pointer and a length. Why then type it explicitly, when the compiler has all the information and the cast is explicit, unambiguous and readable?
Dec 13 2006
prev sibling next sibling parent reply Alexander Panek <a.panek brainsware.org> writes:
ubyte [] toByteArray (T) (T t) {
	return (cast(ubyte *)&t)[0..T.sizeof].dup;
}
 Is there a reason for this explicit cast not to work automatically? (I 
 know that templates can alleviate this)
 
 E.g.
 
 struct Foo
 {
     int a;
     float b;
 }

Foo f;
auto b = toByteArray!(Foo)(f); // Yay!

Hope that helps. :P
 
 -- 
 Luís

Dec 13 2006
parent reply novice2 <sorry noem.ail> writes:
== Quote from Alexander Panek (a.panek brainsware.org)'s article
 ubyte [] toByteArray (T) (T t) {
 	return (cast(ubyte *)&t)[0..T.sizeof].dup;
 }

 auto b = toByteArray!(Foo)(f); // Yay!

Why do we need two parameters? Doesn't the compiler know the type of f?
Dec 13 2006
next sibling parent Alexander Panek <a.panek brainsware.org> writes:
novice2 wrote:
 [...]
 auto b = toByteArray!(Foo)(f); // Yay!

 Why do we need two parameters? Doesn't the compiler know the type of f?

That's how templates work. Another option would be to alias for frequently used types.

alias toByteArray!(Foo) FooToByteArray; // or similar
Dec 13 2006
prev sibling parent reply Endea <notknown none.com> writes:
novice2 wrote:
 == Quote from Alexander Panek (a.panek brainsware.org)'s article
 ubyte [] toByteArray (T) (T t) {
 	return (cast(ubyte *)&t)[0..T.sizeof].dup;
 }

 auto b = toByteArray!(Foo)(f); // Yay!

 Why do we need two parameters? Doesn't the compiler know the type of f?

This works:

auto b = toByteArray(f); // Yay!
Dec 13 2006
parent reply Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:
Endea wrote:
 novice2 wrote:
 == Quote from Alexander Panek (a.panek brainsware.org)'s article
 ubyte [] toByteArray (T) (T t) {
     return (cast(ubyte *)&t)[0..T.sizeof].dup;
 }

 auto b = toByteArray!(Foo)(f); // Yay!

 Why do we need two parameters? Doesn't the compiler know the type of f?

 This works:

 auto b = toByteArray(f); // Yay!

Huzzah for IFTI!

-- Chris Nicholson-Sauls
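
For anyone who has not met IFTI (implicit function template instantiation) before, a minimal sketch using the function from this thread: the explicit and the inferred call instantiate the same template. The struct Foo and the variable names in the unittest are illustrative only.

```d
struct Foo
{
    int a;
    float b;
}

// The by-value version from earlier in the thread: returns a heap copy.
ubyte[] toByteArray(T)(T t)
{
    return (cast(ubyte*) &t)[0 .. T.sizeof].dup;
}

unittest
{
    Foo f = Foo(1, 2.5f);
    ubyte[] viaExplicit = toByteArray!(Foo)(f); // T spelled out
    ubyte[] viaIfti = toByteArray(f);           // T deduced from the argument
    assert(viaExplicit == viaIfti);
}
```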
Dec 13 2006
next sibling parent Alexander Panek <a.panek brainsware.org> writes:
 This works:

 auto b = toByteArray(f); // Yay!

Huzzah for IFTI!

Huuh. :o Didn't know that'd work, actually.
Dec 13 2006
prev sibling parent reply Derek Parnell <derek nomail.afraid.org> writes:
On Wed, 13 Dec 2006 14:22:32 -0600, Chris Nicholson-Sauls wrote:

 Endea wrote:
 novice2 wrote:
 == Quote from Alexander Panek (a.panek brainsware.org)'s article
 ubyte [] toByteArray (T) (T t) {
     return (cast(ubyte *)&t)[0..T.sizeof].dup;
 }

 auto b = toByteArray!(Foo)(f); // Yay!

 Why do we need two parameters? Doesn't the compiler know the type of f?

 This works:

 auto b = toByteArray(f); // Yay!

 Huzzah for IFTI!

Yes, but unfortunately the actual function is faulty. Here is what I had to
do to get it to work ...

ubyte [] toByteArray (T) (inout T t)
{
    union ubyte_abi
    {
        ubyte[] x;
        struct
        {
           uint  xl; // length
           void* xp; // ptr
        }
    }
    ubyte_abi res;
    res.xp = cast(void*)&t;
    res.xl = T.sizeof;
    return res.x;
}

unittest
{
   struct Foo_uni
   {
      int a;
      real b;
      char[4] c;
      dchar[] d;
   }
   Foo_uni f;
   ubyte[] b;

   b = toByteArray(f);

   assert(b.length == f.sizeof);
   assert(cast(void*)(b.ptr) == cast(void *)&f);

   real c;
   b = toByteArray(c);
   assert(b.length == c.sizeof);
   assert(cast(void*)(b.ptr) == cast(void *)&c);

   class Bar_uni
   {
      int a;
      real b;
      char[4] c;
      dchar[] d;
   }
   Bar_uni g = new Bar_uni;

   b = toByteArray(g.d);
   assert(b.length == g.d.sizeof);
   assert(cast(void*)(b.ptr) == cast(void *)&g.d);

}

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocrity!"
14/12/2006 10:45:46 AM
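
A hedged variation on the union approach above, for readers on current compilers. It is a sketch and not what was posted: the length field is size_t instead of uint so that it also overlays the slice on 64-bit targets, the parameter uses `ref` (today's spelling of what the thread writes as `inout`), and the names SliceParts and toByteArrayUnion are made up for this example. It still relies on the D ABI storing the length before the pointer.

```d
// Overlay a slice with its two ABI fields (length first, then pointer).
union SliceParts
{
    ubyte[] x;
    struct
    {
        size_t xl; // length; size_t rather than uint so it matches 64-bit slices too
        void*  xp; // ptr
    }
}

ubyte[] toByteArrayUnion(T)(ref T t)
{
    SliceParts res;
    res.xp = cast(void*) &t;
    res.xl = T.sizeof;
    return res.x;
}

unittest
{
    int x = 5;
    ubyte[] b = toByteArrayUnion(x);
    assert(b.length == int.sizeof);
    assert(cast(void*) b.ptr == cast(void*) &x);
}
```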
Dec 13 2006
next sibling parent reply BCS <BCS pathlink.com> writes:
Derek Parnell wrote:
 
 
 Yes, but unfortunately the actual function is faulty. Here is what I had to
 do to get it to work ...
 
 ubyte [] toByteArray (T) (inout T t)
 {
     union ubyte_abi
     {
         ubyte[] x;
         struct
         {
            uint  xl; // length
            void* xp; // ptr
         }
     }
     ubyte_abi res;
     res.xp = cast(void*)&t;
     res.xl = T.sizeof;
     return res.x;
 }
 

What didn't work??? Unless arrays are broken, that should be the same.
Dec 13 2006
parent reply Derek Parnell <derek nomail.afraid.org> writes:
On Wed, 13 Dec 2006 20:48:43 -0800, BCS wrote:

 Derek Parnell wrote:
 
 Yes, but unfortunately the actual function is faulty. Here is what I had to
 do to get it to work ...
 
 ubyte [] toByteArray (T) (inout T t)
 {
     union ubyte_abi
     {
         ubyte[] x;
         struct
         {
            uint  xl; // length
            void* xp; // ptr
         }
     }
     ubyte_abi res;
     res.xp = cast(void*)&t;
     res.xl = T.sizeof;
     return res.x;
 }
 

 What didn't work??? Unless arrays are broken, that should be the same.

The original function was

ubyte [] toByteArray (T) (T t) {
	return (cast(ubyte *)&t)[0..T.sizeof].dup;
}

With this, the '&t' phrase takes the address of the data as passed-by-value. Which means that when passing a struct or basic type, you get the address of the copy of the data which of course is not in scope when you return from the function. The '.dup' takes another copy of the data (this time on the heap) and you return the ubyte[] reference to the second copy. So you end up with a ubyte[] array that references a copy of the struct/data you supplied to the function and doesn't reference the initial data at all.

Of course, that might be what you are trying to do ;-) I just thought that what was being attempted was to have a ubyte[] to access the data in the original struct instance and not a copy of it.

My function avoids taking copies of the data and returns a ubyte[] reference to the actual struct/data instance passed to the function. Both are invoked with identical syntax thanks to IFTI and D's handling of inout parameters.

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocrity!"
14/12/2006 4:42:50 PM
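
The difference described above can be checked directly. In this sketch the two function names are hypothetical, chosen only to keep the variants apart, and the reference version uses `ref`, the current spelling of the `inout` parameter used in the thread.

```d
// By value plus .dup: the result is an independent heap copy.
ubyte[] toByteArrayCopy(T)(T t)
{
    return (cast(ubyte*) &t)[0 .. T.sizeof].dup;
}

// By reference, no .dup: the result aliases the caller's variable.
ubyte[] toByteArrayView(T)(ref T t)
{
    return (cast(ubyte*) &t)[0 .. T.sizeof];
}

unittest
{
    int x = 0;
    ubyte[] c = toByteArrayCopy(x);
    ubyte[] v = toByteArrayView(x);

    c[] = 0xFF;        // only the heap copy changes
    assert(x == 0);

    v[] = 0xFF;        // every byte of x itself is now 0xFF
    assert(x == -1);
}
```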
Dec 13 2006
next sibling parent reply Hasan Aljudy <hasan.aljudy gmail.com> writes:
Derek Parnell wrote:
 On Wed, 13 Dec 2006 20:48:43 -0800, BCS wrote:
 
 Derek Parnell wrote:
 Yes, but unfortunately the actual function is faulty. Here is what I had to
 do to get it to work ...

 ubyte [] toByteArray (T) (inout T t)
 {
     union ubyte_abi
     {
         ubyte[] x;
         struct
         {
            uint  xl; // length
            void* xp; // ptr
         }
     }
     ubyte_abi res;
     res.xp = cast(void*)&t;
     res.xl = T.sizeof;
     return res.x;
 }

 Unless arrays are broken, that should be the same.

 The original function was

 ubyte [] toByteArray (T) (T t) {
 	return (cast(ubyte *)&t)[0..T.sizeof].dup;
 }

 With this, the '&t' phrase takes the address of the data as passed-by-value. Which means that when passing a struct or basic type, you get the address of the copy of the data which of course is not in scope when you return from the function. The '.dup' takes another copy of the data (this time on the heap) and you return the ubyte[] reference to the second copy. So you end up with a ubyte[] array that references a copy of the struct/data you supplied to the function and doesn't reference the initial data at all.

 Of course, that might be what you are trying to do ;-) I just thought that what was being attempted was to have a ubyte[] to access the data in the original struct instance and not a copy of it.

 My function avoids taking copies of the data and returns a ubyte[] reference to the actual struct/data instance passed to the function. Both are invoked with identical syntax thanks to IFTI and D's handling of inout parameters.

But why did you need the union/struct trick?
And with this, isn't the compiler free to rearrange struct fields??

     union ubyte_abi
     {
         ubyte[] x;
         struct //isn't the compiler free to rearrange struct fields??
         {
            uint  xl; // length
            void* xp; // ptr
         }
     }
Dec 13 2006
parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Hasan Aljudy wrote:
 

 And with this, isn't the compiler free to rearrange struct fields??

Nope. See http://www.digitalmars.com/d/class.html under "Fields":

"Explicit control of field layout is provided by struct/union types, not classes."

--bb
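
One way to see the declaration-order layout for yourself is the .offsetof property. A small sketch, with a hypothetical struct Pair mirroring the two fields of the union discussed above:

```d
struct Pair
{
    uint  xl; // declared first
    void* xp; // declared second
}

// Struct fields are laid out in declaration order: xl sits at offset 0
// and xp follows it (after any alignment padding).
static assert(Pair.xl.offsetof == 0);
static assert(Pair.xp.offsetof >= uint.sizeof);
```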
Dec 14 2006
prev sibling parent reply BCS <fromnowhere pathlink.com> writes:
== Quote from Derek Parnell (derek nomail.afraid.org)'s article
 With this, the '&t' phrase takes the address of the data as
 passed-by-value. [...] So you end up with a ubyte[]
 array that references a copy of the struct/data you supplied
 [...] Of course, that might be what you are trying to do ;-)

I get it now! The kind of things I would use it for would only care about the copy from a performance standpoint.

OTOH: would this work?

ubyte [] toByteArray (T) (inout T t)
{
	return (cast(ubyte *)&t)[0..T.sizeof].dup;
}
Dec 14 2006
parent Derek Parnell <derek nomail.afraid.org> writes:
On Thu, 14 Dec 2006 18:10:56 +0000 (UTC), BCS wrote:

 == Quote from Derek Parnell (derek nomail.afraid.org)'s article
 With this, the '&t' phrase takes the address of the data as
 passed-by-value. [...] So you end up with a ubyte[]
 array that references a copy of the struct/data you supplied
 [...] Of course, that might be what you are trying to do ;-)

 I get it now! The kind of things I would use it for would only care about the copy from a performance standpoint.

 OTOH: would this work?

 ubyte [] toByteArray (T) (inout T t)
 {
 	return (cast(ubyte *)&t)[0..T.sizeof].dup;
 }

LOL ... That is just *so* much better than my complex method. If you remove the '.dup', which gets the exact effect of my routine, your generated machine code is very optimized.

Your version:
        mov     EDX,EAX
        mov     EAX,01Ch
        ret

My version:
        enter   8,0
        push    EBX
        mov     EDX,__init...ubyte_abi[04h]
        mov     EBX,__init...ubyte_abi
        mov     -8[EBP],EBX
        mov     -4[EBP],EDX
        mov     -4[EBP],EAX
        mov     dword ptr -8[EBP],01Ch
        mov     EDX,-4[EBP]
        mov     EAX,-8[EBP]
        pop     EBX
        leave
        ret

So I'll stick to this now ...

ubyte [] toByteArray (T) (inout T t)
{
    return (cast(ubyte *)&t)[0..T.sizeof];
}

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocrity!"
15/12/2006 12:00:30 PM
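
A short usage sketch of the final function, written with `ref` in place of `inout` so it compiles with current D. The struct Foo and the byte-for-byte copy are illustrative assumptions, standing in for the memcpy-style use that comes up when porting C code.

```d
struct Foo
{
    int a;
    float b;
}

// The thread's final version: a view of t's bytes, no copy.
ubyte[] toByteArray(T)(ref T t)
{
    return (cast(ubyte*) &t)[0 .. T.sizeof];
}

unittest
{
    Foo src = Foo(42, 1.5f);
    Foo dst;

    // Copy src into dst byte by byte through the two views.
    ubyte[] to = toByteArray(dst);
    ubyte[] from = toByteArray(src);
    to[] = from[];

    assert(dst.a == 42);
    assert(dst.b == 1.5f);
}
```

Because the result is a view, it must not outlive the variable passed in; append .dup at the call site when an independent copy is wanted.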
Dec 14 2006
prev sibling parent Luís Marques <luismarques+spam gmail.com> writes:
Derek Parnell wrote:
 Yes, but unfortunately the actual function is faulty. Here is what I had to
 do to get it to work ...
 
 ubyte [] toByteArray (T) (inout T t)
 {
     union ubyte_abi
     {
         ubyte[] x;
         struct
         {
            uint  xl; // length
            void* xp; // ptr
         }
     }
     ubyte_abi res;
     res.xp = cast(void*)&t;
     res.xl = T.sizeof;
     return res.x;
 }
 
 unittest
 {
    struct Foo_uni
    {
       int a;
       real b;
       char[4] c;
       dchar[] d;
    }
    Foo_uni f;
    ubyte[] b;
 
    b = toByteArray(f);
 
    assert(b.length == f.sizeof);
    assert(cast(void*)(b.ptr) == cast(void *)&f);
 
    real c;
    b = toByteArray(c);
    assert(b.length == c.sizeof);
    assert(cast(void*)(b.ptr) == cast(void *)&c);
 
    class Bar_uni
    {
       int a;
       real b;
       char[4] c;
       dchar[] d;
    }
    Bar_uni g = new Bar_uni;
 
    b = toByteArray(g.d);
    assert(b.length == g.d.sizeof);
    assert(cast(void*)(b.ptr) == cast(void *)&g.d);
 
 }

Amazing. Thanks. Yet, by this time, don't you think it would be best if the compiler worked out the cast to ubyte[] on its own?
Dec 14 2006
prev sibling parent Hasan Aljudy <hasan.aljudy gmail.com> writes:
Luís Marques wrote:
 Hello,
 
 Converting a structure to ubyte[] or similar is something that I have 
 been doing frequently while converting C to D code.
 
 Can we have a "cast(type[])" working for structures, please?
 
 I suppose that type should include at least ubyte, byte and char, if not 
 all the basic data types.

Be careful with char: D char type is not a byte, it's a UTF-8 code unit. Don't try to use char[] for storing raw data.
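
A sketch of why this matters, using std.utf.validate from Phobos. The helper name isValidUtf8 is hypothetical. Arbitrary bytes are usually not valid UTF-8, and char even has a different default value than ubyte.

```d
import std.utf : validate, UTFException;

// char defaults to 0xFF (an invalid UTF-8 code unit); ubyte defaults to 0.
static assert(char.init == 0xFF);
static assert(ubyte.init == 0);

// Returns true only if the raw bytes happen to form valid UTF-8.
bool isValidUtf8(const(ubyte)[] raw)
{
    try
    {
        validate(cast(const(char)[]) raw);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

unittest
{
    ubyte[] text = [0x41, 0x42];      // "AB"
    ubyte[] binary = [0xFF, 0xFE];    // not UTF-8
    assert(isValidUtf8(text));
    assert(!isValidUtf8(binary));
}
```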
Dec 13 2006