
digitalmars.D.learn - convert ubyte[k..k + 1] to int

reply "ref2401" <refactor24 gmail.com> writes:
i have an array of ubytes. how can i convert two adjacent ubytes 
from the array to an integer?

pseudocode example:
ubyte[5] array = createArray();
int value = array[2..3];

is there any 'memcpy' method or something else to do this?
May 16 2012
next sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Wed, 16 May 2012 15:24:33 +0100, ref2401 <refactor24 gmail.com> wrote:

 i have an array of ubytes. how can i convert two adjacent ubytes from  
 the array to an integer?

 pseudocode example:
 ubyte[5] array = createArray();
 int value = array[2..3];

 is there any 'memcpy' method or something else to do this?

You don't need to "copy" the data; just tell the compiler to "pretend" it's a short (in this case, for 2 bytes), then assign the value to an int, e.g.:

    import std.stdio;

    void main()
    {
        ubyte[5] array = [ 0xFF, 0xFF, 0x01, 0x00, 0xFF ];
        int value = *cast(short*)array[2..4].ptr;
        writefln("Result = %s", value);
    }

The line:

    int value = *cast(short*)array[2..4].ptr;

1. slices 2 bytes from the array (D slices exclude the upper bound, so [2..4] covers indices 2 and 3)
2. obtains the ptr to them
3. casts the ptr to short*
4. copies the value pointed at by the short* ptr to an int

You may need to worry about little/big endian issues, see:
http://en.wikipedia.org/wiki/Endianness

The above code outputs "Result = 1" on my little-endian x86 desktop machine but would output "Result = 256" on a big-endian machine.

R
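A minimal sketch of an alternative that avoids the pointer cast altogether: build the value from the two bytes with shifts, stating the byte order explicitly so the result is the same on every host. The helper names `readLE16` and `readBE16` are hypothetical and not part of Phobos.

```d
import std.stdio;

// Hypothetical helpers: combine the two adjacent bytes starting at index i.
// The byte order is spelled out, so the result does not depend on the host CPU.
ushort readLE16(const(ubyte)[] data, size_t i)
{
    // low byte first (little-endian layout)
    return cast(ushort)(data[i] | (data[i + 1] << 8));
}

ushort readBE16(const(ubyte)[] data, size_t i)
{
    // high byte first (big-endian layout)
    return cast(ushort)((data[i] << 8) | data[i + 1]);
}

void main()
{
    ubyte[5] array = [0xFF, 0xFF, 0x01, 0x00, 0xFF];
    int le = readLE16(array[], 2); // bytes 0x01, 0x00 read as little-endian
    int be = readBE16(array[], 2); // the same bytes read as big-endian
    writefln("little-endian: %s, big-endian: %s", le, be);
}
```

Because only single bytes are loaded, no alignment requirement is involved, and indexing stays bounds-checked.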
May 16 2012
parent Artur Skawina <art.08.09 gmail.com> writes:
On 05/17/12 10:15, Roman D. Boiko wrote:
 On Thursday, 17 May 2012 at 07:07:58 UTC, Roman D. Boiko wrote:
 And what about the following code:

 import std.bitmanip, std.system, std.traits;

 // This implementation is optimized for speed via swapping endianness in-place
 pure immutable(C)[] fixEndian(C, Endian blobEndian = endian)(ubyte[] blob)
     if(is(CharTypeOf!C))
 {
     auto data = cast(C[]) blob;
     static if(blobEndian != endian)
     {
         static assert(!is(C == char)); // UTF-8 doesn't have endianness
         foreach(ref ch; data) ch = swapEndian(ch);
     }
     return cast(immutable) data;
 }

 I mean, is it safe (assuming that we are allowed to mutate blob, and its length is a multiple of C.sizeof)?

 I do casting from ubyte[] to C[].

Only if blob.ptr ends up properly aligned for the element type C. There are also aliasing issues, which I don't think are sufficiently defined for D (in the C language, it would be legal only because char* is allowed to alias anything).

artur
May 17 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, May 16, 2012 17:03:44 Regan Heath wrote:
 You don't need to "copy" the data; just tell the compiler to "pretend" it's a short (in this case, for 2 bytes), then assign the value to an int, e.g.:

     import std.stdio;

     void main()
     {
         ubyte[5] array = [ 0xFF, 0xFF, 0x01, 0x00, 0xFF ];
         int value = *cast(short*)array[2..4].ptr;
         writefln("Result = %s", value);
     }

 The line:

     int value = *cast(short*)array[2..4].ptr;

 1. slices 2 bytes from the array (D slices exclude the upper bound, so [2..4] covers indices 2 and 3)
 2. obtains the ptr to them
 3. casts the ptr to short*
 4. copies the value pointed at by the short* ptr to an int

 You may need to worry about little/big endian issues, see:
 http://en.wikipedia.org/wiki/Endianness

 The above code outputs "Result = 1" on my little-endian x86 desktop machine but would output "Result = 256" on a big-endian machine.

As long as you're going from big endian to little endian, std.bitmanip.bigEndianToNative will do the conversion fairly easily, but if they're in little endian, then the nasty casting is the way to go.

- Jonathan M Davis
May 16 2012
prev sibling next sibling parent "Robert DaSilva" <spunit262 yahoo.com> writes:
On Wednesday, 16 May 2012 at 18:47:55 UTC, Jonathan M Davis wrote:
 As long as you're going from big endian to little endian, std.bitmanip.bigEndianToNative will do the conversion fairly easily, but if they're in little endian, then the nasty casting is the way to go.

 - Jonathan M Davis

Except they don't take slices. You need a helper function.

    ubyte[2] _2(ubyte[] a)
    {
        ubyte[2] b;
        assert(a.length == 2);
        b[] = a;
        return b;
    }
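One way the helper idea might look in a complete program, assuming the slice bounds are known to select exactly two bytes. The copy into a local `ubyte[2]` (the name `pair` is illustrative) plays the role of the `_2` helper; `bigEndianToNative` and `littleEndianToNative` are the std.bitmanip functions mentioned in this thread.

```d
import std.bitmanip : bigEndianToNative, littleEndianToNative;
import std.stdio;

void main()
{
    ubyte[5] array = [0xFF, 0xFF, 0x01, 0x00, 0xFF];

    // Copy the two bytes of interest into a fixed-size array,
    // which is the argument type the conversion functions expect.
    ubyte[2] pair = array[2 .. 4];

    // Interpret the pair as stored big-endian (0x01 is the high byte).
    int fromBig = bigEndianToNative!ushort(pair);

    // Interpret the pair as stored little-endian (0x01 is the low byte).
    int fromLittle = littleEndianToNative!ushort(pair);

    writefln("big: %s, little: %s", fromBig, fromLittle);
}
```

Using `ushort` rather than `short` keeps values above 0x7FFF from turning negative when widened to `int`; whether that is wanted depends on the data.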
May 16 2012
prev sibling next sibling parent Andrew Wiley <wiley.andrew.j gmail.com> writes:
On Wed, May 16, 2012 at 11:03 AM, Regan Heath <regan netmail.co.nz> wrote:

 You don't need to "copy" the data; just tell the compiler to "pretend" it's a short (in this case, for 2 bytes), then assign the value to an int, e.g.:

     import std.stdio;

     void main()
     {
         ubyte[5] array = [ 0xFF, 0xFF, 0x01, 0x00, 0xFF ];
         int value = *cast(short*)array[2..4].ptr;
         writefln("Result = %s", value);
     }

 The line:

     int value = *cast(short*)array[2..4].ptr;

 1. slices 2 bytes from the array (D slices exclude the upper bound, so [2..4] covers indices 2 and 3)
 2. obtains the ptr to them
 3. casts the ptr to short*
 4. copies the value pointed at by the short* ptr to an int

 You may need to worry about little/big endian issues, see:
 http://en.wikipedia.org/wiki/Endianness

 The above code outputs "Result = 1" on my little-endian x86 desktop machine but would output "Result = 256" on a big-endian machine.

 R

Unfortunately, this is undefined behavior because you're breaking alignment rules. On x86, this will just cause a slow load from memory. On ARM, this will either crash your program with a bus error on newer hardware or give you a gibberish value on ARMv6 and older.

Declaring a short, getting a pointer to it, and casting that pointer to a ubyte* to copy into it is fine, but casting a ubyte* to a short* will cause a 2-byte load from a 1-byte aligned address, which leads down the yellow brick road to pain.
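The safe direction described in this reply (copy the bytes into a real short) could be sketched as follows, assuming the C runtime's memcpy reached through `core.stdc.string`; the variable name `tmp` is illustrative. The result is still in host byte order, so the endianness caveat from earlier in the thread applies.

```d
import core.stdc.string : memcpy;
import std.stdio;

void main()
{
    ubyte[5] array = [0xFF, 0xFF, 0x01, 0x00, 0xFF];

    // The destination is a genuine short, so it is correctly aligned.
    // memcpy moves the data byte-wise and never performs a 2-byte load
    // from the (possibly odd) source address.
    short tmp;
    memcpy(&tmp, array.ptr + 2, tmp.sizeof);

    // Host byte order: 1 on a little-endian machine, 256 on a big-endian one.
    int value = tmp;
    writefln("Result = %s", value);
}
```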
May 16 2012
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, May 16, 2012 at 10:25:51PM -0500, Andrew Wiley wrote:
 Unfortunately, this is undefined behavior because you're breaking alignment rules. On x86, this will just cause a slow load from memory. On ARM, this will either crash your program with a bus error on newer hardware or give you a gibberish value on ARMv6 and older.

 Declaring a short, getting a pointer to it, and casting that pointer to a ubyte* to copy into it is fine, but casting a ubyte* to a short* will cause a 2-byte load from a 1-byte aligned address, which leads down the yellow brick road to pain.

Do unions suffer from this problem? Could this prevent alignment problems:

        short bytesToShort(ubyte[] b)
        in { assert(b.length==2); }
        body {
                union U {
                        short val;
                        ubyte[2] b;
                }
                U u;

                u.b[] = b[];
                return u.val;
        }

?

T

-- 
Береги платье снову, а здоровье смолоду.
May 16 2012
prev sibling next sibling parent Andrew Wiley <wiley.andrew.j gmail.com> writes:
On Wed, May 16, 2012 at 11:07 PM, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:

 Do unions suffer from this problem? Could this prevent alignment problems:

        short bytesToShort(ubyte[] b)
        in { assert(b.length==2); }
        body {
                union U {
                        short val;
                        ubyte[2] b;
                }
                U u;

                u.b[] = b[];
                return u.val;
        }

 ?

As I understand it, this should be fine because the compiler will guarantee that the union is aligned to the maximum alignment required by one of its members, which is the short. This is probably the safest solution.
May 16 2012
prev sibling next sibling parent "Roman D. Boiko" <rb d-coding.com> writes:
On Thursday, 17 May 2012 at 04:16:10 UTC, Andrew Wiley wrote:
 On Wed, May 16, 2012 at 11:07 PM, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:
 Do unions suffer from this problem? Could this prevent alignment problems:

        short bytesToShort(ubyte[] b)
        in { assert(b.length==2); }
        body {
                union U {
                        short val;
                        ubyte[2] b;
                }
                U u;

                u.b[] = b[];
                return u.val;
        }

 ?

 As I understand it, this should be fine because the compiler will guarantee that the union is aligned to the maximum alignment required by one of its members, which is the short. This is probably the safest solution.

And what about the following code:

    import std.bitmanip, std.system, std.traits;

    // This implementation is optimized for speed via swapping endianness in-place
    pure immutable(C)[] fixEndian(C, Endian blobEndian = endian)(ubyte[] blob)
        if(is(CharTypeOf!C))
    {
        auto data = cast(C[]) blob;
        static if(blobEndian != endian)
        {
            static assert(!is(C == char)); // UTF-8 doesn't have endianness
            foreach(ref ch; data) ch = swapEndian(ch);
        }
        return cast(immutable) data;
    }
May 17 2012
prev sibling next sibling parent "Roman D. Boiko" <rb d-coding.com> writes:
On Thursday, 17 May 2012 at 07:07:58 UTC, Roman D. Boiko wrote:
 And what about the following code:

 import std.bitmanip, std.system, std.traits;

 // This implementation is optimized for speed via swapping endianness in-place
 pure immutable(C)[] fixEndian(C, Endian blobEndian = endian)(ubyte[] blob)
     if(is(CharTypeOf!C))
 {
     auto data = cast(C[]) blob;
     static if(blobEndian != endian)
     {
         static assert(!is(C == char)); // UTF-8 doesn't have endianness
         foreach(ref ch; data) ch = swapEndian(ch);
     }
     return cast(immutable) data;
 }

I mean, is it safe (assuming that we are allowed to mutate blob, and its length is a multiple of C.sizeof)? I do casting from ubyte[] to C[].
May 17 2012
prev sibling next sibling parent "Roman D. Boiko" <rb d-coding.com> writes:
On Thursday, 17 May 2012 at 08:39:21 UTC, Artur Skawina wrote:
 On 05/17/12 10:15, Roman D. Boiko wrote:
 I mean, is it safe (assuming that we are allowed to mutate 
 blob, and its length is a multiple of C.sizeof)?
 
 I do casting from ubyte[] to C[].

 Only if blob.ptr ends up properly aligned for the element type C. There are also aliasing issues, which I don't think are sufficiently defined for D (in the C language, it would be legal only because char* is allowed to alias anything).

 artur

Is it possible to ensure that? In my case blob is created as

    auto blob = cast(ubyte[]) read(fileName);

I assume that alignment is safe. But what should I do to be safe in the general case?
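A rough sketch of one general-purpose answer, assuming it is acceptable to fall back to a copy when the buffer is not suitably aligned. `canReinterpretAs` and `asWchars` are hypothetical names, and the sketch only addresses alignment, not the aliasing question raised in the quoted reply.

```d
import std.stdio;

// Hypothetical helper: true if the buffer starts at an address that satisfies
// T's alignment and its length is a whole number of T elements.
bool canReinterpretAs(T)(const(ubyte)[] blob)
{
    return (cast(size_t) blob.ptr) % T.alignof == 0
        && blob.length % T.sizeof == 0;
}

// Hypothetical helper: returns a view of blob as wchar[] when the alignment
// allows it, otherwise copies the bytes into a newly allocated wchar[].
wchar[] asWchars(ubyte[] blob)
{
    assert(blob.length % wchar.sizeof == 0);
    if (canReinterpretAs!wchar(blob))
        return cast(wchar[]) blob;

    // A freshly allocated wchar[] is aligned for wchar by construction.
    auto copy = new wchar[blob.length / wchar.sizeof];
    (cast(ubyte[]) copy)[] = blob[];
    return copy;
}

void main()
{
    auto buf = new ubyte[9];
    auto odd = buf[1 .. $]; // 8 bytes, starting one byte into the allocation
    auto w = asWchars(odd);
    assert(w.length == 4);
    assert((cast(size_t) w.ptr) % wchar.alignof == 0);
    writeln("code units: ", w.length);
}
```

The copy costs an allocation, so in-place swapping is kept for the aligned case and only the misaligned case pays for it.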
May 17 2012
prev sibling parent Luis Panadero Guardeño <luis.panadero gmail.com> writes:
ref2401 wrote:

 i have an array of ubytes. how can i convert two adjacent ubytes
 from the array to an integer?
 
 pseudocode example:
 ubyte[5] array = createArray();
 int value = array[2..3];
 
 is there any 'memcpy' method or something else to do this?

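Try littleEndianToNative or bigEndianToNative, but check the endianness of the data that you want to convert to an int. These functions take a fixed-size array whose length matches the target type, so copy the two bytes into a ubyte[2] first. For example:

    import std.bitmanip;

    ubyte[5] array = createArray();
    ubyte[2] pair = array[2 .. 4]; // the two adjacent bytes to convert
    int value = littleEndianToNative!ushort(pair);
    // or:
    // int value = bigEndianToNative!ushort(pair);

-- 
I'm afraid that I have a blog: http://zardoz.es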
May 21 2012