digitalmars.D.learn - Convert some ints into a byte array without allocations?

Samson Smith <fsdf dsfd.com> writes:

I'm trying to make a fast little function that'll give me a 
random looking (but deterministic) value from an x,y position on 
a grid. I'm just going to run each co-ord that I need through an 
FNV-1a hash function as an array of bytes since that seems like a 
fast and easy way to go. I'm going to need to do this a lot and 
quickly for a real time application so I don't want to waste a 
lot of cycles converting data or allocating space for an array.

In a nutshell how do I cast an int into a byte array?

I tried this:

byte[] bytes = cast(byte[])x;
 Error: cannot cast expression x of type int to byte[]

What should I be doing instead?

Jan 16 2016

Yazan D <invalid email.com> writes:

On Sat, 16 Jan 2016 14:34:54 +0000, Samson Smith wrote:

 I'm trying to make a fast little function that'll give me a random
 looking (but deterministic) value from an x,y position on a grid. I'm
 just going to run each co-ord that I need through an FNV-1a hash
 function as an array of bytes since that seems like a fast and easy way
 to go. I'm going to need to do this a lot and quickly for a real time
 application so I don't want to waste a lot of cycles converting data or
 allocating space for an array.
 
 In a nutshell how do I cast an int into a byte array?
 
 I tried this:
 
 byte[] bytes = cast(byte[])x;
 Error: cannot cast expression x of type int to byte[]

 
 What should I be doing instead?

You can do this:
ubyte[] b = (cast(ubyte*) &a)[0 .. int.sizeof];

It is casting the pointer to `a` to a ubyte (or byte) pointer and then 
taking a slice the size of int.
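
A minimal sketch of what that slice is, assuming a local `int` named `a` as in the line above. The slice is a view of the bytes of `a`, not a copy, so nothing is allocated and writes through the slice change `a`. The asserts avoid checking individual byte values, since those depend on the machine's byte order.

```d
unittest
{
    int a = 0x01020304;

    // Reinterpret the address of `a` as a ubyte pointer, then slice
    // exactly a.sizeof bytes from it. No allocation takes place.
    ubyte[] b = (cast(ubyte*) &a)[0 .. a.sizeof];

    assert(b.length == int.sizeof);
    assert(b.ptr == cast(ubyte*) &a); // the slice points at `a` itself

    // Because the slice aliases `a`, zeroing it zeroes `a`.
    b[] = 0;
    assert(a == 0);
}
```

This can be tried with `dmd -main -unittest -run file.d`.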

Jan 16 2016

Yazan D <invalid email.com> writes:

On Sat, 16 Jan 2016 14:42:27 +0000, Yazan D wrote:
 
 You can do this:
 ubyte[] b = (cast(ubyte*) &a)[0 .. int.sizeof];
 
 It is casting the pointer to `a` to a ubyte (or byte) pointer and then
 taking a slice the size of int.

You can also use a union:

union Foo
{
  int i;
  ubyte[4] b;
}

// write to int part
Foo f = Foo(a);
// then read from ubyte part
writeln(f.b);

ps. I am not sure of the aliasing rules in D for unions. In C, this is 
allowed, but in C++, this is undefined behaviour AFAIK.

Jan 16 2016

tsbockman <thomas.bockman gmail.com> writes:

On Saturday, 16 January 2016 at 14:46:47 UTC, Yazan D wrote:
 You can also use a union:

 union Foo
 {
   int i;
   ubyte[4] b;
 }

 // write to int part
 Foo f = Foo(a);
 // then read from ubyte part
 writeln(f.b);

 ps. I am not sure of the aliasing rules in D for unions. In C, 
 this is allowed, but in C++, this is undefined behaviour AFAIK.

I sure hope it's not undefined behaviour in D, seeing as this 
technique is used several places in the standard library.

Jan 16 2016

bearophile <bearophileHUGS lycos.com> writes:

Yazan D:

On Saturday, 16 January 2016 at 14:42:27 UTC, Yazan D wrote:
 ubyte[] b = (cast(ubyte*) &a)[0 .. int.sizeof];

Better to use the actual size:

ubyte[] b = (cast(ubyte*) &a)[0 .. a.sizeof];

Bye,
bearophile

Jan 16 2016

Samson Smith <fsdf dsfd.com> writes:

On Saturday, 16 January 2016 at 15:42:39 UTC, bearophile wrote:
 Yazan D:

 On Saturday, 16 January 2016 at 14:42:27 UTC, Yazan D wrote:
 ubyte[] b = (cast(ubyte*) &a)[0 .. int.sizeof];

 Better to use the actual size:

 ubyte[] b = (cast(ubyte*) &a)[0 .. a.sizeof];

 Bye,
 bearophile

Good thinking, I won't have to change it around if I change the 
type of my co-ords later.

Thanks :)

Jan 16 2016

Samson Smith <fsdf dsfd.com> writes:

On Saturday, 16 January 2016 at 14:42:27 UTC, Yazan D wrote:
 On Sat, 16 Jan 2016 14:34:54 +0000, Samson Smith wrote:

 [...]

 You can do this:
 ubyte[] b = (cast(ubyte*) &a)[0 .. int.sizeof];

 It is casting the pointer to `a` to a ubyte (or byte) pointer 
 and then taking a slice the size of int.

This seems to work. Thank you!

Jan 16 2016

Johannes Pfau <nospam example.com> writes:

On Sat, 16 Jan 2016 15:46:00 +0000,
Samson Smith <fsdf dsfd.com> wrote:

 On Saturday, 16 January 2016 at 14:42:27 UTC, Yazan D wrote:
 On Sat, 16 Jan 2016 14:34:54 +0000, Samson Smith wrote:
  
 [...]  

 You can do this:
 ubyte[] b = (cast(ubyte*) &a)[0 .. int.sizeof];

 It is casting the pointer to `a` to a ubyte (or byte) pointer 
 and then taking a slice the size of int.  

 
 This seems to work. Thank you!

You need to be careful with that code though. As you're taking the
address of the a variable, b.ptr will point to a. If a is on the stack
you must make sure you do not escape the b reference.


Another option is using static arrays:
ubyte[a.sizeof] b = *(cast(ubyte[a.sizeof]*)&a);

Static arrays are value types. Whenever you pass b to a function it's
copied and you don't have to worry about the lifetime of a.

This pointer cast (int => ubyte[4]) is safe, but the inverse operation,
casting from ubyte[4] to int, is not safe. For the inverse operation
you'd have to use unions as shown in Yazan's response.
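
A small sketch of the static array variant, wrapped in a helper so the value semantics are easy to see. The name `toBytes` is made up for this example and is not a library function. The result is a copy, so it stays valid and unchanged no matter what later happens to the original variable.

```d
// Hypothetical helper: copies the bytes of `value` into a fixed-size
// array. The returned static array is a value type, not a reference.
ubyte[T.sizeof] toBytes(T)(T value)
{
    return *(cast(ubyte[T.sizeof]*) &value);
}

unittest
{
    int a = 42;
    auto b = toBytes(a);
    static assert(is(typeof(b) == ubyte[4]));

    // Changing `a` afterwards does not touch the copy in `b`.
    a = 0;
    // 42 sits in the first byte on little endian, the last on big endian.
    assert(b[0] == 42 || b[3] == 42);

    // The length follows the argument type automatically.
    static assert(toBytes(1L).length == 8);
}
```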

Jan 16 2016

Jonathan M Davis via Digitalmars-d-learn writes:

On Saturday, January 16, 2016 14:34:54 Samson Smith via Digitalmars-d-learn
wrote:
 I'm trying to make a fast little function that'll give me a
 random looking (but deterministic) value from an x,y position on
 a grid. I'm just going to run each co-ord that I need through an
 FNV-1a hash function as an array of bytes since that seems like a
 fast and easy way to go. I'm going to need to do this a lot and
 quickly for a real time application so I don't want to waste a
 lot of cycles converting data or allocating space for an array.

 In a nutshell how do I cast an int into a byte array?

 I tried this:

 byte[] bytes = cast(byte[])x;
 Error: cannot cast expression x of type int to byte[]

 What should I be doing instead?

For this particular case, since you're hashing rather than doing something
like putting the resulting value on the wire, the cast that others suggested
may very well be the way to go, but the typesafe way to do the conversion
would be to use std.bitmanip.

    int i = 12345;
    auto arr = nativeToBigEndian(i);

where the result is ubyte[4], because the argument was an int. If it had
been a long, it would have been ubyte[8]. So, you avoid bugs where you get
the sizes wrong. The only reason that I can think of to _not_ do this in
your case would be speed, simply because you don't care about swapping the
endianness like you would when sending the data via a socket or whatnot. Of
course, if you knew that you were always going to be on little endian
machines, you could also use nativeToLittleEndian to avoid the swap, though
that still might be slower than a simple cast depending on the optimizer (it
uses a union internally).

But it will be less error-prone to use those functions, and if you _do_
actually need to swap endianness, then they're exactly what you should be
using. We've had cases that have come up where using those functions
prevented bugs precisely because the person writing the code got the sizes
wrong (and the compiler complained, since nativeToBigEndian and friends
deal with the sizes in a typesafe manner).
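
As a rough illustration of the above, here is a sketch that uses only `nativeToBigEndian`, `nativeToLittleEndian` and `bigEndianToNative` from std.bitmanip. The expected byte values follow from 12345 being 0x3039, and they are the same on every machine because the functions fix the byte order.

```d
import std.bitmanip : bigEndianToNative, nativeToBigEndian,
                      nativeToLittleEndian;

unittest
{
    int i = 12345; // 0x00003039

    // Most significant byte first, regardless of the host machine.
    ubyte[4] be = nativeToBigEndian(i);
    ubyte[4] expectedBe = [0x00, 0x00, 0x30, 0x39];
    assert(be == expectedBe);

    // Least significant byte first, regardless of the host machine.
    ubyte[4] le = nativeToLittleEndian(i);
    ubyte[4] expectedLe = [0x39, 0x30, 0x00, 0x00];
    assert(le == expectedLe);

    // The result size follows the argument type: a long gives ubyte[8].
    long l = 1;
    static assert(is(typeof(nativeToBigEndian(l)) == ubyte[8]));

    // Round trip back to the native representation.
    assert(bigEndianToNative!int(be) == i);
}
```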

- Jonathan M Davis

Jan 16 2016

Samson Smith <fsdf dsfd.com> writes:

On Saturday, 16 January 2016 at 16:28:21 UTC, Jonathan M Davis 
wrote:
 On Saturday, January 16, 2016 14:34:54 Samson Smith via 
 Digitalmars-d-learn wrote:
 I'm trying to make a fast little function that'll give me a 
 random looking (but deterministic) value from an x,y position 
 on a grid. I'm just going to run each co-ord that I need 
 through an FNV-1a hash function as an array of bytes since 
 that seems like a fast and easy way to go. I'm going to need 
 to do this a lot and quickly for a real time application so I 
 don't want to waste a lot of cycles converting data or 
 allocating space for an array.

 In a nutshell how do I cast an int into a byte array?

 I tried this:

 byte[] bytes = cast(byte[])x;
 Error: cannot cast expression x of type int to byte[]

 What should I be doing instead?

 For this particular case, since you're hashing rather than 
 doing something like putting the resulting value on the wire, 
 the cast that others suggested may very well be the way to go, 
 but the typesafe way to do the conversion would be to use 
 std.bitmanip.

     int i = 12345;
     auto arr = nativeToBigEndian(i);

 where the result is ubyte[4], because the argument was an int. 
 If it had been a long, it would have been ubyte[8]. So, you 
 avoid bugs where you get the sizes wrong. The only reason that 
 I can think of to _not_ do this in your case would be speed, 
 simply because you don't care about swapping the endianness 
 like you would when sending the data via a socket or whatnot. 
 Of course, if you knew that you were always going to be on 
 little endian machines, you could also use nativeToLittleEndian 
 to avoid the swap, though that still might be slower than a 
 simple cast depending on the optimizer (it uses a union 
 internally).

 But it will be less error-prone to use those functions, and if 
 you _do_ actually need to swap endianness, then they're exactly 
 what you should be using. We've had cases that have come up 
 where using those functions prevented bugs precisely because 
 the person writing the code got the sizes wrong (and the 
 compiler complained, since nativeToBigEndian and friends deal 
 with the sizes in a typesafe manner).

 - Jonathan M Davis

If I'm hoping to have my hash come out the same on both big-endian 
and little-endian machines but not send the results between 
machines, should I take these precautions? I want one machine to 
send the other a seed (in an endian safe way) and have both 
machines generate the same hashes.

Here's the relevant code:

uint coordHash(int x, int y, uint seed){
	seed = FNV1a((cast(ubyte*) &x)[0 .. x.sizeof], seed);
	return FNV1a((cast(ubyte*) &y)[0 .. y.sizeof], seed);
}
// Byte order matters for the below function
uint FNV1a(ubyte[] bytes, uint code){
	for(int iii = 0; iii < bytes.length; ++iii){
		code ^= bytes[iii];
		code *= FNV_PRIME_32;
	}
	return code;
}

Am I going to get the same outcome on all machines or would a 
byte array be divided up in reverse order to what I'd expect on 
some machines? If it is... I don't mind writing separate versions 
depending on endianness with 
version(BigEndian)/version(LittleEndian) to get around a runtime 
check... I'm just unsure of how endianness factors into the order 
of an array...

Jan 16 2016

Johannes Pfau <nospam example.com> writes:

On Sat, 16 Jan 2016 18:05:46 +0000,
Samson Smith <fsdf dsfd.com> wrote:

 On Saturday, 16 January 2016 at 16:28:21 UTC, Jonathan M Davis 
 wrote:
 But it will be less error-prone to use those functions, and if 
 you _do_ actually need to swap endianness, then they're exactly 
 what you should be using. We've had cases that have come up 
 where using those functions prevented bugs precisely because 
 the person writing the code got the sizes wrong (and the 
 compiler complained, since nativeToBigEndian and friends deal 
 with the sizes in a typesafe manner).

 - Jonathan M Davis  

 
 If I'm hoping to have my hash come out the same on both bigendian 
 and littleendian machines but not send the results between 
 machines, should I take these precautions? I want one machine to 
 send the other a seed (in an endian safe way) and have both 
 machines generate the same hashes.
 
 Here's the relevant code:
 
 uint coordHash(int x, int y, uint seed){
 	seed = FNV1a((cast(ubyte*) &x)[0 .. x.sizeof], seed);
 	return FNV1a((cast(ubyte*) &y)[0 .. y.sizeof], seed);
 }
 // Byte order matters for the below function
 uint FNV1a(ubyte[] bytes, uint code){
 	for(int iii = 0; iii < bytes.length; ++iii){
          	code ^= bytes[iii];
          	code *= FNV_PRIME_32;
 	}
 	return code;
 }
 
 Am I going to get the same outcome on all machines or would a 
 byte array be divided up in reverse order to what I'd expect on 
 some machines? If it is... I don't mind writing separate versions 
 depending on endianness with 
 version(BigEndian)/version(LittleEndian) to get around a runtime 
 check... I'm just unsure of how endianness factors into the order 
 of an array...

If you use the simple pointer cast you will end up with different byte
orders on little vs big endian machines. Endianness does not affect
array order in general:

ubyte[] myArray = [1, 2, 3, 4];
myArray[0] == 1, myArray[1] == 2, ...


This is the same on big vs little endian machines. Endianness does
affect the representation of (multi-byte)numbers:

int a = 42;
ubyte[4] b = *cast(ubyte[4]*)&a;

This will generate [42, 0, 0, 0] on little endian, [0, 0, 0, 42] on big
endian.


So if you want the same byte output for all architectures, just choose
either big or little endian (which one doesn't matter). Then convert
the values on the other architecture (e.g. if you choose little endian,
do nothing on little endian, swap bytes on big endian).


TL;DR: Just use nativeToBigEndian or nativeToLittleEndian from
std.bitmanip, these functions do the right thing. These functions do not
use runtime checks, they use version(Big/LittleEndian)
internally. nativeToBigEndian does not do anything on big endian
machines, nativeToLittleEndian doesn't do anything on little endian
machines.
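
Applied to the code from the question, this could look roughly like the sketch below. It is only one way to do it, under a few assumptions: little endian is picked arbitrarily as the fixed byte order, the function names are lower-cased variants of the originals, and the constants are the standard 32-bit FNV prime and offset basis (the original post does not show its value of FNV_PRIME_32).

```d
import std.bitmanip : nativeToLittleEndian;

enum uint FNV_PRIME_32 = 0x0100_0193;        // standard 32-bit FNV prime
enum uint FNV_OFFSET_BASIS_32 = 0x811C_9DC5; // standard 32-bit offset basis

// FNV-1a over a byte slice: xor each byte in, then multiply by the prime.
uint fnv1a(const(ubyte)[] bytes, uint code) pure nothrow @nogc @safe
{
    foreach (b; bytes)
    {
        code ^= b;
        code *= FNV_PRIME_32; // uint arithmetic wraps modulo 2^32
    }
    return code;
}

// Hashes a grid position. The coordinates are converted to a fixed
// (little endian) byte order first, so every machine feeds the same
// byte sequence to the hash. The ubyte[4] buffers live on the stack.
uint coordHash(int x, int y, uint seed)
{
    const ubyte[4] xb = nativeToLittleEndian(x);
    const ubyte[4] yb = nativeToLittleEndian(y);
    seed = fnv1a(xb[], seed);
    return fnv1a(yb[], seed);
}

unittest
{
    // Known FNV-1a test vector: the single byte 'a' hashes to 0xE40C292C.
    ubyte[1] a = [0x61];
    assert(fnv1a(a[], FNV_OFFSET_BASIS_32) == 0xE40C292C);

    // Same input, same seed, same result.
    assert(coordHash(3, -7, FNV_OFFSET_BASIS_32)
        == coordHash(3, -7, FNV_OFFSET_BASIS_32));
}
```

On a little endian machine `nativeToLittleEndian` has no byte swapping to do, so the cost compared to the pointer cast should be small there, though that depends on the optimizer.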

Jan 16 2016
