
digitalmars.D.learn - byte and short data types use cases

reply Murloc <justanotheraccount3212334 proton.me> writes:
Hi, I was interested why, for example, `byte` and `short` 
literals do not have their own unique suffixes (like `L` for 
`long` or `u` for `unsigned int` literals) and found the 
following explanation:

- "I guess short literal is not supported solely due to the fact 
that anything less than `int` will be "promoted" to `int` during 
evaluation. `int` has the most natural size. This is called 
integer promotion in C++."

Which raised another question: since objects of types smaller 
than `int` are promoted to `int` to use integer arithmetic on 
them anyway, is there any point in using anything of integer type 
less than `int` other than to limit the range of values that can 
be assigned to a variable at compile time? Are these data types 
there because of some historical reasons (maybe `byte` and/or 
`short` were "natural" for some architectures before)?

People say that there is no advantage for using `byte`/`short` 
type for integer objects over an int for a single variable, 
however, as they say, this is not true for arrays, where you can 
save some memory space by using `byte`/`short` instead of `int`. 
But isn't any further manipulations with these array objects will 
produce results of type `int` anyway? Don't you have to cast 
these objects over and over again after manipulating them to 
write them back into that array or for some other manipulations 
with these smaller types objects? Or is this only useful if 
you're storing some array of constants for reading purposes?

Some people say that these promoting and casting operations in 
summary may have an even slower overall effect than simply using 
int, so I'm kind of confused about the use cases of these data 
types... (I think that my misunderstanding comes from not knowing 
how things happen at a slightly lower level of abstractions, like 
which operations require memory allocation, which do not, etc. 
Maybe some resource recommendations on that?) Thanks!
Jun 09 2023
next sibling parent reply Cecil Ward <cecil cecilward.com> writes:
On Friday, 9 June 2023 at 11:24:38 UTC, Murloc wrote:
 Hi, I was interested why, for example, `byte` and `short` 
 literals do not have their own unique suffixes (like `L` for 
 `long` or `u` for `unsigned int` literals) and found the 
 following explanation:

 - "I guess short literal is not supported solely due to the 
 fact that anything less than `int` will be "promoted" to `int` 
 during evaluation. `int` has the most natural size. This is 
 called integer promotion in C++."

 Which raised another question: since objects of types smaller 
 than `int` are promoted to `int` to use integer arithmetic on 
 them anyway, is there any point in using anything of integer 
 type less than `int` other than to limit the range of values 
 that can be assigned to a variable at compile time? Are these 
 data types there because of some historical reasons (maybe 
 `byte` and/or `short` were "natural" for some architectures 
 before)?

 People say that there is no advantage for using `byte`/`short` 
 type for integer objects over an int for a single variable, 
 however, as they say, this is not true for arrays, where you 
 can save some memory space by using `byte`/`short` instead of 
 `int`. But isn't any further manipulations with these array 
 objects will produce results of type `int` anyway? Don't you 
 have to cast these objects over and over again after 
 manipulating them to write them back into that array or for 
 some other manipulations with these smaller types objects? Or 
 is this only useful if you're storing some array of constants 
 for reading purposes?

 Some people say that these promoting and casting operations in 
 summary may have an even slower overall effect than simply 
 using int, so I'm kind of confused about the use cases of these 
 data types... (I think that my misunderstanding comes from not 
 knowing how things happen at a slightly lower level of 
 abstractions, like which operations require memory allocation, 
 which do not, etc. Maybe some resource recommendations on 
 that?) Thanks!
For me there are two use cases for byte and short, ubyte and ushort.

The first is simply to save memory in a large array, or to fit neatly into a ‘hole’ in a struct, say next to a bool, which is also one byte. If you have four ubyte variables in a struct and then an array of those structs, you are getting optimal memory usage. On x86, for example, the cast from ubyte to uint uses instructions that have zero added cost compared to a normal uint fetch, and casting down to ubyte generates no code at all, so the total cost of the casting is zero.

The second use case is where you need to interface to external specifications that demand uint8_t (ubyte) or uint16_t (ushort), where I am using the standard definitions from std.stdint; these are the same types C gets from stdint.h. One example is interfacing to externally defined structs, in data structures in RAM or in messages. Another is interfacing to machine code that has registers or operands of 8-bit or 16-bit types. I like to use the stdint names for documentation purposes, as they ram home the point that these are truly fixed-width types and cannot change. (And I do know that in D, unlike C, int, long and so on have defined, fixed widths; C lacks those guarantees, which is why stdint.h is needed in C in the first place.) Besides machine code, we could add other high-level languages whose interfaces are defined in the other language, where you have to hope that the other language's type widths don't change.
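A rough sketch of the first use case (the struct name and fields here are made up for illustration): four ubyte fields pack into four bytes with no padding, and the widening and narrowing casts compile down to (near-)free operations:

```d
struct Rgba
{
    ubyte r, g, b, a; // four 1-byte fields, no padding needed
}
static assert(Rgba.sizeof == 4); // vs 16 bytes for four uints

void main()
{
    Rgba[1024] pixels;              // 4 KiB rather than 16 KiB
    uint wide = pixels[0].r;        // widening: a zero-extending load
    pixels[0].r = cast(ubyte) wide; // narrowing: just stores the low byte
}
```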
Jun 09 2023
parent reply Murloc <justanotheraccount3212334 proton.me> writes:
On Friday, 9 June 2023 at 12:56:20 UTC, Cecil Ward wrote:
 On Friday, 9 June 2023 at 11:24:38 UTC, Murloc wrote:

 If you have four ubyte variables in a struct and then
 an array of them, then you are getting optimal memory usage.
Is this some kind of property? Where can I read more about this?

So you can optimize memory usage by using arrays of things smaller than `int` if those are enough for your purposes. But what about using them instead of single variables, for example as an iterator in a loop, if the range of such a data type is enough for me? Are there any advantages to doing that?
Jun 09 2023
next sibling parent reply Basile B. <b2.temp gmx.com> writes:
On Friday, 9 June 2023 at 15:07:54 UTC, Murloc wrote:
 On Friday, 9 June 2023 at 12:56:20 UTC, Cecil Ward wrote:
 On Friday, 9 June 2023 at 11:24:38 UTC, Murloc wrote:

 If you have four ubyte variables in a struct and then
 an array of them, then you are getting optimal memory usage.
Is this some kind of property? Where can I read more about this?
Yes, a classic resource is http://www.catb.org/esr/structure-packing/
 So you can optimize memory usage by using arrays of things 
 smaller than `int` if these are enough for your purposes,
It's not only for arrays, it's also for struct members:

```d
struct S1
{
    ubyte a; // offs 0
    ulong b; // offs 8
    ubyte c; // offs 16
}

struct S2
{
    ubyte a; // offs 0
    ubyte c; // offs 1
    ulong b; // offs 8
}

static assert(S1.sizeof > S2.sizeof); // 24 vs 16
```

This is because you can't do unaligned reads for `b`, but you can for `a` and `c`.
 but what about using these instead of single variables, for 
 example as an iterator in a loop, if range of such a data type 
 is enough for me? Is there any advantages on doing that?
Not really. The loop variable takes a marginal part of the stack space in the current function. You can just use `auto` and let the compiler choose the best type.
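To illustrate with a small sketch: `foreach` over an array already gives the index its natural type, so there is nothing to gain by forcing a narrower one:

```d
void main()
{
    int[100] a;
    foreach (i, ref x; a) // the index type is inferred...
    {
        static assert(is(typeof(i) == size_t)); // ...as size_t
        x = cast(int) i;
    }
    assert(a[99] == 99);
}
```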
Jun 09 2023
parent Salih Dincer <salihdb hotmail.com> writes:
On Friday, 9 June 2023 at 23:51:07 UTC, Basile B. wrote:
 Yes, a classsic resource is 
 http://www.catb.org/esr/structure-packing/

 So you can optimize memory usage by using arrays of things 
 smaller than `int` if these are enough for your purposes,
So, is the ordering correct in a structure like the one below, with partial overlap?

```d
struct DATA
{
    union
    {
        ulong bits;
        ubyte[size] cell;
    }
    enum size = 5;

    bool last;
    alias last this;

    size_t length, limit, index = ulong.sizeof;

    bool empty()
    {
        return index / ulong.sizeof >= limit;
    }

    ubyte[] data;

    ubyte front()
    {
        //..
```

This code snippet is from an actual working project of mine. What it does is process 40 bits of data.

SDB 79
Jun 10 2023
prev sibling next sibling parent =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 6/9/23 08:07, Murloc wrote:

 Where can I read more about this?
I had written something related:

http://ddili.org/ders/d.en/memory.html#ix_memory..offsetof

.offsetof appears at that point. The printObjectLayout() function example there attempts to visualize the layout of the members of a struct.

Ali
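A minimal sketch in the same spirit, using .offsetof directly (the offsets below assume a 64-bit target where ulong is 8-byte aligned):

```d
struct S
{
    ubyte a; // offset 0, followed by 7 bytes of padding
    ulong b; // offset 8 (8-byte aligned)
    ubyte c; // offset 16, followed by 7 bytes of tail padding
}

static assert(S.a.offsetof == 0);
static assert(S.b.offsetof == 8);
static assert(S.c.offsetof == 16);
static assert(S.sizeof == 24); // tail padding keeps arrays of S aligned
```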
Jun 09 2023
prev sibling next sibling parent Cecil Ward <cecil cecilward.com> writes:
On Friday, 9 June 2023 at 15:07:54 UTC, Murloc wrote:
 On Friday, 9 June 2023 at 12:56:20 UTC, Cecil Ward wrote:
 On Friday, 9 June 2023 at 11:24:38 UTC, Murloc wrote:

 If you have four ubyte variables in a struct and then
 an array of them, then you are getting optimal memory usage.
Is this some kind of property? Where can I read more about this? So you can optimize memory usage by using arrays of things smaller than `int` if these are enough for your purposes, but what about using these instead of single variables, for example as an iterator in a loop, if range of such a data type is enough for me? Is there any advantages on doing that?
Read up on ‘structs’ and the ‘align’ attribute in the main D docs on this website. Using smaller fields in a struct that is in memory saves RAM if there is an array of such structs.

Even where there is only one struct, say you are returning a struct by value from some function. If the struct is fairly small in total, and the compiler is good (ldc or gdc, not dmd; see godbolt.org), then the returned struct can sometimes fit into registers rather than being placed in RAM when it is returned to the function's caller. Yesterday I returned a struct containing four uint32_t fields from a function and it came back to the caller in two 64-bit registers, not in RAM. Clearly, using smaller fields where possible might keep the whole struct under the size limit for being returned in registers.

As for your question about single variables: the answer is very definitely no. Rather, the opposite: always use the primary, CPU-‘natural’ types, the widths that are most natural to the processor in question. 64-bit CPUs will sometimes favour 32-bit types, an example being x86-64/AMD64, where code handling 32-bit ints generates less code (saving bytes in the code segment), but the speed and number of instructions are the same whether you are dealing with 32- or 64-bit types.

Always use size_t for index variables into arrays or for the size of anything in bytes, never int or uint. On a 64-bit machine such as x86-64, size_t is 64-bit, not 32. By using int/uint where you should have used size_t, you could in theory get a very rare bug when dealing with e.g. file sizes or vast amounts of (virtual) memory bigger than 2 GB (the int limit) or 4 GB (the uint limit), when the 32-bit types overflow. There is also ptrdiff_t, which is 64-bit on a 64-bit CPU; it is probably not worth bothering with, as its raison d'être was historical (the early-80s 80286 segmented architecture, before the 32-bit 386 blew it away).
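A small sketch of the size_t point: an array's .length is already a size_t, so matching the index type avoids any truncation in between:

```d
void main()
{
    auto data = new ubyte[](1000);
    static assert(is(typeof(data.length) == size_t));

    for (size_t i = 0; i < data.length; ++i)
        data[i] = cast(ubyte)(i & 0xFF); // store just the low byte

    assert(data[255] == 255);
    assert(data[256] == 0); // 256 & 0xFF wraps back to 0
}
```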
Jun 09 2023
prev sibling parent reply Cecil Ward <cecil cecilward.com> writes:
On Friday, 9 June 2023 at 15:07:54 UTC, Murloc wrote:
 On Friday, 9 June 2023 at 12:56:20 UTC, Cecil Ward wrote:
 On Friday, 9 June 2023 at 11:24:38 UTC, Murloc wrote:

 If you have four ubyte variables in a struct and then
 an array of them, then you are getting optimal memory usage.
Is this some kind of property? Where can I read more about this? So you can optimize memory usage by using arrays of things smaller than `int` if these are enough for your purposes, but what about using these instead of single variables, for example as an iterator in a loop, if range of such a data type is enough for me? Is there any advantages on doing that?
A couple of other important use cases came to me.

The first is Unicode, which has three main representations: UTF-8, a stream of bytes in which each character can be several bytes; UTF-16, where a character is one or, rarely, two 16-bit words; and UTF-32, a stream of 32-bit words, one per character. The simplicity of the latter is a huge deal for speed, but UTF-32 takes up almost four times as much memory as UTF-8 for Western European languages like English or French. The four-to-one ratio means the processor has to pull in four times the amount of memory, so that's a slowdown; on the other hand it is processing the same number of characters whichever way you look at it, and in UTF-8 the CPU has to parse more bytes than characters unless the text is entirely ASCII-like.

The second use case is SIMD. Intel and AMD x86 machines have vector arithmetic units that are 16, 32 or 64 bytes wide, depending on how recent the model is. Take, for example, a post-2013 Intel Haswell CPU, which has 32-byte-wide units: if you choose smaller-width data types you can fit more values into the vector unit, and fitting in twice as many half-width integers or floating-point numbers means you can process twice as many in one instruction. On our Haswell that means four doubles or four quad words, or eight 32-bit floats or uint32_ts, and a similar doubling again for uint16_t. So here width economy translates directly into speed.
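The size trade-off is easy to see with D's three string types (a small sketch; the sample text is arbitrary):

```d
void main()
{
    // The same five-character text in all three encodings:
    string  s8  = "héllo"; // UTF-8:  char  = 1 byte per code unit
    wstring s16 = "héllo"; // UTF-16: wchar = 2 bytes per code unit
    dstring s32 = "héllo"; // UTF-32: dchar = 4 bytes per code unit

    assert(s8.length  == 6); // 'é' needs two bytes in UTF-8
    assert(s16.length == 5); // one 16-bit unit per character here
    assert(s32.length == 5); // always one 32-bit unit per character
}
```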
Jun 10 2023
next sibling parent Cecil Ward <cecil cecilward.com> writes:
On Saturday, 10 June 2023 at 21:58:12 UTC, Cecil Ward wrote:
 On Friday, 9 June 2023 at 15:07:54 UTC, Murloc wrote:
 On Friday, 9 June 2023 at 12:56:20 UTC, Cecil Ward wrote:
 [...]
Is this some kind of property? Where can I read more about this?
My last example is comms. Protocol headers need economical, narrow data types for efficiency: it's all about packing as much user data as possible into each packet, and fatter, longer headers reduce the amount of user data, as the total has a hard limit on it. A pair of headers totalling 40 bytes in IPv4+TCP takes up nearly 3% of the total length allowed, so that's a ~3% speed loss, as the headers are just dead weight. So here narrow types help comms speed.
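As a sketch of what such a header looks like as a D struct (the field layout is simplified from the real IPv4 header; `align(1)` stops the compiler from inserting any padding):

```d
align(1) struct Ipv4Header
{
align(1):
    ubyte  versionIhl;    // version + header length
    ubyte  tos;
    ushort totalLength;
    ushort identification;
    ushort flagsFragment;
    ubyte  ttl;
    ubyte  protocol;
    ushort checksum;
    uint   srcAddr;
    uint   dstAddr;
}

// The classic 20-byte IPv4 header: every field as narrow as it can be.
static assert(Ipv4Header.sizeof == 20);
```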
Jun 10 2023
prev sibling parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Sat, Jun 10, 2023 at 09:58:12PM +0000, Cecil Ward via Digitalmars-d-learn
wrote:
 On Friday, 9 June 2023 at 15:07:54 UTC, Murloc wrote:
[...]
 So you can optimize memory usage by using arrays of things smaller
 than `int` if these are enough for your purposes, but what about
 using these instead of single variables, for example as an iterator
 in a loop, if range of such a data type is enough for me? Is there
 any advantages on doing that?
 A couple of other important use-cases came to me. The first one is
 unicode which has three main representations, utf-8 which is a stream
 of bytes each character can be several bytes, utf-16 where a character
 can be one or rarely two 16-bit words, and utf32 - a stream of 32-bit
 words, one per character. The simplicity of the latter is a huge deal
 in speed efficiency, but utf32 takes up almost four times as memory as
 utf-8 for western european languages like english or french. The
 four-to-one ratio means that the processor has to pull in four times
 the amount of memory so that's a slowdown, but on the other hand it is
 processing the same amount of characters whichever way you look at it,
 and in utf8 the cpu is having to parse more bytes than characters
 unless the text is entirely ASCII-like.
[...]

On contemporary machines, the CPU is so fast that memory access is a much bigger bottleneck than processing speed. So unless an operation is being run hundreds of thousands of times, you're not likely to notice the difference. OTOH, accessing memory is slow (that's why the memory cache hierarchy exists).

So utf8 is actually advantageous here: it fits in a smaller space, so it's faster to fetch from memory, and more of it can fit in the CPU cache, so fewer DRAM roundtrips are needed. Which is faster. Yes, you need extra processing because of the variable-width encoding, but it happens mostly inside the CPU, which is fast enough that it generally outstrips the memory roundtrip overhead.

So unless you're doing something *really* complex with the utf8 data, it's an overall win in terms of performance. The CPU gets to do what it's good at -- running complex code -- and the memory cache gets to do what it's good at: minimizing the number of slow DRAM roundtrips.

T

--
It said to install Windows 2000 or better, so I installed Linux instead.
Jun 10 2023
parent Cecil Ward <cecil cecilward.com> writes:
On Sunday, 11 June 2023 at 00:05:52 UTC, H. S. Teoh wrote:
 On Sat, Jun 10, 2023 at 09:58:12PM +0000, Cecil Ward via 
 Digitalmars-d-learn wrote:
 On Friday, 9 June 2023 at 15:07:54 UTC,
 [...]

 On contemporary machines, the CPU is so fast that memory access 
 is a much bigger bottleneck than processing speed. So unless an 
 operation is being run hundreds of thousands of times, you're 
 not likely to notice the difference. OTOH, accessing memory is 
 slow (that's why the memory cache hierarchy exists). So utf8 is 
 actually advantageous here: it fits in a smaller space, so it's 
 faster to fetch from memory; more of it can fit in the CPU 
 cache, so less DRAM roundtrips are needed. Which is faster.  
 Yes you need extra processing because of the variable-width 
 encoding, but it happens mostly inside the CPU, which is fast 
 enough that it generally outstrips the memory roundtrip 
 overhead. So unless you're doing something *really* complex 
 with the utf8 data, it's an overall win in terms of 
 performance. The CPU gets to do what it's good at -- running 
 complex code -- and the memory cache gets to do what it's good 
 at: minimizing the amount of slow DRAM roundtrips.
I completely agree with H. S. Teoh; that is exactly what I was going to say. The point is that considerations like this have to be thought through carefully, and the width of types really does matter in the cases brought up. But outside those cases, as I said earlier, stick to uint, size_t and ulong, or uint32_t and uint64_t if exact size is vital; but do also check out the other std.stdint types, as very occasionally they are needed.
Jun 10 2023
prev sibling parent "H. S. Teoh" <hsteoh qfbox.info> writes:
On Fri, Jun 09, 2023 at 11:24:38AM +0000, Murloc via Digitalmars-d-learn wrote:
[...]
 Which raised another question: since objects of types smaller than
 `int` are promoted to `int` to use integer arithmetic on them anyway,
 is there any point in using anything of integer type less than `int`
 other than to limit the range of values that can be assigned to a
 variable at compile time?
Not just at compile time, at runtime they will also be fixed to that width (mapped to a hardware register of that size) and will not be able to contain a larger value. [...]
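A small sketch of both halves of that: arithmetic promotes to int, but the variable itself stays pinned to its declared width:

```d
void main()
{
    byte a = 10, b = 20;
    auto sum = a + b;
    static assert(is(typeof(sum) == int)); // narrow operands promote to int

    byte c = byte.max; // 127
    ++c;               // the storage is still 8 bits wide, so it wraps
    assert(c == byte.min); // -128
}
```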
 People say that there is no advantage for using `byte`/`short` type
 for integer objects over an int for a single variable, however, as
 they say, this is not true for arrays, where you can save some memory
 space by using `byte`/`short` instead of `int`.
That's correct.
 But isn't any further manipulations with these array objects will
 produce results of type `int` anyway? Don't you have to cast these
 objects over and over again after manipulating them to write them back
 into that array or for some other manipulations with these smaller
 types objects?
Yes, you will have to cast them back. Casting often translates to a no-op or just a single instruction in the machine code: you write part of a 32-bit register back to memory instead of the whole thing, and this automatically truncates the value to the narrow int.

The general advice is: perform computations with int or wider, then truncate when writing back to storage, for storage efficiency. So generally you wouldn't cast the value to short/byte until the very end, when you're about to store the final result back into the array. At that point you'd probably also want to do a range check to catch any potential overflows.
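A sketch of that advice, using std.conv.to for the checked narrowing at the end (the data and the scale factor here are arbitrary):

```d
import std.conv : to;

void main()
{
    ubyte[4] samples = [100, 200, 50, 25];

    foreach (ref s; samples)
    {
        int scaled = s * 2;      // the arithmetic happens at int width
        if (scaled <= ubyte.max)
            s = scaled.to!ubyte; // range-checked narrowing at the end;
                                 // cast(ubyte) would silently truncate
    }

    assert(samples[0] == 200 && samples[1] == 200); // 200*2 overflowed, left as-is
    assert(samples[2] == 100 && samples[3] == 50);
}
```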
 Some people say that these promoting and casting operations in summary
 may have an even slower overall effect than simply using int, so I'm
 kind of confused about the use cases of these data types... (I think
 that my misunderstanding comes from not knowing how things happen at a
 slightly lower level of abstractions, like which operations require
 memory allocation, which do not, etc. Maybe some resource
 recommendations on that?) Thanks!
I highly recommend taking an introductory course in assembly language, or finding a book or online tutorial on the subject. Understanding how the machine actually works under the hood will help answer a lot of these questions, even if you never actually write a single line of assembly code.

But in a nutshell: integer data types do not allocate unless you explicitly ask for it (e.g. `int* p = new int;` -- but you almost never want to do this). They are held in machine registers or stored on the runtime stack, and always occupy a fixed size, so almost no memory management is needed for them. (Which is also why they're preferred when you don't need anything fancier: they're also super-fast.)

Promoting a narrow int takes at most 1 machine instruction, or, in the case of unsigned values, sometimes zero instructions. Casting back to a narrow int is often a no-op (the subsequent code just ignores the upper bits). The performance difference is negligible, unless you're doing expensive things like range checking after every operation (generally you don't need to; it's usually sufficient to check the range at the end of a computation rather than at every intermediate step -- unless you have reason to believe an intermediate step is liable to overflow or wrap around).

T

--
People who are more than casually interested in computers should have at least some idea of what the underlying hardware is like. Otherwise the programs they write will be pretty weird. -- D. Knuth
Jun 09 2023