
digitalmars.D.learn - How do I choose the correct primitive?

"Jake Thomas" <jake fake.com> writes:
First, let me say that I am <i>extremely</i> enthused about D. I 
did research on it last year for a project and absolutely fell in 
love with it. But praise should go in another thread...


My question comes down to:
"Does dmd pack non-array primitive variables in memory such that 
they are touching, or are they zero-padded out to the computer's 
native word size?"

I have a fun "little" project I work on when I have time (for 
which D is ridiculously perfect, BTW), and right now I am "merely" 
listing the prototypes of functions that will comprise its API.


On my first go-through of the function prototypes, I thoughtfully 
figured out the smallest primitives I could safely use for inputs 
and outputs. Obviously, when it comes to programming, I'm a 
little OCD - who cares about memory to that degree anymore when 
we have gigabytes of RAM? This might not even come into play on 
the Raspberry Pi.

I also figured that choosing a safe minimum would make the code 
more self-documenting by cueing the reader in to what the expected 
value range for the variable is.

Then I took Architecture & Assembly class. There I learned that 
the load instruction grabs an entire native word size, every 
time, regardless of how many bits your variable takes up.

When we programmed in assembly in that class, for both 
performance and coding ease, we only worked with variables that 
were the native word size.


I found out that it's actually extra work for the processor to 
use values smaller than the native word size: it has to AND off 
the unwanted bits and possibly shift them over.


So, if dmd packs variables together, I would want to always use 
the native word size to avoid that extra work, and I would never 
want to use shorts, ints, or longs. Instead, I'd want to do this:
<code>
version (X86)
{
   alias int native; //ints are 32-bit, the native size in this case.
   alias uint unative;
}

version (X86_64)
{
   alias long native; //longs are 64-bit, the native size in this case.
   alias ulong unative;
}
</code>

And then only use natives, unatives, and booleans (can't avoid 
them) for my primitives.

I really hope this isn't the case because it would make D's 
entire primitive system pointless. In academia, C is often 
scolded for its ints always being the native word size, while 
Java is praised for being consistent from platform to platform. 
But if dmd packs its variables, D is the one that should be 
scolded and C is the one that should be praised, for the same 
reason in reverse.


If, however, dmd always zero-pads its variables so that each load 
instruction only grabs the desired value with no need of extra 
work, I would never have to worry about whether my variable is 
the native word size.

However, this knowledge would still affect my programming:

If I know my code will only ever be compiled for 32-bit machines 
and up, I should never use shorts. Doing so would always waste at 
least 16 bits per short. Even if I think I will never overflow a 
short, why not just take the whole 32 bits? They're allocated for 
the variable anyway; not using those bits would be wasteful.

Also, if I know I don't need any more than 32 bits for a variable, 
I should use an int, never a long. That way, the processor does 
not have to do extra work on a 32-bit machine or a 64-bit machine 
or any wider one. If I always default to longs like a "good 
academically trained computer scientist fighting crusades against 
hard caps", 32-bit machines (and 64-bit machines still running 
32-bit OSes!!!) would have to do extra work to operate on 64-bit 
values split across two native words.

And lastly, if I absolutely must have more than 32 bits for a 
single value, I have no choice but to use a long.


So, I need to have this question answered to even get past the 
function prototype stage - each answer would result in different 
code.

Thank you very much,
I love D,
Jake
Dec 31 2013
"Rikki Cattermole" <alphaglosined gmail.com> writes:
We have size_t defined as uint on 32-bit and ulong on 64-bit. ptrdiff_t is the signed counterpart (int/long).

I don't know how dmd handles it, although you do have the ability to align variables.

You may want to consider gdc or ldc rather than dmd, as they have better optimization.
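
To make the size_t/ptrdiff_t point concrete, here is a minimal sketch. It assumes an ordinary 32-bit or 64-bit target; the alias names native and unative are the hypothetical ones proposed earlier in this thread, not something D provides.

```d
import std.stdio;

// size_t is unsigned and pointer-sized; ptrdiff_t is its signed counterpart.
static assert(size_t.sizeof == (void*).sizeof);
static assert(ptrdiff_t.sizeof == size_t.sizeof);
static assert(size_t.min == 0);   // unsigned
static assert(ptrdiff_t.min < 0); // signed

// Hypothetical aliases, named after the ones suggested in the thread.
alias unative = size_t;
alias native = ptrdiff_t;

void main()
{
    // Prints the underlying type and width for the current target.
    writeln("unative is ", unative.stringof, ", ", unative.sizeof * 8, " bits");
    writeln("native is ", native.stringof, ", ", native.sizeof * 8, " bits");
}
```

With these aliases no version blocks are needed, since the compiler already picks the pointer-sized types per target. Whether pointer-sized integers are actually the fastest choice is a separate question.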
Dec 31 2013
"Jake Thomas" <jake fake.com> writes:
 We have size_t defined as uint on 32bit and ulong on 64bit. 
 ptrdiff_t for int/long.
 I don't know how dmd handles it, although you do have the 
 ability to align variables.
 You may want to consider gdc or ldc more than dmd as they have 
 better optimization.
Sorry for the delayed response - distractions.

Thanks for the tips! I didn't know about aligning or ptrdiff_t.

I looked into and experimented with "align". At least with dmd, it seems to only work in structs. Looking at http://dlang.org/attribute.html, one might conclude that this is a bug in dmd - the attributes web page doesn't say anywhere that align should be limited to work only within structs. No syntax error is thrown by the compiler if align is used outside a struct, but the semantics aren't there to back it up.

Watch this:

[code]
import std.stdio;

int main()
{
   align(8) int testInt1;
   align(8) int testInt2;
   align(8) int testInt3;

   int* testInt1Locale = &testInt1;
   int* testInt2Locale = &testInt2;
   int* testInt3Locale = &testInt3;

   long testLong1;
   long testLong2;
   long testLong3;

   long* testLong1Locale = &testLong1;
   long* testLong2Locale = &testLong2;
   long* testLong3Locale = &testLong3;

   struct AlignTest
   {
      align (8):
         int intOne;
         int intTwo;
         int intThree;
   }

   AlignTest alignTest;

   int* alignTestIntOneLocale = &alignTest.intOne;
   int* alignTestIntTwoLocale = &alignTest.intTwo;
   int* alignTestIntThreeLocale = &alignTest.intThree;

   writeln(cast(long)testInt2Locale - cast(long)testInt1Locale);
   writeln(cast(long)testInt3Locale - cast(long)testInt2Locale);
   writeln(cast(long)testLong2Locale - cast(long)testLong1Locale);
   writeln(cast(long)testLong3Locale - cast(long)testLong2Locale);
   writeln(cast(long)alignTestIntTwoLocale - cast(long)alignTestIntOneLocale);
   writeln(cast(long)alignTestIntThreeLocale - cast(long)alignTestIntTwoLocale);

   return 0;
}
[/code]

The above code, compiled with dmd for and on 64-bit Linux, prints the following:

[quote]
4 //Ints 1 & 2 are 4 bytes apart, despite specifying them to be 8 bytes apart.
4 //Ints 2 & 3 are 4 bytes apart, despite specifying them to be 8 bytes apart.
8 //Longs 1 & 2 are 8 bytes apart.
8 //Longs 2 & 3 are 8 bytes apart.
8 //Struct's ints 1 & 2 are 8 bytes apart - specified by align attribute.
8 //Struct's ints 2 & 3 are 8 bytes apart - specified by align attribute.
[/quote]

So, short-term, it seems like one would want to use my "native/unative" technique. But long-term, hopefully not only does this get fixed, but the default behavior for the compiler will be to pad things out to the native word size without having to specify alignment.

Note: I tried "auto" - they come out as ints on my 64-bit architecture, still separated by 4 bytes.

Even with size_t and ptrdiff_t, I would still want to make my own aliases "native" and "unative", because size_t is intended for array indexes (from what I saw when I briefly read about them) and "native" and "unative" sound more generic and readable - to me at least - and follow the "u" naming convention.

Stay efficient,
Jake
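
The struct half of that experiment can also be checked at compile time instead of by subtracting addresses. The following is a minimal sketch using the .offsetof property; the struct name Packed is made up for comparison, and AlignTest mirrors the struct in the post above.

```d
import std.stdio;

// Hypothetical comparison struct: default alignment, ints sit 4 bytes apart.
struct Packed
{
    int intOne;
    int intTwo;
    int intThree;
}

// Same field layout as the AlignTest struct in the post above.
struct AlignTest
{
align (8):
    int intOne;
    int intTwo;
    int intThree;
}

// Field offsets are known at compile time, so no pointer arithmetic is needed.
static assert(Packed.intTwo.offsetof == 4);
static assert(Packed.intThree.offsetof == 8);
static assert(AlignTest.intTwo.offsetof == 8);
static assert(AlignTest.intThree.offsetof == 16);

void main()
{
    writeln("Packed.sizeof = ", Packed.sizeof);
    writeln("AlignTest.sizeof = ", AlignTest.sizeof);
}
```

This only covers struct fields. It says nothing about how a compiler places local variables on the stack, which is the part the experiment above found to ignore align.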
Jan 03 2014
"TheFlyingFiddle" <kurtyan student.chalmers.se> writes:
On Saturday, 4 January 2014 at 03:14:35 UTC, Jake Thomas wrote:
 So, short-term, it seems like one would want to use my 
 "native/unative" technique. But long-term, hopefully not only 
 does this get fixed, but the default behavior for the compiler 
 be to pad things out to the native word size without having to 
 specify alignment.
According to this (http://msdn.microsoft.com/en-us/library/windows/hardware/ff5 1499(v=vs.85).aspx), 32-bit registers are automatically zero-extended on the x64 architecture, while 16-bit and 8-bit registers are not. So 32-bit integers are fast. And since they are smaller than 64 bits, they should benefit from better cache locality. So you might actually slow things down by using your (u)native sizes.

Andrei has a nice talk about optimizations that covers this a bit (http://shelby.tv/video/vimeo/55639112/facebook-nyc-tech-talk-andrei-alexandrescu-three-optimization-tips-for-c)
How do you folks decide if a number should be an int, long, or 
even short?
It depends on what the number is for. In general I stick to (u)int or size_t, but when a value has some special significance, for example the red color channel or a network port, I use the corresponding type to signal the bounded range of the value.

That being said, I try to minimize the size of my structs as much as possible. (Keep in mind that when doing this you should order the fields by alignment size.) I tend to use ushorts as indices when I know that an array will never be larger than ushort.max. But this is only if I use 2 of them. There is no point in having 1 short if the struct alignment is 4, since the other 2 bytes will only be wasted as padding anyway.
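
The remark about using two ushorts rather than one can be sketched as follows. The struct and field names are made up for illustration; the only assumption is that int has 4-byte alignment, as it does on common x86 and ARM targets.

```d
import std.stdio;

// Hypothetical structs; the int field forces 4-byte struct alignment.
struct OneIndex
{
    int value;
    ushort index;   // 2 bytes of data, then 2 bytes of padding
}

struct TwoIndices
{
    int value;
    ushort first;   // the two ushorts share the space that
    ushort second;  // was padding in OneIndex
}

struct WideIndex
{
    int value;
    uint index;     // a full uint costs no more than one ushort here
}

static assert(OneIndex.sizeof == 8);
static assert(TwoIndices.sizeof == 8);
static assert(WideIndex.sizeof == 8);

void main()
{
    writeln(OneIndex.sizeof, " ", TwoIndices.sizeof, " ", WideIndex.sizeof);
}
```

A single ushort next to an int saves nothing, while a second ushort comes for free.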
I guess the exact type of variables should remain up in the air 
until the whole thing is implemented and tested using different 
types?
Yes very much so.
Jan 04 2014
"Jake Thomas" <jake fake.com> writes:
 According to this 
 (http://msdn.microsoft.com/en-us/library/windows/hardware/ff5
1499(v=vs.85).aspx) 
 32-bit registers are automatically zero extended on x64 
 architecture while 16-bit and 8-bit registers are not.
"Operations that output to a 32-bit subregister are automatically zero-extended to the entire 64-bit register. Operations that output to 8-bit or 16-bit subregisters are not zero-extended (this is compatible x86 behavior)."

Hmmm. Sounds like 32-bit compatibility mode stuff. I'm concerned that a 64-bit binary output by dmd would contain no 32-bit mode instructions; therefore, although a load instruction exists that would automatically zero-pad, I'm not sure dmd uses it.

The file command (in Linux) tells me that my output binary is a 64-bit binary, not a 32-bit/64-bit hybrid binary. Then again, maybe it's not programmed to check for hybrid binaries and just says "64-bit" if it is 64-bit overall.

To try and check for myself, I ran the binary through objdump. I found the "LEA" instruction. I don't know if it's a 32-bit load instruction or a 64-bit load instruction.

But _wow_ was I shocked at how many lines of assembly were generated! I tried the following:

int main()
{
   int loadMe = void;
   return loadMe;
}

And got 86,421 lines of assembly!! I expected a load instruction to load whatever was at loadMe's location into r0 (the return register) and not much else. Maybe 10 lines - tops - due to compiler fluffiness. I got about 8,641 times that - over 3 more orders of magnitude. What is going on here?
I guess the exact type of variables should remain up in the air 
until the whole thing is implemented and tested using different 
types?
Yes very much so.
Ah ha. Thanks for the tip. I see that this makes one's effort to lay down an API/library/module specification a little interesting. Can't publish your chickens until they hatch.
Jan 04 2014
"TheFlyingFiddle" <kurtyan student.chalmers.se> writes:
On Sunday, 5 January 2014 at 06:31:38 UTC, Jake Thomas wrote:
 And got 86,421 lines of assembly!! I expected a load 
 instruction to load whatever was at loadMe's location into r0 
 (the return register) and not much else. Maybe 10 lines - tops 
 - due to compiler fluffiness. I got about 8,641 times that - 
 over 3 more orders of magnitude. What is going on here?
Well, the compiler pulls in at minimum the entire D runtime, if I'm not mistaken, which makes the standard .exe about 350 KB. Things like Object.factory also pull in their fair share, due to not being able to remove classes. So we get a lot of fluff in small programs. The module layout of the standard library is also a problem: there are a lot of interconnections between the different modules in Phobos. (This will hopefully be better when the modules are broken down into submodules.)

I tested your test program on Windows x64 and got the following result:

 mov         ebp,esp
 sub         rsp,10h
 mov         eax,dword ptr [rbp-8]
 lea         rsp,[rbp]
 pop         rbp
 ret

//This does a 32 bit load into the eax register (return register) from the
//stack.
 mov         eax,dword ptr [rbp-8]

//I also ran this to see if there was any difference
int main()
{
   int loadMe = 10;
   return loadMe;
}

--Resulting main function
 mov         ebp,esp
 sub         rsp,10h
 mov         eax,0Ah
 mov         dword ptr [rbp-8],eax
 lea         rsp,[rbp]
 pop         rbp
 ret

//Loads 10, the value to be returned, and
//then stores that value on the stack.
//While this is not really necessary, I ran
//the code in debug mode, so it does not
//remove most useless instructions.
 mov         eax,0Ah
 mov         dword ptr [rbp-8],eax

//In optimized mode it is gone
 push        rbp
 mov         rbp,rsp
 mov         eax,0Ah
 pop         rbp
 ret

So it looks like dmd does 32-bit loads, at least on Windows x64.
Jan 05 2014
"Jake Thomas" <jake fake.com> writes:
On Sunday, 5 January 2014 at 08:23:45 UTC, TheFlyingFiddle wrote:
 On Sunday, 5 January 2014 at 06:31:38 UTC, Jake Thomas wrote:
 And got 86,421 lines of assembly!! I expected a load 
 instruction to load whatever was at loadMe's location into r0 
 (the return register) and not much else. Maybe 10 lines - tops 
 - due to compiler fluffiness. I got about 8,641 times that - 
 over 3 more orders of magnitude. What is going on here?
Well the compiler pulls in at minimum the entire D runtime if i'm not mistaken which make the standard .exe about 350kb.
Ah. Thank you for the explanation.
 Things like Object.factory also pulls in it's fair share due to 
 not being able to remove classes. So we get alot of fluff in 
 small programs.
What do you mean by not being able to remove classes? Isn't the whole point of offering a language that has both structs, which can have functions, and classes to do away with classes when inheritance isn't needed?
 The module layout of the standard library is also a problem, 
 there is a lot of interconnections between the different 
 modules in Phobos. (will hopefully be better when the modules 
 are broken down into submodules)
I'm a big fan of 99% of D's specification, perhaps less a fan of its current implementation. But implementations hopefully change over time for the better. The hope is to one day simply re-compile the same source with a better compiler.
 I tested your test program on windows x64 and got the following 
 result:

 mov         ebp,esp
 sub         rsp,10h
 mov         eax,dword ptr [rbp-8]
 lea         rsp,[rbp]
 pop         rbp
 ret

 //This does a 32 bit load into the eax register (return 
 //register) from the stack.
 mov         eax,dword ptr [rbp-8]
What tools and parameters did you use to obtain that disassembly? I did not find "dword" anywhere in the disassembly of my test program. The last place I found eax used was this line:

  27:	2e 33 00             	xor    %cs:(%rax),%eax

Jake
Jan 06 2014
"Jake Thomas" <jake fake.com> writes:
Ok, I figured out how to use obj2asm. The trick is to cd to the 
directory holding the file you wish to disassemble and _not_ 
specify the whole path, or else it throws a confusing "Fatal 
error: unrecognized flag" error.

I ran:

obj2asm intLoadTest.o intLoadTest.d > intLoadTest.s

and got this:


FLAT	group	
	extrn	_main
	public	_deh_beg
	public	_deh_end
	public	_tlsstart
	public	_tlsend
	public	_Dmain
	public	_D11intLoadTest12__ModuleInfoZ
	public	main
	extrn	_d_run_main
	extrn	_Dmain
	extrn	_d_dso_registry
.text	segment
	assume	CS:.text
.text	ends
.data	segment
_D11intLoadTest12__ModuleInfoZ:
	db	004h,010h,000h,000h,000h,000h,000h,000h	;........
	db	069h,06eh,074h,04ch,06fh,061h,064h,054h	;intLoadT
	db	065h,073h,074h,000h	;est.
.data	ends
.bss	segment
.bss	ends
.rodata	segment
.rodata	ends
.tdata	segment
_tlsstart:
	db	000h,000h,000h,000h,000h,000h,000h,000h	;........
	db	000h,000h,000h,000h,000h,000h,000h,000h	;........
.tdata	ends
.tdata.	segment
.tdata.	ends
.text._Dmain	segment
	assume	CS:.text._Dmain
_Dmain:
		push	RBP
		mov	RBP,RSP
		mov	EAX,0Ah
		pop	RBP
		ret
		0f1f
		add	[RAX],R8B
.text._Dmain	ends
.text.main	segment
	assume	CS:.text.main
main:
		push	RBP
		mov	RBP,RSP
		sub	RSP,010h
		mov	RDX,offset FLAT:_Dmain 64
		call	  _d_run_main PC32
		leave
		ret
.text.main	ends
.data.d_dso_rec	segment
	db	000h,000h,000h,000h,000h,000h,000h,000h	;........
.data.d_dso_rec	ends
.text.d_dso_init	segment
	assume	CS:.text.d_dso_init
L0:		enter	0,0
		lea	RAX,_deh_end PC32[RIP]
		push	RAX
		lea	RAX,_deh_beg PC32[RIP]
		push	RAX
		lea	RAX,FLAT:[00h][RIP]
		push	RAX
		lea	RAX,FLAT:[00h][RIP]
		push	RAX
		lea	RAX,FLAT:.data.d_dso_rec[00h][RIP]
		push	RAX
		push	1
		mov	RDI,RSP
		call	  _d_dso_registry PLT32
		leave
		ret
.text.d_dso_init	ends
	end

Can you tell whether a 32-bit load was used?

Jake

P.S - That's _way_ less output than what objdump gave!
Jan 06 2014
"Jake Thomas" <jake fake.com> writes:
Oh, and that was made from:

int main()
{
   int loadMe = 10;
   return loadMe;
}
Jan 06 2014
"TheFlyingFiddle" <theflyingfiddle gmail.com> writes:
On Monday, 6 January 2014 at 20:08:27 UTC, Jake Thomas wrote:
 Things like Object.factory also pulls in it's fair share due 
 to not being able to remove classes. So we get alot of fluff 
 in small programs.
What do you mean by not being able to remove classes? Isn't the whole point of offering a language that has both structs, which can have functions, and classes to do away with classes when inheritance isn't needed?
Well, since you could potentially create classes through Object.factory at runtime, the code for unused classes will be compiled into the binary anyway, even if you never use Object.factory directly in the code. I am not 100% sure, but I think the main problem is ModuleInfo, which keeps everything alive. It keeps classes alive since they could be used by Object.factory. It also keeps other information, like unittest locations and static constructors.
 What tools and parameters did you use to obtain that 
 disassembly?
I used the Visual Studio disassembly window.
 Can you tell whether a 32-bit load was used?
_Dmain:
		push	RBP
		mov	RBP,RSP
		mov	EAX,0Ah
		pop	RBP
		ret

----

		mov	EAX,0Ah

This is a 32-bit instruction. 64-bit instructions use the RAX register. It's actually the same register, just named differently depending on whether you use the full 64 bits or just the lower 32 bits. A write to EAX is automatically zero-extended into RAX. See https://github.com/yasm/yasm/wiki/AMD64 for a simple intro to x64.
Jan 06 2014
"Jake Thomas" <jake fake.com> writes:
 Well since you could potentially create classes through 
 Object.factory at runtime the code for unused classes will be 
 compiled into the binary anyways this is even if you never use 
 Object.factory directly in the code. I am not 100% sure but i 
 think the main problem is ModuleInfo that keeps everything 
 alive. And it keeps classes alive since they could be used by 
 object factory. It also keeps other information like unittests 
 locations and static constructors.
Well then. I hope that changes for the better. It should be able to see that I'm not using the object factory or anything. Then again, the language has certain exceptions it is capable of throwing, which themselves are objects. I wonder if the garbage collector must retain the ability to throw an exception and thus retain the need of classes.
 What tools and parameters did you use to obtain that 
 disassembly?
I used the Visual Studio disassembly window.
Thanks for the tip. Always nice to know about an assortment of tools.
 Can you tell whether a 32-bit load was used?
_Dmain:
		push	RBP
		mov	RBP,RSP
		mov	EAX,0Ah
		pop	RBP
		ret

----

		mov	EAX,0Ah

This is a 32-bit instruction. 64-bit instructions use the RAX register. It's actually the same register, just named differently depending on whether you use the full 64 bits or just the lower 32 bits. A write to EAX is automatically zero-extended into RAX. See https://github.com/yasm/yasm/wiki/AMD64 for a simple intro to x64.
Excellent! We successfully proved that it does use 32-bit load instructions in a 64-bit binary, both for Linux and Windows!

Good to know about RAX/EAX, thanks - I was only familiar with ARM assembly. There is CISC in this world, apparently.

For the full experience, I disassembled the binary from the following:

/*
   I always do int mains - except when trying to simplify assembly as
   much as possible for educating myself about instructions the
   compiler outputs. I never even ran this binary, I only made it to
   look at its disassembly.
*/
void main()
{
   longLoadTest();
}

long longLoadTest()
{
   long loadMe = 10;
   return loadMe;
}

Sure enough, I saw something very similar to what you pointed out, but using the RAX name instead of the EAX name (for longLoadTest's return).

Thank you very much,
Jake
Jan 06 2014
Marco Leise <Marco.Leise gmx.de> writes:
C compilers like D compilers will pack a struct of two 16-bit
words into a 32-bit type if you don't force an alignment:
http://dlang.org/attribute.html#align
What you should avoid is having a data type start at an
address that is not a multiple of its size, especially when it
comes to SIMD.
Working with 16-bit values is not really supported in today's
x86 CPUs though, and integer math in D typically yields ints
even when you use smaller data types, reflecting what happens
on the hardware. Usually I use uint, size_t, real for things
that will go to CPU registers and the smallest data type that
will work for storage in memory.
Keep in mind that RAM access is slow compared to how fast CPUs
run. It can be beneficial to have "slower" data types if they
allow more data to fit into the CPU cache.
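
The point that integer math on smaller types yields ints can be seen directly in the type system. A minimal sketch, with made-up variable names:

```d
import std.stdio;

void main()
{
    short a = 1000;
    short b = 2000;

    // Both operands are promoted to int before the addition,
    // so the result type is int, not short.
    auto sum = a + b;
    static assert(is(typeof(sum) == int));

    // short bad = a + b;  // rejected: int does not implicitly narrow to short

    // Narrowing back to the storage type needs an explicit cast.
    short narrowed = cast(short)(a + b);
    assert(narrowed == 3000);

    writeln(typeof(sum).stringof, " ", narrowed);
}
```

This fits the advice above: the small type only determines how the value is stored, while the arithmetic itself happens at int width.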

Typically you sort the fields of a struct by size with the
larger ones (e.g. pointers) at the top followed by ints,
shorts and finally bytes if you want to conserve memory. There
is even a template to do that for you, but I think it is more
of a toy, when you can easily do that manually without the
clutter:
http://dlang.org/phobos/std_typecons.html#.alignForSize
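
A minimal sketch of sorting fields by size manually, with hypothetical struct and field names. The exact sizes depend on the target's alignment rules, so the code only asserts that the sorted layout is smaller.

```d
import std.stdio;

// Fields in arbitrary order: padding is inserted before each
// field that needs stricter alignment than the previous one left.
struct Unsorted
{
    byte  flag;
    long  total;
    short count;
    int   id;
}

// Same fields, largest first: the small fields fill the tail.
struct Sorted
{
    long  total;
    int   id;
    short count;
    byte  flag;
}

// Holds on targets where long is 4-byte or 8-byte aligned.
static assert(Sorted.sizeof < Unsorted.sizeof);

void main()
{
    writeln("Unsorted: ", Unsorted.sizeof, " bytes");
    writeln("Sorted:   ", Sorted.sizeof, " bytes");
}
```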
Jan 01 2014
"Jake Thomas" <jake fake.com> writes:
Keep in mind that RAM access is slow compared to how fast CPUs
run. It can be beneficial to have "slower" data types if they
allow more data to fit into the CPU cache.
Absolutely fantastic point, Marco! Except if everything still fits in cache as "fast" types, it'd be worth having faster types.

How do you folks decide if a number should be an int, long, or even short?

I guess the exact type of variables should remain up in the air until the whole thing is implemented and tested using different types?

Wow, this is an active forum, I must say.

Much thanks!
Jake
Jan 03 2014
"TheFlyingFiddle" <kurtyan student.chalmers.se> writes:
 I'm a little OCD - who cares about memory to that degree 
 anymore when we have gigabytes of RAM? This might not even come 
 into play on the Raspberry Pi.
Memory is very important when it comes to performance; moving memory is the single most energy-demanding task the CPU (and the GPU, for that matter) has. See http://channel9.msdn.com/Events/Build/2013/4-329 and http://media.medfarm.uu.se/play/video/3261 for why this is so.

Anyhow, if you want a good understanding of how memory works and how it relates to performance, I would highly recommend reading this entire PDF: http://www.akkadia.org/drepper/cpumemory.pdf
Then I took Architecture & Assembly class. There I learned that 
the load instruction grabs an entire native word size, every 
time, regardless of how many bits your variable takes up.
When we programmed in assembly in that class, for both 
performance and coding ease, we only worked with variables that 
were the native code size.
Keep in mind that schools are usually 5-15 years behind current technology (especially introductory classes) and what was true then of performance does not have to be true today.
I found out that it's actually extra work for the processor to 
use values smaller than the native word size: it has to AND off 
the unwanted bits and possibly shift them over.
So, if dmd packs variables together, I would want to always use 
the native word size to avoid that extra work, and I would 
never want to use shorts, ints, or longs. Instead, I'd want to 
do this:
This "extra work" is highly unlikely to be your bottleneck. Also don't assume that making everything be of native size is going to make things faster just because of an AND/SHIFT instruction.
And then only use natives, unatives, and booleans (can't avoid 
them) for my primitives.
You can use normal integers in if statements if you want. Anything not 0 will be true.

int flag = someFunc();
if(flag)
{
   //Do something.
}

I really hope this isn't the case because it would make D's 
entire primitive system pointless. In academia, C is often 
scolded for its ints always being the native word size, while 
Java is praised for being consistent from platform to platform. 
But if dmd packs its variables, D is the one that should be 
scolded and C is the one that should be praised for the same 
reason of the opposite.
D is like Java:

(u)byte is 8-bit
(u)short is 16-bit
(u)int is 32-bit
(u)long is 64-bit
size_t and ptrdiff_t are of platform size.

Note that floating-point calculations may be carried out at the system's highest precision.

float is 32-bit
double is 64-bit
real is platform dependent but never lower than 64-bit.

//So, I need to have this question answered to even get past the
//function prototype stage - each answer would result in different
//code.

You should not care about this in the function prototype stage; this is the essence of premature optimization. You can't be sure of how fast something is going to be before you profile it. And even then it will vary from run to run, computer to computer, and compiler to compiler.

I think this book gives a nice introduction to the subject: http://carlos.bueno.org/optimization/mature-optimization.pdf
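
The fixed sizes listed above can be written down as compile-time checks. A minimal sketch; the only platform-dependent lines are the ones for size_t, ptrdiff_t, and real, which are asserted loosely on purpose.

```d
import std.stdio;

// Fixed on every platform.
static assert(byte.sizeof == 1 && ubyte.sizeof == 1);
static assert(short.sizeof == 2 && ushort.sizeof == 2);
static assert(int.sizeof == 4 && uint.sizeof == 4);
static assert(long.sizeof == 8 && ulong.sizeof == 8);
static assert(float.sizeof == 4);
static assert(double.sizeof == 8);

// Platform dependent.
static assert(size_t.sizeof == (void*).sizeof);
static assert(ptrdiff_t.sizeof == (void*).sizeof);
static assert(real.sizeof >= double.sizeof);

void main()
{
    writeln("real is ", real.sizeof, " bytes on this target");
    writeln("size_t is ", size_t.sizeof, " bytes on this target");
}
```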
Jan 01 2014
"Casper Færgemand" <shorttail gmail.com> writes:
On Wednesday, 1 January 2014 at 04:17:30 UTC, Jake Thomas wrote:
 snip
Are you looking for something like int_fast32_t and the likes from Boost? If you don't care terribly much for when your numbers overflow, then as others suggested, size_t and pttwhatever work fine.
Jan 02 2014
"Jake Thomas" <jake fake.com> writes:
On Friday, 3 January 2014 at 05:25:49 UTC, Casper Færgemand wrote:
 On Wednesday, 1 January 2014 at 04:17:30 UTC, Jake Thomas wrote:
 snip
Are you looking for something like int_fast32_t and the likes from Boost? If you don't care terribly much for when your numbers overflow, then as others suggested, size_t and pttwhatever work fine.
I had never heard of int_fast32_t before, but after Googling and finding out what it is, yes, I am looking for just exactly that.
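
For what it's worth, D exposes the C99 stdint names through the core.stdc.stdint module of druntime, so something like the following sketch should work. The aliases fastInt and fastUint are hypothetical names chosen for illustration; which underlying type the platform picks is not assumed here.

```d
import core.stdc.stdint : int_fast32_t, uint_fast32_t;
import std.stdio;

// Hypothetical aliases: "at least 32 bits, whatever the C runtime
// for this platform considers fastest".
alias fastInt = int_fast32_t;
alias fastUint = uint_fast32_t;

// The only guarantee relied on: at least 32 bits wide.
static assert(fastInt.sizeof >= 4);
static assert(fastUint.sizeof >= 4);

void main()
{
    fastInt total = 0;
    foreach (i; 0 .. 10)
        total += i;          // 0 + 1 + ... + 9
    assert(total == 45);

    writeln("int_fast32_t is ", fastInt.sizeof * 8, " bits on this target");
}
```

Because the width may differ between platforms, code using these types should not rely on overflow happening at any particular value.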
Jan 03 2014