
digitalmars.D - 64-bit opportunity?

reply Bill Cox <Bill_member pathlink.com> writes:
Hi, guys.

It's good to see D doing well, and several people contributing for multiple
years now.

I think I've brought this up before, but now that virtually all the new CPUs are
natively 64-bit, and OSs are going that way, too (Windows, Linux and Mac OS), I
thought it might be time for more discussion.

The problem
-----------
Simply converting all pointers to 64 bits is a poor solution.  Anyone making
this conversion already has a program that uses a whole lot of memory, or they
wouldn't bother.  In
my experience, these programs typically fill up most of their memory with
pointers rather than other kinds of data.  These programs are also typically
very speed sensitive.  What happens when we convert these programs is ugly:

- The user has to buy something like 80% more memory just to complete the same
tasks he could handle in 32-bit mode
- The programs slow down (I've seen around 20% typically) due to worse cache
performance
- Since the users expected the 64-bit version to run faster and make better use
of memory, users have very bad reactions to 64-bit programs

Given that these programs are typically both memory and speed sensitive,
converting to 64 bits is a real problem.  Many EDA companies are facing this
problem today.

An opportunity?
---------------
I think D could use another eye-opening feature to get more interest.  Both
64-bit machines and D are new, so there may be a way to ride the 64-bit wave.
Also, C# and C++ are not abstract enough to offer good 64-bit performance, so
this could be a good differentiator for D.

A truly abstract language (like D) doesn't need to conform to C style memory
layout.  For example, object references can be integers used to index into
arrays of properties, rather than true pointers.  This allows any class with
fewer than 4 billion instances to be accessed with 32-bit references.  Classes
with fewer than 64K instances can be accessed with 16-bit references.  The only
user-visible extension to the language would be an optional reference size
specifier (default would be 32-bit).

While such support wouldn't be trivial to implement, I think it's doable.  In
particular, start by separating the object heap into a heap per class.  This is
a good idea in any case, since it allows the heaps to contain constant sized
objects, and helps cache performance.  Then, separate each heap into a heap per
field, rather than per class.  This way, only one data field is in each heap.
Then, instead of using pointers as object references, an index from the start of
the heap can be used.

A nice performance hack would be to pack fields together into 64-bit chunks,
since that is the typical width of a DRAM memory bus.  Certain fields tend to be
accessed together in critical inner loops, and packing them together could
really speed up the cache.  DSP programmers often use this sort of trick
manually to speed up critical computations.

How cool would it be to have 64-bit code not only use the exact same amount of
memory as its 32-bit version, but actually run faster?  As I've stated
before, we currently do some of this stuff at ViASIC, and our programs use
the exact same memory footprint, and run slightly faster, in 64-bit mode.  It's
entirely doable.

Death to 32-bit mode!

Bill
Nov 28 2004
next sibling parent "Walter" <newshound digitalmars.com> writes:
I was thinking about doing 64 bit mode, but matching the C 64 bit memory
model. Your idea of using ptr+offset indexing for classes instead is
certainly intriguing!
Nov 28 2004
prev sibling next sibling parent reply Sean Kelly <sean f4.ca> writes:
Interesting idea.  So a D reference, then, would contain a heap number and an
offset?  And assuming I want to convert a D reference to a C pointer then the
conversion routine would return a 32 or 64-bit pointer, as appropriate?  I
wonder what kind of impact this scheme would have on garbage collection--it
seems like it could potentially offer a real speed increase.


Sean
Nov 29 2004
parent Bill Cox <Bill_member pathlink.com> writes:
In article <cog0oq$1htm$1 digitaldaemon.com>, Sean Kelly says...
Interesting idea.  So a D reference, then, would contain a heap number and an
offset?  And assuming I want to convert a D reference to a C pointer then the
conversion routine would return a 32 or 64-bit pointer, as appropriate?  I
wonder what kind of impact this scheme would have on garbage collection--it
seems like it could potentially offer a real speed increase.


Sean

Hi, Sean.

I wouldn't include the heap number in the D reference, just the index.  Since D
is strongly typed, the compiler can know which heap is being accessed.

However, that does significantly change how inheritance works.  The way I
implemented inheritance with index-based references is less efficient than
standard C++.  In particular, when I create an object of a derived class, first
I create an object of the base class, then I create an object of the derived
class, and then I cross-couple pointers between them.  The fields of the base
class are stored on the base class object, and not duplicated.  Also, I store a
type field on the base class that says what type of derived object is being
pointed to.  This doesn't impact speed much, since conversions back and forth
between base classes and derived classes are pretty rare in the code I've seen.
However, it adds two data fields per class, and with deep inheritance trees,
those pointers can add up to a lot of memory.

In EDA coding, I find that we tend to use a small amount of static inheritance,
but a lot of dynamic class extension.  What I call dynamic class extension is
where we have objects that are already in the main database, and a tool that is
using the main database wants to add some fields to them.  In this case, all
you have to do is allocate some additional heaps for the additional fields when
the tool runs, and clean them up when the tool finishes.  In C++, we'd have to
allocate objects of a local class, and use cross-pointers between them and the
objects in the database.  Basically, you get one for free, and the other
requires the cross-pointers.  In EDA, we get more bang for the buck out of
efficient dynamic class extension than out of efficient casting between class
types.

Bill
Nov 30 2004
prev sibling parent reply "Ben Hinkle" <bhinkle mathworks.com> writes:
"Bill Cox" <Bill_member pathlink.com> wrote in message
news:coc9cd$1ntq$1 digitaldaemon.com...
 Hi, guys.

 It's good to see D doing well, and several people contributing for multiple
 years now.

 I think I've brought this up before, but now that virtually all the new CPUs
 are natively 64-bit, and OSs are going that way, too (Windows, Linux and Mac
 OS), I thought it might be time for more discussion.

 The problem
 -----------
 Simply converting all pointers to 64-bits is a poor solution.  Anyone doing
 this already has a program that uses a whole lot of memory, or they would do
 it.  In my experience, these programs typically fill up most of their memory
 with pointers rather than other kinds of data.  These programs are also
 typically very speed sensitive.  What happens when we convert these programs
 is ugly:

 - The user has to buy something like 80% more memory just to complete the
 same tasks he could handle in 32-bit mode
 - The programs slow down (I've seen around 20% typically) due to worse cache
 performance
 - Since the users expected the 64-bit version to run faster and make better
 use of memory, users have very bad reactions to 64-bit programs

 Given that these programs are typically both memory and speed sensitive,
 converting to 64 bits is a real problem.  Many EDA companies are facing this
 problem today.
In my experience with MATLAB users, 64-bit addressing is used for processing
large data sets.  These data sets are mostly made up of non-pointers like
telemetry data or images or video data.

So I would guess the best performance would come from splitting the memory
model up into two chunks - 32-bit pointers and 64-bit pointers.  You put your
large data set into the 64-bit chunk and the rest of the app into the 32-bit
chunk.  32-bit pointers would always have to point to the "small" memory chunk
and 64-bit pointers would point to the "large" memory chunk.  A separate API
for the 64-bit chunk (eg, malloc64, realloc64, free64) would be required, and
probably language support for differentiating between the pointer types (eg,
what is the return type of malloc64?).  It would be fun to experiment with,
though.

-Ben
Nov 29 2004
parent reply "Simon Buchan" <currently no.where> writes:
On Mon, 29 Nov 2004 16:05:37 -0500, Ben Hinkle <bhinkle mathworks.com>  
wrote:

<snip>
 In my experience with MATLAB users 64-bit addressing is used for  
 processing
 large data sets. These data sets are mostly made up of non-pointers like
 telemetry data or images or video data. So I would guess the best
 performance would come from splitting the memory model up into two  
 chunks -
 32-bit pointers and 64-bit pointers. You put your large data set into the
 64-bit chunk and the rest of the app into the 32-bit chunk. 32-bit  
 pointers
 would always have to point to the "small" memory chunk and 64-bit  
 pointers
 would point to the "large" memory chunk. A separate API for the 64-bit  
 chunk
 (eg, malloc64, realloc64, free64) would be required and probably language
 support for differentiating between the pointer types (eg what is the  
 return
 type of malloc64?). It would be fun to experiment with, though.

 -Ben

That sounds like edge cases... yucky.  What if you have meta-data with those
data files?  Would the meta-data be part of the 64-bit data sets or mirrored
on the 32-bit side, and what happens if you forget one, etc...

'Course, I'm not an expert, but I think if performance would be gained by
doing this, it should be behind the scenes.  (And I personally don't like
*64 names :P)

-- 
"Unhappy Microsoft customers have a funny way of becoming Linux,
Salesforce.com and Oracle customers." - www.microsoft-watch.com:
"The Year in Review: Microsoft Opens Up"
Nov 29 2004
parent "Ben Hinkle" <bhinkle mathworks.com> writes:
"Simon Buchan" <currently no.where> wrote in message
news:opsh9khdahjccy7t simon.homenet...
 On Mon, 29 Nov 2004 16:05:37 -0500, Ben Hinkle <bhinkle mathworks.com>
 wrote:

 <snip>
 In my experience with MATLAB users 64-bit addressing is used for
 processing
 large data sets. These data sets are mostly made up of non-pointers like
 telemetry data or images or video data. So I would guess the best
 performance would come from splitting the memory model up into two
 chunks -
 32-bit pointers and 64-bit pointers.


[snip]
 That sounds like edge cases... yucky.

It's not the edge case in engineering applications like those used by Boeing,
Ford, NASA, etc.  I don't deal with databases, so I don't know what kinds of
datatypes and layouts they use, but MATLAB deals with lots and lots of numbers
and number-crunching.  That's what our customers need 64 bits for.
 What if you have meta-data
 with those data files, would the meta-data be part of the 64-bit
 data sets or mirrored on the 32-bit, what happens if you forget
 one, etc...

Meta-data is usually orders of magnitude smaller than the data itself and so should go into the 32-bit part. But that depends on the application. I'm not sure what you mean by forgetting one.
 'Course, I'm not an expert, but, I think if performance would be
 gained by doing this, it should be behind the scenes.
 (And I personally dont like *64 names :P)

My suggestion is largely academic since I think the 64-bit OSes don't allow mixed pointer types within an application. They either use 32-bit backward-compatibility mode or full 64-bit mode. You can't say "start me up in 32-bit mode but allow me to access 64-bit space". At least you can't for now :-)
Nov 30 2004