digitalmars.D - Dual CPU code
- bearophile <bearophileHUGS lycos.com> Feb 02 2009
- Don <nospam nospam.com> Feb 02 2009
- bearophile <bearophileHUGS lycos.com> Feb 02 2009
- Walter Bright <newshound1 digitalmars.com> Feb 02 2009
- bearophile <bearophileHUGS lycos.com> Feb 02 2009
- Walter Bright <newshound1 digitalmars.com> Feb 02 2009
- grauzone <none example.net> Feb 02 2009
- Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> Feb 02 2009
- Walter Bright <newshound1 digitalmars.com> Feb 02 2009
- BCS <none anon.com> Feb 02 2009
- Christopher Wright <dhasenan gmail.com> Feb 02 2009
- BCS <ao pathlink.com> Feb 02 2009
- "Tim M" <a b.com> Feb 02 2009
This comes after a small discussion I've had in the #D IRC channel. I have seen that the LDC compiler produces much more efficient code if you use the SSE(2) extensions, while its code is not very efficient if you don't use them (GCC/GDC don't seem as sensitive to the presence of the SSE extensions). I often have to switch between an old and a new CPU, so if I compile with SSE2 extensions the program doesn't run on the old CPU, while if I don't use them, I sometimes have a program that runs much slower on the newer CPU. So it may be useful to have a way to build executables able to run well on both CPUs (Apple has done something like this two or more times in the past). There are several ways to do this. One solution is to compile just the critical functions for different CPUs, but that may require compiler support. My executables are generally small, so doubling their size isn't a problem. So a simple solution is to bundle two whole executables into one executable and add a small header that detects the current CPU and runs the right executable. Note that the problem I have shown isn't limited to SSE2; it's more general. For example, in the near future you may want code compiled for the GPU and/or the CPU, etc. Bye, bearophile
Feb 02 2009
bearophile wrote: This comes after a small discussion I've had in the #D IRC channel. I have seen that the LDC compiler produces much more efficient code if you use the SSE(2) extensions, while its code is not very efficient if you don't use them (GCC/GDC don't seem as sensitive to the presence of the SSE extensions). I often have to switch between an old and a new CPU, so if I compile with SSE2 extensions the program doesn't run on the old CPU, while if I don't use them, I sometimes have a program that runs much slower on the newer CPU. So it may be useful to have a way to build executables able to run well on both CPUs (Apple has done something like this two or more times in the past). There are several ways to do this. One solution is to compile just the critical functions for different CPUs, but that may require compiler support. My executables are generally small, so doubling their size isn't a problem. So a simple solution is to bundle two whole executables into one executable and add a small header that detects the current CPU and runs the right executable. Note that the problem I have shown isn't limited to SSE2; it's more general. For example, in the near future you may want code compiled for the GPU and/or the CPU, etc. Bye, bearophile
Is this mostly integer, or floating point code?
Feb 02 2009
Don: Is this mostly integer, or floating point code?
In that specific case, it's mostly FP. If I compile it with LDC with -sse3 flags, the resulting asm is a jungle of the new registers :-) Bye, bearophile
Feb 02 2009
bearophile wrote: So, it may be useful to have a way to build executables able to run well on both CPUs (Apple has done something like this two or more times in the past). There are several ways to do this, a solution is to compile just critical functions for different CPUs, but that may require compiler support. My executables are generally small, so doubling their size isn't a problem. So a simple solution is to bundle two whole executables into an executable and add a small header that looks for the current CPU, and runs the right executable.
This is a very old problem; it even cropped up in the bad old DOS days, when you had the choice of emulator or FPU. The solution is fairly simple, and you don't need to bind together two executables. Simply put in a runtime switch:

    import std.cpuid;
    import sse;
    import nosse;
    ...
    if (std.cpuid.sse2())
        sse.foo();
    else
        nosse.foo();

and then compile sse.d and nosse.d with different compiler switches. The std.cpuid module will tell you what you've got at runtime. To see a real example of this, look at the array op implementation code in the standard library, such as internal/arrayfloat.d; it does a runtime switch for several different FPU flavors.
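A minimal sketch of the same runtime switch, written against the current D runtime rather than the std.cpuid module named above. It assumes `core.cpuid.sse2` as the feature test, and `sumSse`, `sumPlain`, and `pickSum` are hypothetical names. In a real build the two functions would live in separately compiled modules such as sse.d and nosse.d; here they are stand-ins in one file so the example compiles on its own.

```d
import core.cpuid : sse2;
import std.stdio : writeln;

// Stand-in for the routine built with SSE2 switches
// (hypothetical; would normally live in sse.d).
float sumSse(const(float)[] values)
{
    float total = 0;
    foreach (v; values)
        total += v;
    return total;
}

// Stand-in for the same source built without SSE switches
// (hypothetical; would normally live in nosse.d).
float sumPlain(const(float)[] values)
{
    float total = 0;
    foreach (v; values)
        total += v;
    return total;
}

alias SumFn = float function(const(float)[]);

// Query the CPU and return the matching implementation.
SumFn pickSum()
{
    return sse2 ? &sumSse : &sumPlain;
}

void main()
{
    SumFn sum = pickSum();   // decided once, at startup
    writeln(sum([1.0f, 2.0f, 3.0f]));
}
```

Storing the choice in a function pointer means the CPU test is not repeated on every call, which may matter if the dispatched routine is small and called often.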
Feb 02 2009
Walter Bright: import std.cpuid; import sse; import nosse; ... if (std.cpuid.sse2()) sse.foo(); else nosse.foo();
I think that solves my problem, thank you. It's a simple solution (maybe I didn't think of it because I use bud, which compiles the whole program in one go). I presume that the D code in the sse and nosse modules is usually the same and is just compiled in two different ways, so each of the two modules may contain just two lines of code, such as:

    module sse;
    mixin(import("shared_module_code.dd"));

Bye, bearophile
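The idea of stamping one piece of source out under two names can be tried in a single file. This is only a sketch: `sharedCode` is a hypothetical stand-in for the contents of shared_module_code.dd, and the `Sse` and `NoSse` structs stand in for the two wrapper modules. Everything here is compiled with one set of switches, so it shows only the duplication, not the different code generation.

```d
import std.stdio : writeln;

// Hypothetical stand-in for the text of shared_module_code.dd.
enum sharedCode = q{
    static float scale(float x)
    {
        return x * 2.0f;
    }
};

// Each mixin compiles the same source again under a different name,
// much like `module sse;` and `module nosse;` would.
struct Sse   { mixin(sharedCode); }
struct NoSse { mixin(sharedCode); }

void main()
{
    writeln(Sse.scale(1.5f));
    writeln(NoSse.scale(1.5f));
}
```

With real wrapper modules, each file would be compiled separately with its own switches, and the string-import path (the -J switch in DMD) has to include the directory holding the shared file, otherwise `import("...")` is rejected.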
Feb 02 2009
bearophile wrote: Walter Bright: import std.cpuid; import sse; import nosse; ... if (std.cpuid.sse2()) sse.foo(); else nosse.foo();
I think that solves my problem, thank you. It's a simple solution (maybe I didn't think of it because I use bud, which compiles the whole program in one go). I presume that the D code in the sse and nosse modules is usually the same and is just compiled in two different ways, so each of the two modules may contain just two lines of code, such as: module sse; mixin(import("shared_module_code.dd"));
That's one way to do it.
Feb 02 2009
Walter Bright wrote: bearophile wrote: Walter Bright: import std.cpuid; import sse; import nosse; ... if (std.cpuid.sse2()) sse.foo(); else nosse.foo();
I think that solves my problem, thank you. It's a simple solution (maybe I didn't think of it because I use bud, which compiles the whole program in one go). I presume that the D code in the sse and nosse modules is usually the same and is just compiled in two different ways, so each of the two modules may contain just two lines of code, such as: module sse; mixin(import("shared_module_code.dd"));
That's one way to do it.
The glorious return of include files!
Feb 02 2009
Walter Bright wrote: bearophile wrote: Walter Bright: import std.cpuid; import sse; import nosse; ... if (std.cpuid.sse2()) sse.foo(); else nosse.foo();
I think that solves my problem, thank you. It's a simple solution (maybe I didn't think of it because I use bud, which compiles the whole program in one go). I presume that the D code in the sse and nosse modules is usually the same and is just compiled in two different ways, so each of the two modules may contain just two lines of code, such as: module sse; mixin(import("shared_module_code.dd"));
That's one way to do it.
I must be missing something - why isn't import shared_module_code; good? Andrei
Feb 02 2009
Andrei Alexandrescu wrote: Walter Bright wrote: bearophile wrote: Walter Bright: import std.cpuid; import sse; import nosse; ... if (std.cpuid.sse2()) sse.foo(); else nosse.foo();
I think that solves my problem, thank you. It's a simple solution (maybe I didn't think of it because I use bud, which compiles the whole program in one go). I presume that the D code in the sse and nosse modules is usually the same and is just compiled in two different ways, so each of the two modules may contain just two lines of code, such as: module sse; mixin(import("shared_module_code.dd"));
That's one way to do it.
I must be missing something - why isn't import shared_module_code; good?
Because importing something does not change how it was compiled. If you have one module that you want two separate instances of, compiled with different switches, they have to be somehow given different names.
Feb 02 2009
Hello Andrei,
bearophile wrote: module sse; mixin(import("shared_module_code.dd"));
I must be missing something - why isn't import shared_module_code; good? Andrei
the code generator needs to be run on the code more than once.
Feb 02 2009
Andrei Alexandrescu wrote: Walter Bright wrote: bearophile wrote: Walter Bright: import std.cpuid; import sse; import nosse; ... if (std.cpuid.sse2()) sse.foo(); else nosse.foo();
I think that solves my problem, thank you. It's a simple solution (maybe I didn't think of it because I use bud, which compiles the whole program in one go). I presume that the D code in the sse and nosse modules is usually the same and is just compiled in two different ways, so each of the two modules may contain just two lines of code, such as: module sse; mixin(import("shared_module_code.dd"));
That's one way to do it.
I must be missing something - why isn't import shared_module_code; good? Andrei
The shared code has to be compiled with two sets of compiler switches, resulting in two distinct modules with different ModuleInfo, TypeInfo, and so forth. You can't do that with import.
Feb 02 2009
Reply to bearophile,
Walter Bright: import std.cpuid; import sse; import nosse; ... if (std.cpuid.sse2()) sse.foo(); else nosse.foo();
(maybe I didn't think of it because I use bud, which compiles the whole program in one go). I presume that the D code in the sse and nosse modules is usually the same and is just compiled in two different ways, so each of the two modules may contain just two lines of code, such as: module sse; mixin(import("shared_module_code.dd")); Bye, bearophile
My first thought would be to play games with the linker:
- define a function EnterA() that calls code
- define a function EnterB() that calls code
- compile needed code for CPU A to A.obj
- compile needed code for CPU B to B.obj
- make a lib with EnterA and A.obj, forcing internal linking
- make a lib with EnterB and B.obj, forcing internal linking
- link common code and both libs

Making the libs becomes the fun part.
Feb 02 2009
On Tue, 03 Feb 2009 00:31:17 +1300, bearophile <bearophileHUGS lycos.com> wrote: This comes after a small discussion I've had in the #D IRC channel. I have seen that the LDC compiler produces much more efficient code if you use the SSE(2) extensions, while its code is not very efficient if you don't use them (GCC/GDC don't seem as sensitive to the presence of the SSE extensions). I often have to switch between an old and a new CPU, so if I compile with SSE2 extensions the program doesn't run on the old CPU, while if I don't use them, I sometimes have a program that runs much slower on the newer CPU. So it may be useful to have a way to build executables able to run well on both CPUs (Apple has done something like this two or more times in the past). There are several ways to do this. One solution is to compile just the critical functions for different CPUs, but that may require compiler support. My executables are generally small, so doubling their size isn't a problem. So a simple solution is to bundle two whole executables into one executable and add a small header that detects the current CPU and runs the right executable. Note that the problem I have shown isn't limited to SSE2; it's more general. For example, in the near future you may want code compiled for the GPU and/or the CPU, etc. Bye, bearophile
Is this the sort of thing you are looking for: http://www.songho.ca/misc/sse/sse.html
Feb 02 2009








