
digitalmars.D.ldc - CT Information about target CPU and Related cross-compile

reply Ilya Yaroshenko <ilyayaroshenko gmail.com> writes:
Hi all,

I will write std.blas and it will be heavily optimised for LDC. 
Can these features be added to LDC?

1. Basic compile-time information about the target CPU, such as 
L1/L2/L3 cache sizes and available instruction sets, e.g. SSE2, 
AVX, AVX2, AVX512.

2. Related cross-compilation. For example: the target is x86_64; AVX 
support can be checked at runtime using core.cpuid, so I want to 
force LDC to compile three versions of BLAS for SSE, AVX and 
AVX512, and choose the best one at runtime.

Links:
std.blas announcement: 
http://forum.dlang.org/thread/nilhvnqbsgqhxdshpqfl forum.dlang.org
Dec 26 2015
next sibling parent reply Johan Engelen <j j.nl> writes:
On Saturday, 26 December 2015 at 20:47:39 UTC, Ilya Yaroshenko 
wrote:
 Hi all,

 I will write std.blas and it will be heavily optimised for LDC.
jay! :-)
 Can these features be added to LDC?

 1. Basic compile time information about target CPU such as 
 L1/L2/L3 cache sizes and available instructions set, e.g. SSE2, 
 AVX, AVX2, AVX512.
Do you have a proposal for a set of function names / version IDs / ...? This sounds like a simple thing to add. I'm not sure about cache sizes: is it currently possible to specify the target microarchitecture on the cmdline?
 2. Related cross-compile. For example: target is x86_64; AVX 
 support can be checked at runtime using core.cpuid; so I want 
 to force LDC to compile three versions of BLAS for SSE, AVX and 
 AVX512, and choose better in runtime.
Something like this? https://gcc.gnu.org/wiki/FunctionMultiVersioning
Dec 27 2015
next sibling parent Johan Engelen <j j.nl> writes:
 On Saturday, 26 December 2015 at 20:47:39 UTC, Ilya Yaroshenko 
 wrote:
 Hi all,

 2. Related cross-compile. For example: target is x86_64; AVX 
 support can be checked at runtime using core.cpuid; so I want 
 to force LDC to compile three versions of BLAS for SSE, AVX 
 and AVX512, and choose better in runtime.
An LLVM presentation I found on the topic: http://llvm.org/devmtg/2014-10/Slides/Christopher-Function%20Multiversioning%20Talk.pdf (perhaps mostly a reminder to self ;)
Dec 27 2015
prev sibling parent reply Ilya Yaroshenko <ilyayaroshenko gmail.com> writes:
On Sunday, 27 December 2015 at 17:34:26 UTC, Johan Engelen wrote:
 On Saturday, 26 December 2015 at 20:47:39 UTC, Ilya Yaroshenko 
 wrote:
 Hi all,

 I will write std.blas and it will be heavily optimised for LDC.
jay! :-)
 Can these features be added to LDC?

 1. Basic compile time information about target CPU such as 
 L1/L2/L3 cache sizes and available instructions set, e.g. 
 SSE2, AVX, AVX2, AVX512.
Do you have a proposal for a set of function names / version IDs / ...? This sounds like a simple thing to add. I'm not sure about cache sizes: is it currently possible to specify the target microarchitecture on the cmdline?
I have found that core.cpuid can provide runtime information about cache sizes, and that is enough. However, the number of SIMD registers and their sizes should be known at compile time. What do you mean by "set of function names / version IDs"?
 2. Related cross-compile. For example: target is x86_64; AVX 
 support can be checked at runtime using core.cpuid; so I want 
 to force LDC to compile three versions of BLAS for SSE, AVX 
 and AVX512, and choose better in runtime.
Something like this? https://gcc.gnu.org/wiki/FunctionMultiVersioning
Yes! Or runtime check at least. Ilya
Dec 27 2015
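A minimal sketch of the runtime side mentioned above, assuming druntime's core.cpuid (dataCaches and CacheInfo.size are real druntime names). The helper names, the 32 KiB fallback and the "three tiles must fit in L1" rule are assumptions made up for illustration, not anything std.blas is known to do:

```d
import core.cpuid : dataCaches;

/// L1 data cache size in bytes.
/// The 32 KiB fallback is an assumption for illustration only.
size_t l1DataCacheBytes()
{
    enum size_t fallback = 32 * 1024;
    // CacheInfo.size is reported in kilobytes; size_t.max marks a
    // cache level that is not present.
    const kb = dataCaches()[0].size;
    return (kb == 0 || kb == size_t.max) ? fallback : kb * 1024;
}

/// Hypothetical heuristic: the largest power-of-two tile edge such that
/// three square tiles of doubles fit into `l1Bytes`.
size_t tileEdge(size_t l1Bytes)
{
    size_t edge = 1;
    while (3 * (edge * 2) * (edge * 2) * double.sizeof <= l1Bytes)
        edge *= 2;
    return edge;
}

unittest
{
    // 3 * 32 * 32 * 8 = 24576 bytes fits in 32 KiB; a 64-wide tile does not.
    assert(tileEdge(32 * 1024) == 32);
    assert(tileEdge(l1DataCacheBytes()) >= 1);
}
```

Since the value is only known at runtime, it can size loops and buffers but cannot select template instantiations; that part still needs compile-time information.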
parent reply Johan Engelen <j j.nl> writes:
On Sunday, 27 December 2015 at 23:47:41 UTC, Ilya Yaroshenko 
wrote:
 On Sunday, 27 December 2015 at 17:34:26 UTC, Johan Engelen 
 wrote:
 On Saturday, 26 December 2015 at 20:47:39 UTC, Ilya Yaroshenko 
 wrote:

 Can these features be added to LDC?

 1. Basic compile time information about target CPU such as 
 L1/L2/L3 cache sizes and available instructions set, e.g. 
 SSE2, AVX, AVX2, AVX512.
Do you have a proposal for a set of function names / version IDs / ...? This sounds like a simple thing to add. I'm not sure about cache sizes: is it currently possible to specify the target microarchitecture on the cmdline?
I have found that core.cpuid can provide runtime information about cache sizes, it is enough. However amount of SIMD registers and their sizes should be known at compile time. What do you mean with "set of function names / version IDs"?
(I am pretty new to D, etc.) Can you give me a sample of code showing what "API" you expect for this stuff?
 2. Related cross-compile. For example: target is x86_64; AVX 
 support can be checked at runtime using core.cpuid; so I want 
 to force LDC to compile three versions of BLAS for SSE, AVX 
 and AVX512, and choose better in runtime.
Something like this? https://gcc.gnu.org/wiki/FunctionMultiVersioning
Yes! Or runtime check at least.
I had been thinking about implementing function multiversioning before. It's great that someone wants it :-)
Dec 30 2015
parent reply Ilya <ilyayaroshenko gmail.com> writes:
On Wednesday, 30 December 2015 at 15:20:35 UTC, Johan Engelen 
wrote:
 On Sunday, 27 December 2015 at 23:47:41 UTC, Ilya Yaroshenko 
 wrote:
 On Sunday, 27 December 2015 at 17:34:26 UTC, Johan Engelen 
 wrote:
 On Saturday, 26 December 2015 at 20:47:39 UTC, Ilya 
 Yaroshenko wrote:

 Can these features be added to LDC?

 1. Basic compile time information about target CPU such as 
 L1/L2/L3 cache sizes and available instructions set, e.g. 
 SSE2, AVX, AVX2, AVX512.
Do you have a proposal for a set of function names / version IDs / ...? This sounds like a simple thing to add. I'm not sure about cache sizes: is it currently possible to specify the target microarchitecture on the cmdline?
I have found that core.cpuid can provide runtime information about cache sizes, it is enough. However amount of SIMD registers and their sizes should be known at compile time. What do you mean with "set of function names / version IDs"?
(I am pretty new to D, etc.) Can you give me a sample of code showing what "API" you expect for this stuff?
Dispatching example:

@target("default") // used for CTFE code
int foo() {
    // The default version of foo.
    return 0;
}

@target("sse4.2")
int foo() {
    // foo version for SSE4.2 if compiler is LDC
    return 1;
}

@target("arch=atom,+sse2")
int foo() {
    // foo version for the Intel Atom processor with SSE2 support
    return 2;
}

Compile time features example:

version(LDC) {
    enum bool a = __target(has, "avx2");
    enum bool b = __target(compatible, "core-avx2");
    enum bool c = __target("broadwell");
} else version(GNU) {
    ...
}
 2. Related cross-compile. For example: target is x86_64; AVX 
 support can be checked at runtime using core.cpuid; so I 
 want to force LDC to compile three versions of BLAS for SSE, 
 AVX and AVX512, and choose better in runtime.
Something like this? https://gcc.gnu.org/wiki/FunctionMultiVersioning
Yes! Or runtime check at least.
I had been thinking about implementing function multiversioning before. It's great that someone wants it :-)
Dec 30 2015
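For comparison with the proposed __target(...) syntax above, here is a hedged sketch of a compile-time feature check using facilities that exist in the language today: the LDC-specific __traits(targetHasFeature, ...) (available in LDC releases newer than this thread) and the predefined D_AVX2 version identifier for other compilers. The names hasAvx2, vectorBytes and doublesPerVector are hypothetical, and the 16-byte fallback assumes a 128-bit baseline such as SSE2 on x86_64:

```d
version (LDC)
{
    // LDC-specific trait: true when the compilation target
    // (as set by -mcpu / -mattr) enables the named feature.
    enum bool hasAvx2 = __traits(targetHasFeature, "avx2");
}
else
{
    // Other compilers: fall back to the predefined version identifier.
    version (D_AVX2) enum bool hasAvx2 = true;
    else             enum bool hasAvx2 = false;
}

// Hypothetical constants a kernel could use to size its inner loop.
static if (hasAvx2) enum size_t vectorBytes = 32; // 256-bit registers
else                enum size_t vectorBytes = 16; // assumed 128-bit baseline

enum size_t doublesPerVector = vectorBytes / double.sizeof;

static assert(doublesPerVector == 2 || doublesPerVector == 4);
```

This describes only the target the whole module is compiled for; it does not by itself give per-function versions.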
parent reply JohanEngelen <j j.nl> writes:
On Wednesday, 30 December 2015 at 20:07:02 UTC, Ilya wrote:
 @target("sse4.2")
 int foo() {
  // foo version for SSE4.2 if compiler is LDC
  return 1;
 }
I'm working on (a rudimentary version of) @target at the moment. I assume you build LDC yourself and are happy to help with some testing and give feedback? :) cheers, Johan
Jan 02
parent reply Ilya Yaroshenko <ilyayaroshenko gmail.com> writes:
On Saturday, 2 January 2016 at 23:27:16 UTC, JohanEngelen wrote:
 On Wednesday, 30 December 2015 at 20:07:02 UTC, Ilya wrote:
 @target("sse4.2")
 int foo() {
  // foo version for SSE4.2 if compiler is LDC
  return 1;
 }
I'm working on (a rudimentary version of) @target at the moment. I assume you build LDC yourself and are happy to help with some testing and give feedback? :) cheers, Johan
Yes! You can count on me ;) --Ilya
Jan 02
parent reply JohanEngelen <j j.nl> writes:
On Sunday, 3 January 2016 at 05:16:36 UTC, Ilya Yaroshenko wrote:
 On Saturday, 2 January 2016 at 23:27:16 UTC, JohanEngelen wrote:
 I'm working on (a rudimentary version of) @target at the 
 moment.
 I assume you build LDC yourself and you are happy to help with 
 some testing and give feedback? :)

 cheers,
   Johan
Yes! You can count on me ;) --Ilya
Great, thanks :)

The branch is ready: https://github.com/JohanEngelen/ldc/tree/attr_target
(make sure git correctly fetches the druntime branch with ldc.attributes.target in it)

Usage examples can be found in the test file: tests/ir/attr_target_x86.d

It'd be great if you can run the IR tests (and can help improve the tests):

  cd tests/ir
  python runlit.py -v .

I myself often modify a test file locally and rerun the test to quickly see if things are working or not (inspect output .ll and .s).

cheers, Johan
Jan 03
parent Johan Engelen <j j.nl> writes:
On Sunday, 3 January 2016 at 13:11:55 UTC, JohanEngelen wrote:
 The branch is ready:
See: https://github.com/ldc-developers/ldc/pull/1244
Jan 03
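A rough sketch of how the attribute from that branch can be combined with a core.cpuid check, assuming the ldc.attributes.target UDA as it ships with LDC. Because the attribute only sets per-function codegen options and does not overload by target, the two variants get distinct names here; scaleAvx, scaleBaseline, scale and the UseTargetAttr version identifier are all hypothetical names for illustration:

```d
import core.cpuid : avx;

// Only use the attribute when compiling with LDC for x86_64.
version (LDC)
{
    version (X86_64) version = UseTargetAttr;
}

version (UseTargetAttr)
{
    import ldc.attributes : target;

    // Code for this one function is generated with AVX enabled,
    // regardless of the module-wide -mattr setting.
    @target("avx")
    void scaleAvx(double[] x, double alpha) { x[] *= alpha; }
}
else
{
    // Other compilers or targets: same body, no per-function target.
    void scaleAvx(double[] x, double alpha) { x[] *= alpha; }
}

// Variant compiled with the module-wide default target.
void scaleBaseline(double[] x, double alpha) { x[] *= alpha; }

// Runtime dispatch: take the AVX variant only if the CPU reports AVX.
void scale(double[] x, double alpha)
{
    if (avx)
        scaleAvx(x, alpha);
    else
        scaleBaseline(x, alpha);
}

unittest
{
    auto v = [1.0, 2.0, 3.0];
    scale(v, 2.0);
    assert(v == [2.0, 4.0, 6.0]);
}
```

Checking core.cpuid on every call is the simplest form; a real kernel would more likely resolve the choice once and cache a function pointer.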
prev sibling parent Johan Engelen <j j.nl> writes:
On Saturday, 26 December 2015 at 20:47:39 UTC, Ilya Yaroshenko 
wrote:
 2. Related cross-compile. For example: target is x86_64; AVX 
 support can be checked at runtime using core.cpuid; so I want 
 to force LDC to compile three versions of BLAS for SSE, AVX and 
 AVX512, and choose better in runtime.
I think we could also implement this as a library solution, instead of compiler-internally. Would that make more sense?
Jan 04
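One possible shape for such a library solution, purely as a sketch: a table of candidate implementations, each paired with a "supported" flag filled in from core.cpuid, with the first supported entry winning. Variant, choose, sumWide, sumGeneric and defaultSum are hypothetical names; sumWide merely stands in for a kernel that would be built with wider SIMD enabled:

```d
import core.cpuid : avx;

alias SumFn = double function(const(double)[]);

/// One candidate implementation plus whether the running CPU can use it.
struct Variant
{
    string name;
    bool supported;
    SumFn impl;
}

/// Returns the first supported implementation, or null if there is none.
SumFn choose(Variant[] variants)
{
    foreach (v; variants)
        if (v.supported)
            return v.impl;
    return null;
}

/// Portable fallback.
double sumGeneric(const(double)[] x)
{
    double s = 0;
    foreach (v; x)
        s += v;
    return s;
}

/// Stand-in for a SIMD-friendly kernel: four independent accumulators.
double sumWide(const(double)[] x)
{
    double[4] acc = 0;
    size_t i = 0;
    for (; i + 4 <= x.length; i += 4)
        acc[] += x[i .. i + 4];
    double s = acc[0] + acc[1] + acc[2] + acc[3];
    for (; i < x.length; ++i)
        s += x[i];
    return s;
}

/// Builds the table from runtime CPU information, best candidate first.
SumFn defaultSum()
{
    return choose([
        Variant("wide", avx, &sumWide),
        Variant("generic", true, &sumGeneric),
    ]);
}

unittest
{
    assert(defaultSum()([1.0, 2.0, 3.0, 4.0, 5.0]) == 15.0);
}
```

The trade-off against a compiler-internal approach is that the library cannot itself make the compiler emit different instructions per variant; it still relies on something like a per-function target attribute or separately compiled modules.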