digitalmars.D.learn - Cpu instructions exposed

Cecil Ward <d cecilward.com> writes:
I have written a few zero-overhead (fully inlining) D wrappers 
around certain new x64 instructions as an exercise to help me 
learn D and get used to GDC asm. I've also written D replacements 
for older processors. They are templated functions with 
customised variants supporting a variety of different word-widths.

1. Would anyone find these useful? Bet I'm reinventing the wheel? 
(But still a good learning task for me.)

2. How best to get them reviewed for correct D style?

3. How to package them up, expose them? They need to be usable by 
the caller in such a way that they get fully and directly inlined, with 
no subroutine calls or arg passing adaptation overhead, so as to 
get the desired full 100% performance. For example, a call with a 
literal constant argument should continue to mean an immediate 
operand in the generated code, which happens nicely currently in 
my testbeds. So I don't know, the user needs to see the lib fn 
_source_ or some equivalent GDC cleverness. (Like entire thing in 
a .h file. Yes, I know, I know. :-) )

4. I would like to do the same for LDC, but unfortunately its asm 
system is rather different from GDC's. I don't know if there is 
anything clever I can do to try to avoid duplication of effort / 
totally split sources and double maintenance? (Desperation? 
Preprocess the D sources with an external tool if all else fails! 
Yuck. Don't have one at hand right now anyway.)
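
One possible shape for point 3, as a minimal sketch under stated assumptions: a D function template is instantiated in whichever module uses it, so the compiler has the body available to inline, and `pragma(inline, true)` requests that it always be inlined. The name `countOnes` is hypothetical, and `core.bitop.popcnt` from druntime stands in here for a hand-written asm body.

```d
import core.bitop : popcnt;

/// Hypothetical wrapper (not from the original post): counts the set bits
/// of any unsigned integer type, choosing the operand width at compile time.
pragma(inline, true)
uint countOnes(T)(T x) @safe pure nothrow @nogc
if (__traits(isUnsigned, T))
{
    static if (T.sizeof <= uint.sizeof)
        return cast(uint) popcnt(cast(uint) x);   // 8-, 16- and 32-bit operands
    else
        return cast(uint) popcnt(cast(ulong) x);  // 64-bit operands
}
```

Shipping the module as ordinary `.d` source (rather than a precompiled library with stripped interface files) is what lets a caller's literal argument be folded into the generated code; whether it ends up as an immediate operand still depends on the compiler and flags.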

Is there any way I could get D to actually generate some D code 
to help with that?

I have seen some pretty mind-blowing stuff in D using mixin or 
something - looks fantastic, just like the power of our old 
friends the evil unconstrained C macros that can generate random 
garbage C source text without limit, but in D it's done right so 
the D source can actually be parsed properly, no two languages 
fighting. I recall using this kind of source generation for 
dealing with lots of different operator-overloading routines that 
all follow a similar pattern. Can't think where else. I don't 
know what is available and what the limits of various techniques 
are. I'm wondering if I could get D to internally generate 
GDC-specific or LDC-specific source code strings - the two asm 
frameworks are syntactically different iirc - starting from a 
friendly generic neutral format, transforming it somehow. (If 
memory serves, I think GDC uses a non-D extended syntax, very 
close to asm seen in GCC for C, for easier partial re-use of 
snippets from C sources. On the other hand LDC looks more like 
standard D with complex template expansion, but I haven't studied 
it properly.)
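
The kind of source generation described above can be sketched with a string mixin fed by a function evaluated at compile time (CTFE). This is only an illustration: `Op`, `ops`, `bodyFor`, `genWrappers` and the two generated function names are hypothetical, and the per-compiler branches are left empty, so every compiler gets the portable D expression.

```d
// Neutral description of one wrapper: its name, operand type and a
// portable D expression over the parameter `x`.
struct Op { string name; string type; string expr; }

enum Op[] ops = [
    Op("lowestSetBit",      "uint", "x & (0 - x)"),
    Op("clearLowestSetBit", "uint", "x & (x - 1)"),
];

// Returns the source text of the function body for the current compiler.
string bodyFor(Op op)
{
    version (GNU)
    {
        // A GDC build could return GCC-style extended asm source here.
    }
    else version (LDC)
    {
        // An LDC build could return LDC-specific source here.
    }
    return op.expr; // portable fallback, used by this sketch everywhere
}

// Runs at compile time and returns D source for all wrappers in the table.
string genWrappers(const Op[] table)
{
    string src;
    foreach (op; table)
        src ~= "pragma(inline, true) " ~ op.type ~ " " ~ op.name
             ~ "(" ~ op.type ~ " x) { return " ~ bodyFor(op) ~ "; }\n";
    return src;
}

// The generated text is parsed as ordinary D declarations.
mixin(genWrappers(ops));
```

Unlike a C macro, the string handed to `mixin` must be a complete, well-formed set of declarations, so mistakes in the generator show up as normal compile errors.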

Any general tips to point me in the right direction, much 
appreciated.
Aug 28
rikki cattermole <rikki cattermole.co.nz> writes:
On 29/08/2017 2:49 AM, Cecil Ward wrote:
> I have written a few zero-overhead (fully inlining) D wrappers around
> certain new x64 instructions as an exercise to help me learn D and get
> used to GDC asm. I've also written D replacements for older processors.
> They are templated functions with customised variants supporting a
> variety of different word-widths.
>
> 1. Would anyone find these useful? Bet I'm reinventing the wheel? (But
> still a good learning task for me.)
Sure, why not?
> 2. How best to get them reviewed for correct D style?
Let's talk about 3.
> 3. How to package them up, expose them? They need to be usable by the
> caller in such a way that they get fully and directly inlined, with no
> subroutine calls or arg passing adaptation overhead, so as to get the
> desired full 100% performance. For example, a call with a literal
> constant argument should continue to mean an immediate operand in the
> generated code, which happens nicely currently in my testbeds. So I
> don't know, the user needs to see the lib fn _source_ or some equivalent
> GDC cleverness. (Like entire thing in a .h file. Yes, I know, I know. :-) )
Dub + force -inline. Also you will need to support ldc and dmd.
> 4. I would like to do the same for LDC, but unfortunately its asm system is
> rather different from GDC's. I don't know if there is anything clever I
> can do to try to avoid duplication of effort / totally split sources and
> double maintenance? (Desperation? Preprocess the D sources with an
> external tool if all else fails! Yuck. Don't have one at hand right now
> anyway.)
Duplicate. Nothing wrong with that for so little code; it's already nicely abstracted out. As long as you are using the function arguments as part of your iasm and push/pop the registers you use, it should be inlined correctly as per the arguments.
But I'd like to see some code before making any other remarks. I highly suggest you hang out on IRC (#d on Freenode) to get interactive reviews and suggestions.
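
The "just duplicate it" layout might look like the sketch below: one public function, with `version` blocks picking a per-compiler body. To keep the example short and checkable it swaps inline asm for each compiler's own bit-count intrinsic (`__builtin_popcount` from `gcc.builtins`, `llvm_ctpop` from `ldc.intrinsics`, `popcnt` from `core.bitop`); the same structure would hold hand-written asm bodies. The name `bitCount` is hypothetical.

```d
/// Hypothetical public entry point; callers never see which branch was compiled.
pragma(inline, true)
uint bitCount(uint x)
{
    version (GNU)
    {
        import gcc.builtins;                 // GDC: GCC builtins exposed to D
        return cast(uint) __builtin_popcount(x);
    }
    else version (LDC)
    {
        import ldc.intrinsics;               // LDC: LLVM intrinsics
        return llvm_ctpop(x);
    }
    else
    {
        import core.bitop : popcnt;          // DMD and anything else
        return cast(uint) popcnt(x);
    }
}
```

Only the branch matching the compiler in use is compiled, so the duplicated bodies cost nothing at run time; the price is keeping them in step by hand.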
Aug 28