digitalmars.D - Inline assembler in D and LDC, round 2

reply Tomas Lindquist Olsen <tomas.l.olsen gmail.com> writes:
Hello everybody.

Here comes the second round of inline asm discussion related to LDC,
the LLVM D Compiler.
Last time was about naked inline asm and the problems it poses for a
backend like LLVM.
Since revision 920 in our mercurial tree, naked inline asm support is
good enough that Don's Bigint code from Tango now works. This is a
great step forward...

I implemented it by using a feature in LLVM that allows you to insert
raw assembly at the codeunit level, and modified the asm processor to
support generating that as well. It wasn't that big a job really,
though it isn't completely finished yet and still needs a lot of
testing, of course ;)

Now Christian Kamm also finished the last ABI / calling-convention
bits we were missing on x86-32 Linux. This naturally led us to try
out defining the "controversial" D_InlineAsm_X86 version identifier...

Now in Tango there's a bunch of code, like the following (copied from
tango.math.IEEE.d)

real ldexp(real n, int exp) /* intrinsic */
{
    version(Really_D_InlineAsm_X86)
    {
        asm {
            fild exp;
            fld n;
            fscale;
            fstp ST(1), ST(0);
        }
    }
    else
    {
        return tango.stdc.math.ldexpl(n, exp);
    }
}

This code assumes that the value of ST(0) is preserved after the asm
block ends and that the compiler simply inserts a return instruction
as appropriate.
This doesn't work with LLVM. For the function to be valid in codegen,
we must insert a return instruction in the LLVM IR code after the
block, and the only choice we have for the value to return is an
undefined value. This kind of code usually works when the program
isn't optimized. However, if optimization is enabled, a caller of
ldexp will most likely notice that the return value is undefined or a
constant, and so has a lot of freedom to do what it wants, breaking
the way the return value is received in the process.

This is almost exactly the same problem I had with naked inline asm,
and the only fix is to somehow generate an inline asm expression
(that's what llvm has, not statements like D), that produces the right
return value. Something a bit like:

return asm { ... }

Since D has no way to express this directly, it means we would have to
analyze the inline asm and somehow capture the right registers etc.
This is not something I want to implement right now, if ever...

The LLVM people are not interested in adding some kind of feature to
allow this, since the inline asm expressions already suffice for
normal GCC (which has inline asm expressions) C/C++ code.

Now the real question is: does this code even have well-defined
semantics in terms of the D spec? And if not, could we possibly
specify it as implementation-specific behaviour?

Everything is in place to specify the D_InlineAsm_X86 version
identifier in LDC, but a lot of asm still isn't going to work, due to
reasons like this.

I hope to hear some feedback on how to move on from here.

Thank you all,
Tomas Lindquist Olsen and the LDC Team.
Feb 04 2009
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Is the inline assembling actually done by the LLVM back end, or the LDC 
front end?
Feb 04 2009
parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Walter Bright wrote:
 Is the inline assembling actually done by the LLVM back end, or the LDC 
 front end?

The frontend turns it into a GCC-style asm statement (with explicit input and output constraints) that shows up as a function literal in the IR (only valid as the target of a 'call' instruction). The LLVM codegen then uses those constraints to allocate registers, substitutes them in the asm string and emits it directly to the assembler as part of the output[1]. (LLVM, like GCC, normally uses an external assembler.)

[1]: With a few exceptions, IIRC. For example, some part of LLVM turns single "bswap"s into an intrinsic llvm.bswap.i<bitsize>() call to help analyses, optimizers and the JIT since they don't generally support inline asm otherwise.
Feb 04 2009
prev sibling next sibling parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Tomas Lindquist Olsen wrote:
 Now in Tango there's a bunch of code, like the following (copied from
 tango.math.IEEE.d)
 
 real ldexp(real n, int exp) /* intrinsic */
 {
     version(Really_D_InlineAsm_X86)
     {
         asm {
             fild exp;
             fld n;
             fscale;
             fstp ST(1), ST(0);
         }
     }
     else
     {
         return tango.stdc.math.ldexpl(n, exp);
     }
 }
 
 This code assumes that the value of ST(0) is preserved after the asm
 block ends and that the compiler simply inserts a return instruction
 as appropriate.
 This doesn't work with LLVM. For the function to be valid in codegen,
 we must insert a return instruction in the LLVM IR code after the
 block, and the only choices we have for the value to return is an
 undefined value. This kind of code usually works when the program
 isn't optimized, however, if optimization is enabled, a caller of
 ldexp will most likely notice that the return value is undefined or a
 constant, and so has a lot of freedom to do what it wants. Breaking
 the way the return value is received in the process.
 
 This is almost exactly the same problem I had with naked inline asm,
 and the only fix is to somehow generate an inline asm expression
 (that's what llvm has, not statements like D), that produces the right
 return value. Something a bit like:
 
 return asm { ... }
 
 Since D has no way to express this directly, it means we would have to
 analyze the inline asm and somehow capture the right registers etc.
 This is not something I want to implement right now, if ever...

Is it really that hard? Can't you just detect this case (non-void function without a 'return' at the end but with inline asm inside)?

Since the compiler should know the calling convention[1], the register that will contain the return value of the function should be a simple lookup (based on target architecture, cc and return type). Just add that register as an output of the inline asm and return it...

It gets a bit trickier with things like
-----
if (cpu.hasFeatureX())
    asm { ... }
else
    asm { ... }
-----
of course, but storing the value of the register in question into a hidden variable and returning its value at the end shouldn't be that hard...

In other words, change every inline asm in a qualifying function to add an output of the "return register", store its value into an alloca'd stack slot and load & return it at the end of the function.

[1]: Given that LLVM normally handles this, this probably requires an extra lookup table in LDC that needs to be kept up-to-date.
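
For readers following along, the "hidden variable" idea can be written out by hand at the source level. This is only a sketch of what the rewritten function might look like, not what LDC generates: the names `pick`, `result` and `hasX86Asm` are made up for illustration, and the fallback branch exists only so the example compiles where x86 inline asm is unavailable.

```d
// Hypothetical illustration; none of these names come from LDC or Tango.
version (D_InlineAsm_X86)         enum hasX86Asm = true;
else version (D_InlineAsm_X86_64) enum hasX86Asm = true;
else                              enum hasX86Asm = false;

int pick(bool useFirst)
{
    int result; // stands in for the hidden, compiler-generated stack slot

    static if (hasX86Asm)
    {
        // Each asm block leaves its answer in EAX, then copies the
        // "return register" into the stack slot before the block ends.
        if (useFirst)
            asm { mov EAX, 1; mov result, EAX; }
        else
            asm { mov EAX, 2; mov result, EAX; }
    }
    else
    {
        result = useFirst ? 1 : 2; // fallback without x86 inline asm
    }

    return result; // an explicit return value instead of an implicit EAX
}
```

Because the function now ends in an ordinary `return`, the value no longer depends on what happens to be left in EAX after the last asm block.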
Feb 04 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Frits van Bommel wrote:
 Is it really that hard? Can't you just detect this case (non-void 
 function without a 'return' at the end but with inline asm inside)?
 
 Since the compiler should know the calling convention[1], the register 
 that will contain the return value of the function should be a simple 
 lookup (based on target architecture, cc and return type).
 Just add that register as an output of the inline asm and return it...

dmd doesn't attempt to figure out which register is the return value. It just assumes that the registers specified by the ABI for the function's return type have the proper return value in them.
Feb 04 2009
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Walter Bright wrote:
 Frits van Bommel wrote:
 Is it really that hard? Can't you just detect this case (non-void 
 function without a 'return' at the end but with inline asm inside)?

 Since the compiler should know the calling convention[1], the register 
 that will contain the return value of the function should be a simple 
 lookup (based on target architecture, cc and return type).
 Just add that register as an output of the inline asm and return it...

dmd doesn't attempt to figure out which register is the return value. It just assumes that the registers specified by the ABI for the function's return type have the proper return value in them.

That isn't an option for LDC, which is why I suggested another approach.
Feb 05 2009
parent reply Don <nospam nospam.com> writes:
Frits van Bommel wrote:
 Walter Bright wrote:
 Frits van Bommel wrote:
 Is it really that hard? Can't you just detect this case (non-void 
 function without a 'return' at the end but with inline asm inside)?

 Since the compiler should know the calling convention[1], the 
 register that will contain the return value of the function should be 
 a simple lookup (based on target architecture, cc and return type).
 Just add that register as an output of the inline asm and return it...

dmd doesn't attempt to figure out which register is the return value. It just assumes that the registers specified by the ABI for the function's return type have the proper return value in them.

That isn't an option for LDC, which is why I suggested another approach.

What's the difference? Walter's approach assumes there's a "return EAX;" at the end of every function returning an int, for example; your approach seems to be to add it.
Feb 05 2009
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Don wrote:
 Frits van Bommel wrote:
 Walter Bright wrote:
 Frits van Bommel wrote:
 Is it really that hard? Can't you just detect this case (non-void 
 function without a 'return' at the end but with inline asm inside)?

 Since the compiler should know the calling convention[1], the 
 register that will contain the return value of the function should 
 be a simple lookup (based on target architecture, cc and return type).
 Just add that register as an output of the inline asm and return it...

dmd doesn't attempt to figure out which register is the return value. It just assumes that the registers specified by the ABI for the function's return type have the proper return value in them.

That isn't an option for LDC, which is why I suggested another approach.

What's the difference? Walter's approach assumes there's a "return EAX;" at the end of every function returning an int, for example; your approach seems to be to add it.

His approach depends on DMD directly emitting x86 machine code, so it can just emit 'RET' and be done with it. LDC on the other hand needs to emit LLVM asm, which requires it to specify an explicit return value. My approach is a way to extract that return value from the inline asm, allowing it to emulate DMD behavior within the LLVM IR.
Feb 05 2009
next sibling parent reply Don <nospam nospam.com> writes:
Tomas Lindquist Olsen wrote:
 On Thu, Feb 5, 2009 at 2:42 PM, Frits van Bommel
 <fvbommel remwovexcapss.nl> wrote:
 Don wrote:
 Frits van Bommel wrote:
 Walter Bright wrote:
 Frits van Bommel wrote:
 Is it really that hard? Can't you just detect this case (non-void
 function without a 'return' at the end but with inline asm inside)?

 Since the compiler should know the calling convention[1], the register
 that will contain the return value of the function should be a simple lookup
 (based on target architecture, cc and return type).
 Just add that register as an output of the inline asm and return it...

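dmd doesn't attempt to figure out which register is the return value. It just assumes that the registers specified by the ABI for the function's return type have the proper return value in them.

That isn't an option for LDC, which is why I suggested another approach.

What's the difference? Walter's approach assumes there's a "return EAX;" at the end of every function returning an int, for example; your approach seems to be to add it.

His approach depends on DMD directly emitting x86 machine code, so it can just emit 'RET' and be done with it. LDC on the other hand needs to emit LLVM asm, which requires it to specify an explicit return value. My approach is a way to extract that return value from the inline asm, allowing it to emulate DMD behavior within the LLVM IR.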

I had really hoped I didn't have to do something like this, but I can't come up with a better approach. I just hope it actually works when I'm done ... Also I have no idea if code quality is going to be optimal. I imagine people write code like this for efficiency, if LLVM adds extra instructions there is little point in writing code like this for LDC, and we'd want to version things in any case, providing a true naked version for LDC. In this case I'm not sure it's worth it to actually do this work in the first place.

The only reason a function like this isn't written as naked, is so that it has a chance to be inlined. If that's impossible with this syntax on all compilers, there doesn't seem much point - it might as well be illegal. If D provided a "return EAX,EDX;" fake asm instruction, would inlining be possible?
Feb 05 2009
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Tomas Lindquist Olsen wrote:
 The approach Fritz mentions should still allow inlining. Having a fake

Why do people keep performing s/s/z/ on my name? :(
Feb 06 2009
next sibling parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Simen Kjaeraas wrote:
 On Fri, 06 Feb 2009 14:50:58 +0100, Frits van Bommel
 <fvbommel remwovexcapss.nl> wrote:
 
 Tomas Lindquist Olsen wrote:
 The approach Fritz mentions should still allow inlining. Having a fake

Why do people keep performing s/s/z/ on my name? :(

Clearly, changing your name iz the eaziezt zolution. -- Simen

The European Commission have just announced an agreement whereby English will be the official language of the EU, rather than German, which was the other possibility. As part of the negotiations, Her Majesty's government conceded that English spelling had some room for improvement and has accepted a five year phase in plan that would be known as "EuroEnglish". In the first year, "s" will replace the soft "c". Sertainly, this will make the sivil servants jump for joy. The hard "c" will be dropped in favour of the "k". This should klear up konfusion and keyboards kan have 1 less letter. There will be growing publik enthusiasm in the sekond year, when the troublesome "ph" will be replaced with the "f". This will make words like "fotograf" 20% shorter. In the third year, publik akseptanse of the new spelling kan be expekted to reach the stage where more komplikated changes are possible. Governments will enkorage the removal of double letters, which have always ben a deterent to akurate speling. Also, al wil agre that the horible mes of the silent "e"s in the language is disgraseful, and they should go away. By the 4th year, peopl wil be reseptiv to steps such as replasing "th" with "z" and "w" with "v". During ze fifz year, ze unesesary "o" kan be dropd from vords kontaining "ou" and similar changes vud of kors be aplid to ozer kombinations of leters. After zis fifz year, ve vil hav a realy sensibl riten styl. Zer vil be no mor trubls or difikultis and evrivun vil find it ezi to understand each ozer ZE DREAM VIL FINALI KUM TRU! <http://lib.ru/ENGLISH/rekonstr.txt>
Feb 06 2009
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Frits van Bommel wrote:
 Tomas Lindquist Olsen wrote:
 The approach Fritz mentions should still allow inlining. Having a fake

Why do people keep performing s/s/z/ on my name? :(

You're in luck. They tend to do s/ndrei/lex/ on mine :o). Andrei
Feb 06 2009
next sibling parent Christopher Wright <dhasenan gmail.com> writes:
Andrei Alexandrescu wrote:
 Frits van Bommel wrote:
 Tomas Lindquist Olsen wrote:
 The approach Fritz mentions should still allow inlining. Having a fake

Why do people keep performing s/s/z/ on my name? :(

You're in luck. They tend to do s/ndrei/lex/ on mine :o). Andrei

Has anyone s/Alexandrescu/Rublev/ on you?
Feb 06 2009
prev sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 Frits van Bommel wrote:
 Tomas Lindquist Olsen wrote:
 The approach Fritz mentions should still allow inlining. Having a fake

Why do people keep performing s/s/z/ on my name? :(

You're in luck. They tend to do s/ndrei/lex/ on mine :o).

They do s/walter/nerd/ on mine.
Feb 06 2009
prev sibling next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Frits van Bommel wrote:
 His approach depends on DMD directly emitting x86 machine code, so it 
 can just emit 'RET' and be done with it.
 
 LDC on the other hand needs to emit LLVM asm, which requires it to 
 specify an explicit return value. My approach is a way to extract that 
 return value from the inline asm, allowing it to emulate DMD behavior 
 within the LLVM IR.

Ok, so why not, for a function that returns an int, simply have the compiler silently tack on the LLVM equivalent of "return EAX"?
Feb 05 2009
next sibling parent Don <nospam nospam.com> writes:
Walter Bright wrote:
 Frits van Bommel wrote:
 His approach depends on DMD directly emitting x86 machine code, so it 
 can just emit 'RET' and be done with it.

 LDC on the other hand needs to emit LLVM asm, which requires it to 
 specify an explicit return value. My approach is a way to extract that 
 return value from the inline asm, allowing it to emulate DMD behavior 
 within the LLVM IR.

Ok, so why not, for a function that returns an int, simply have the compiler silently tack on the LLVM equivalent of "return EAX"?

I've created bugzilla 2648 for one of the issues -- struct returns are a particular nuisance. (I think you may encounter this on DMD-Mac).
Feb 06 2009
prev sibling parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Walter Bright wrote:
 Frits van Bommel wrote:
 His approach depends on DMD directly emitting x86 machine code, so it 
 can just emit 'RET' and be done with it.

 LDC on the other hand needs to emit LLVM asm, which requires it to 
 specify an explicit return value. My approach is a way to extract that 
 return value from the inline asm, allowing it to emulate DMD behavior 
 within the LLVM IR.

Ok, so why not, for a function that returns an int, simply have the compiler silently tack on the LLVM equivalent of "return EAX"?

Because LLVM doesn't allow specification of hardware registers in the IR. Everything must be a virtual register.

The way I proposed LDC implement this is basically to tell the inline-asm IR "Put EAX in %virtual-eax" and to then return that register. It will in all likelihood have the same effect though, assuming a tiny bit of optimization and a minimally competent register allocator.
Feb 06 2009
prev sibling parent reply "Lionello Lunesu" <lionello lunesu.remove.com> writes:
"Frits van Bommel" <fvbommel REMwOVExCAPSs.nl> wrote in message 
news:gmeqbr$1377$1 digitalmars.com...
 LDC on the other hand needs to emit LLVM asm, which requires it to specify 
 an explicit return value. My approach is a way to extract that return 
 value from the inline asm, allowing it to emulate DMD behavior within the 
 LLVM IR.

Sorry, perhaps I'm missing something: Why should you have to deduce that from the asm? Doesn't the function prototype give enough information? If the function returns "int/uint/...", assume "eax"; if it returns "float/double/..." assume "st(0)", etc....

L.
Feb 05 2009
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Lionello Lunesu wrote:
 
 "Frits van Bommel" <fvbommel REMwOVExCAPSs.nl> wrote in message 
 news:gmeqbr$1377$1 digitalmars.com...
 LDC on the other hand needs to emit LLVM asm, which requires it to 
 specify an explicit return value. My approach is a way to extract that 
 return value from the inline asm, allowing it to emulate DMD behavior 
 within the LLVM IR.

Sorry, perhaps I'm missing something: Why should you have to deduce that from the asm? Doesn't the function prototype give enough information? If the function returns "int/uint/...", assume "eax"; if it returns "float/double/..." assume "st(0)", etc....

LLVM IR doesn't know about hardware registers, except when dealing with inline asm. So if you need to know the value a hardware register has at the end of some inline asm, you need to tell that asm to "return" it into a virtual register that you can actually use in regular IR (such as returning it from a function).
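
The lookup half of the problem is the easy part. A minimal sketch, assuming x86-32 only and covering just the cases named in this thread (EAX, EDX:EAX and ST(0)); the helper name `returnRegister` is hypothetical and other types are deliberately rejected:

```d
/// Hypothetical helper: names the x86-32 return register(s) for a few
/// scalar types, following the examples given in this thread.
string returnRegister(T)()
{
    static if (is(T == float) || is(T == double) || is(T == real))
        return "ST(0)";      // floating point comes back on the x87 stack
    else static if (is(T == long) || is(T == ulong))
        return "EDX:EAX";    // 64-bit integers use the register pair
    else static if (is(T == int) || is(T == uint))
        return "EAX";        // 32-bit integers
    else
        static assert(0, "type not covered by this sketch");
}
```

Knowing the name is not enough on its own, as the reply above explains: the value still has to be exposed as an output of the inline asm before regular IR can use it.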
Feb 06 2009
parent reply Chad J <gamerchad __spam.is.bad__gmail.com> writes:
Frits van Bommel wrote:
 Lionello Lunesu wrote:
 "Frits van Bommel" <fvbommel REMwOVExCAPSs.nl> wrote in message
 news:gmeqbr$1377$1 digitalmars.com...
 LDC on the other hand needs to emit LLVM asm, which requires it to
 specify an explicit return value. My approach is a way to extract
 that return value from the inline asm, allowing it to emulate DMD
 behavior within the LLVM IR.

Sorry, perhaps I'm missing something: Why should you have to deduce that from the asm? Doesn't the function prototype give enough information? If the function returns "int/uint/...", assume "eax"; if it returns "float/double/..." assume "st(0)", etc....

LLVM IR doesn't know about hardware registers, except when dealing with inline asm. So if you need to know the value a hardware register has at the end of some inline asm, you need to tell that asm to "return" it into a virtual register that you can actually use in regular IR (such as returning it from a function).

I think I might just sortof maybe kinda understand the problem now.

So I take some of many options and consider the consequences:

- Don't put any return statement into the IR. EAX/st(0)/etc has the return value so don't bother.
  Consequence: LLVM errors, there MUST be a return value represented in IR.

- Put some stupid return statement into the IR, like return 0.
  Consequence: Programmer places result into EAX. LLVM generated code places 0 into EAX. Function always returns 0. Whoops.

- Mark the function as returning void. Now we don't need to put a return into the IR.
  Consequence: User writes something like auto bar = foo();. foo contains inline ASM. But what is assigned to bar in the IR code? There is no way to tell the IR to assign a value from EAX/st(0)/whatever. EAX is set to the correct value when foo() returns, but there is no way to USE it, so it just floats around uselessly until it is overwritten by something else.

- Casting fun. So the function returns an integer. Well, return a floating point value NaN instead. EAX still gets set to the correct value due to the inline ASM.
  Consequences: Similar problem as before: int bar = foo(); violates type safety since we've rewritten foo() so that LLVM thinks it returns a float. I don't know if the IR has any loopholes, but if it does then maybe there is some snowball's chance in hell of making it work anyways.

I hope I understand this correctly. It seems like the problem at hand is difficult to communicate and thus stomps useful dialog :(
Feb 06 2009
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Chad J wrote:
 Frits van Bommel wrote:
 Lionello Lunesu wrote:
 "Frits van Bommel" <fvbommel REMwOVExCAPSs.nl> wrote in message
 news:gmeqbr$1377$1 digitalmars.com...
 LDC on the other hand needs to emit LLVM asm, which requires it to
 specify an explicit return value. My approach is a way to extract
 that return value from the inline asm, allowing it to emulate DMD
 behavior within the LLVM IR.

Sorry, perhaps I'm missing something: Why should you have to deduce that from the asm? Doesn't the function prototype give enough information? If the function returns "int/uint/...", assume "eax"; if it returns "float/double/..." assume "st(0)", etc....

LLVM IR doesn't know about hardware registers, except when dealing with inline asm. So if you need to know the value a hardware register has at the end of some inline asm, you need to tell that asm to "return" it into a virtual register that you can actually use in regular IR (such as returning it from a function).

I think I might just sortof maybe kinda understand the problem now.

 
 I hope I understand this correctly.  It seems like the problem at hand
 is difficult to communicate and thus stomps useful dialog :(

That seems to be a pretty good summary of what's wrong with most of the alternatives, yes.

You missed one though, that Lindquist mentioned: they could also return a special "undefined value" (which LLVM supports, and means "I don't care what it is") and the return value would (in practice) be whatever was in the relevant register at the time *if no optimizations are run*. The problem is that optimizations can see "Hey, that function only ever returns one value (or returns either a normal value or an undefined value)" and replace every use of the return value with that one value. This would break the asm + ret undef, yet be a perfectly valid optimization according to the semantics of LLVM IR.

Luckily, inline asm is treated as a function literal in LLVM, and it can return one or more values to the caller if the constraints string specifies which registers will contain them. So if LDC just specifies (e.g.) EAX/EDX:EAX/ST(0) to contain the result of the inline asm, it can get the value in the register(s) in question as an LLVM value that can be returned without any problem.

The only really tricky bits are (a) figuring out how the constraints string works, exactly[1] and (b) figuring out which register(s) the return value should be in.

[1]: There's no documentation that I'm aware of (unless it was added very recently) other than the LLVM(-GCC) source and llvm-gcc output when compiling code containing extended asm (which is similar but not identical to LLVM-style inline asm, and documented pretty well at http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html). The similarity is not an accident as the main requirement in the inline asm design for LLVM was probably "support extended asm in llvm-gcc" :).
Feb 07 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Frits van Bommel wrote:
 Luckily, inline asm is treated as a function literal in LLVM, and it can 
 return one or more values to the caller if the constraints string 
 specifies which registers will contain them. So if LDC just specifies 
 (e.g.) EAX/EDX:EAX/ST(0) to contain the result of the inline asm, it can 
 get the value in the register(s) in question as an LLVM value that can 
 be returned without any problem.
 The only really tricky bits are (a) figuring out how the constraints 
 string works, exactly[1] and (b) figuring out which register(s) the 
 return value should be in.

You understand the LLVM better than anyone else here <g>, so I suggest that you pick what you think will work best, and leave it at that for now. I don't think there's a good reason to get too stuck on this. If a better solution emerges during testing, it can be corrected.

For example, in my work on the OSX version of dmd, it turns out that OSX requires that the stack be aligned on 16 bytes whenever a function gets called. (If it isn't so aligned, the program crashes with a misaligned stack fault exception.) This naturally affects all 'naked' inline assembly, as well as all function calls made from inline assembly.

I don't think there's any hope for the compiler automatically fixing this, and more importantly, the compiler *should not* automatically fix it. When you use inline assembler, you've got to expect that it won't be very portable. I've gone through and corrected all the inline assembler in Phobos for this. There isn't much of it, and the fixes aren't difficult. It just comes with the territory of using inline assembler.

Another ABI difference is that on Windows, reals take up 10 bytes. On Linux, it's 12. On OSX, it's 16. The hardware operands still take up only 10; the rest is padding.

What is expected from the inline assembler, however, is that the syntax of the instructions remains the same, and it looks like you've got that one nailed.
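
The `real` size difference is easy to check from D itself, which matters for any asm that computes stack offsets by hand. A small sketch, with the helper name `realPaddingBytes` invented here; it assumes that a 64-bit mantissa means the 80-bit x87 format and reports -1 on targets where `real` is something else:

```d
/// Hypothetical helper: bytes of padding in the in-memory `real`
/// beyond the 10-byte x87 operand, or -1 if `real` is not the
/// 80-bit x87 format on this target.
int realPaddingBytes()
{
    static if (real.mant_dig == 64)         // 64-bit mantissa: x87 extended
        return cast(int) real.sizeof - 10;  // whatever the ABI adds as padding
    else
        return -1;
}
```

With the sizes quoted above this would give 0, 2 and 6 for the 10-, 12- and 16-byte layouts respectively.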
Feb 07 2009
parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Walter Bright wrote:
 Frits van Bommel wrote:
 Luckily, inline asm is treated as a function literal in LLVM, and it 
 can return one or more values to the caller if the constraints string 
 specifies which registers will contain them. So if LDC just specifies 
 (e.g.) EAX/EDX:EAX/ST(0) to contain the result of the inline asm, it 
 can get the value in the register(s) in question as an LLVM value that 
 can be returned without any problem.
 The only really tricky bits are (a) figuring out how the constraints 
 string works, exactly[1] and (b) figuring out which register(s) the 
 return value should be in.

You understand the LLVM better than anyone else here <g>, so I suggest

I just happened to remember the broad strokes of how inline asm works in LLVM[1], and saw the relevance to this discussion. I'm not sure if that qualifies as knowing LLVM as a whole better than anyone else here.

[1]: i.e. "like gcc extended asm, but with different syntax"

 that you pick what you think will work best, and leave it at that for 
 now. I don't think there's a good reason to get too stuck on this. If a 
 better solution emerges during testing, it can be corrected.

Lindquist says (on IRC) that he has already implemented this on his local machine and it's working quite nicely (for x86). He hasn't pushed it to the dsource repository yet; it requires people compiling from there to update their LLVM since apparently with LLVM 2.4 the EDX:EAX 64-bit int constraint doesn't work. Updating to the 2.5 branch fixes it though, so it looks like LDC will be requiring v2.5 soon.

Porting to other architectures shouldn't be too hard either. The code that generates the constraint string for x86 is pretty clean and short so adding (e.g.) x86-64 support should only require a bit of copy+paste+edit after some research into calling conventions.
Feb 07 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Thu, 05 Feb 2009 16:21:54 +0300, Don <nospam nospam.com> wrote:

 Frits van Bommel wrote:
 Walter Bright wrote:
 Frits van Bommel wrote:
 Is it really that hard? Can't you just detect this case (non-void  
 function without a 'return' at the end but with inline asm inside)?

 Since the compiler should know the calling convention[1], the  
 register that will contain the return value of the function should be  
 a simple lookup (based on target architecture, cc and return type).
 Just add that register as an output of the inline asm and return it...

dmd doesn't attempt to figure out which register is the return value. It just assumes that the registers specified by the ABI for the function's return type have the proper return value in them.

That isn't an option for LDC, which is why I suggested another approach.

What's the difference? Walter's approach assumes there's a "return EAX;" at the end of every function returning an int, for example; your approach seems to be to add it.

FWIW, Microsoft C++ Compiler does the same.
Feb 05 2009
prev sibling next sibling parent Tomas Lindquist Olsen <tomas.l.olsen gmail.com> writes:
On Thu, Feb 5, 2009 at 2:42 PM, Frits van Bommel
<fvbommel remwovexcapss.nl> wrote:
 Don wrote:
 Frits van Bommel wrote:
 Walter Bright wrote:
 Frits van Bommel wrote:
 Is it really that hard? Can't you just detect this case (non-void
 function without a 'return' at the end but with inline asm inside)?

 Since the compiler should know the calling convention[1], the register
 that will contain the return value of the function should be a simple lookup
 (based on target architecture, cc and return type).
 Just add that register as an output of the inline asm and return it...

dmd doesn't attempt to figure out which register is the return value. It just assumes that the registers specified by the ABI for the function's return type have the proper return value in them.

That isn't an option for LDC, which is why I suggested another approach.

What's the difference? Walter's approach assumes there's a "return EAX;" at the end of every function returning an int, for example; your approach seems to be to add it.

His approach depends on DMD directly emitting x86 machine code, so it can just emit 'RET' and be done with it. LDC on the other hand needs to emit LLVM asm, which requires it to specify an explicit return value. My approach is a way to extract that return value from the inline asm, allowing it to emulate DMD behavior within the LLVM IR.

I had really hoped I didn't have to do something like this, but I can't come up with a better approach. I just hope it actually works when I'm done ...

Also I have no idea if code quality is going to be optimal. I imagine people write code like this for efficiency; if LLVM adds extra instructions there is little point in writing code like this for LDC, and we'd want to version things in any case, providing a true naked version for LDC. In this case I'm not sure it's worth it to actually do this work in the first place.
Feb 05 2009
prev sibling next sibling parent Tomas Lindquist Olsen <tomas.l.olsen gmail.com> writes:
On Thu, Feb 5, 2009 at 5:46 PM, Don <nospam nospam.com> wrote:
 Tomas Lindquist Olsen wrote:
 On Thu, Feb 5, 2009 at 2:42 PM, Frits van Bommel
 <fvbommel remwovexcapss.nl> wrote:
 Don wrote:
 Frits van Bommel wrote:
 Walter Bright wrote:
 Frits van Bommel wrote:
 Is it really that hard? Can't you just detect this case (non-void
 function without a 'return' at the end but with inline asm inside)?

 Since the compiler should know the calling convention[1], the
 register
 that will contain the return value of the function should be a simple
 lookup
 (based on target architecture, cc and return type).
 Just add that register as an output of the inline asm and return
 it...

dmd doesn't attempt to figure out which register is the return value. It just assumes that the registers specified by the ABI for the function's return type have the proper return value in them.

That isn't an option for LDC, which is why I suggested another approach.

What's the difference? Walter's approach assumes there's a "return EAX;" at the end of every function returning an int, for example; your approach seems to be to add it.

His approach depends on DMD directly emitting x86 machine code, so it can just emit 'RET' and be done with it. LDC on the other hand needs to emit LLVM asm, which requires it to specify an explicit return value. My approach is a way to extract that return value from the inline asm, allowing it to emulate DMD behavior within the LLVM IR.

I had really hoped I wouldn't have to do something like this, but I can't come up with a better approach. I just hope it actually works when I'm done... Also, I have no idea whether code quality is going to be optimal. I imagine people write code like this for efficiency; if LLVM adds extra instructions, there is little point in writing code like this for LDC, and we'd want to version things in any case, providing a true naked version for LDC. In that case I'm not sure it's worth doing this work in the first place.

The only reason a function like this isn't written as naked is so that it has a chance to be inlined. If that's impossible with this syntax on all compilers, there doesn't seem to be much point; it might as well be illegal. If D provided a "return EAX,EDX;" fake asm instruction, would inlining be possible?

The approach Fritz mentions should still allow inlining. Having a fake asm instruction like that could make this a bit simpler to implement, though, since it would be up to the programmer to know the ABI rather than our asm translator frontend. Otherwise it seems to me to be the same thing, really.

At the moment, LDC won't inline anything containing inline asm, but this restriction could be loosened a bit. The reason we disable inlining right now is that if the asm contains labels and the function is inlined, LLVM doesn't rewrite the labels, so you might get conflicting labels when you get to assembling. I asked on the LLVM IRC channel about this, but it's probably not going to be fixed. The argument was that GCC has the same restriction for extended inline asm expressions: if you use labels, you must also manually mark the function with a never-inline function attribute. This might change when LLVM gets its own assembler, I guess.

Another thing is that if inlining is the main reason for functions like these, perhaps it would be better to somehow get this optimization into LLVM itself? There is already a pass that tries to lower common C library function calls...

Yet another thing about inlining with LDC is that the DMD inliner is currently disabled. Some of the AST rewrites it does broke our codegen last time I tried, and we simply haven't tried turning it back on since. This means that LDC will only inline when it has access to an LLVM IR representation of the function, which basically means that only functions from the same module, or template functions (which are always emitted), will be inlined. This is going to change once we get proper LTO support into LDC. For now, people can still compile to .bc files instead of .o and link manually using LLVM tools to get this feature, so it's not that critical imho.

I guess I'll investigate how much LLVM can help with providing me the register details to implement something that works automagically...
It just feels wrong to have to duplicate all that information... </end rant>
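For illustration, the label restriction mentioned above looks roughly like this in GCC-style extended asm in C. This is a sketch, not LDC code: the function name count_steps, the label names, and the NEVER_INLINE macro are all made up for this example. Because the asm text defines named labels, every inlined copy of the function would paste the same label definitions into the assembler output again, so the function is marked never-inline.

```c
/* Hypothetical example: asm with named labels must not be duplicated,
 * so the function is marked never-inline. */
#if defined(__GNUC__)
#define NEVER_INLINE __attribute__((noinline))
#else
#define NEVER_INLINE
#endif

/* Returns the number of loop iterations, i.e. n for n > 0, else 0. */
NEVER_INLINE int count_steps(int n)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int steps = 0;
    __asm__ volatile (
        "testl %1, %1\n\t"            /* set flags from n */
        "jle demo_count_done\n"       /* nothing to do for n <= 0 */
        "demo_count_loop:\n\t"        /* named label: one definition only */
        "incl %0\n\t"                 /* steps++ */
        "decl %1\n\t"                 /* n-- */
        "jnz demo_count_loop\n"       /* repeat until n reaches 0 */
        "demo_count_done:\n"
        : "+r"(steps), "+r"(n)        /* both are read and written */
        :
        : "cc");
    return steps;
#else
    return n > 0 ? n : 0;             /* fallback for other targets */
#endif
}
```

Without the attribute, two call sites inlined into the same object file would each define demo_count_loop and demo_count_done, and the assembler would reject the duplicate symbols.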
Feb 05 2009
prev sibling next sibling parent "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Fri, 06 Feb 2009 14:50:58 +0100, Frits van Bommel
<fvbommel remwovexcapss.nl> wrote:

 Tomas Lindquist Olsen wrote:
 The approach Fritz mentions should still allow inlining. Having a fake

Why do people keep performing s/s/z/ on my name? :(

Clearly, changing your name iz the eaziezt zolution. -- Simen
Feb 06 2009
prev sibling next sibling parent reply Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Fri, Feb 6, 2009 at 8:50 AM, Frits van Bommel
<fvbommel remwovexcapss.nl> wrote:
 Tomas Lindquist Olsen wrote:
 The approach Fritz mentions should still allow inlining. Having a fake

Why do people keep performing s/s/z/ on my name? :(

Why do people keep performing s/tt/t/ on _my_ name?
Feb 06 2009
parent John Reimer <terminal.node gmail.com> writes:
Hello Jarrett,

 On Fri, Feb 6, 2009 at 8:50 AM, Frits van Bommel
 <fvbommel remwovexcapss.nl> wrote:
 Tomas Lindquist Olsen wrote:
 
 The approach Fritz mentions should still allow inlining. Having a
 fake
 



Oh come now, Jaret. I distinctly remember forgetting an 'r', so I'm not part of /that/ group! :D The extra 'r' and 't' are superfluous anyway. They don't make any useful sounds. Although, there is the slight possibility that they affect syllable stress, I suppose. ;D -JJR
Feb 06 2009
prev sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 06 Feb 2009 20:23:19 +0300, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 Frits van Bommel wrote:
 Tomas Lindquist Olsen wrote:
 The approach Fritz mentions should still allow inlining. Having a fake


You're in luck. They tend to do s/ndrei/lex/ on mine :o). Andrei

*LOL*
Feb 06 2009